introlab / odas_web

A desktop visualization GUI for the ODAS library
MIT License

Record the 4 sources #59

Open peisong0109 opened 3 years ago

peisong0109 commented 3 years ago

I'm running ODAS on a Raspberry Pi with a ReSpeaker 4-mic array, and ODAS Web on a remote laptop.

We want to do some research based on the sound direction information captured by the 4 microphones.

Is it possible to record and save this information?

GodCed commented 3 years ago

Yes, you can configure ODAS to save tracked sources to a JSON file. You'll want to modify the sst section of your ODAS config file.

Under tracked, set format: json. Under tracked.interface, set type: file and filename: whatever.json.

peisong0109 commented 3 years ago

Thank you so much, @GodCed. I got the tracked sources successfully. I understand that x, y, z are the coordinates of the sound source, but what do "id" and "activity" mean?


GodCed commented 3 years ago

Just to clarify, (x,y,z) is not exactly the coordinates of the source. Those are the components of a unit vector pointing from the microphone array toward the sound source. A single microphone array can only provide the direction of arrival of the sound, not its location.
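Since (x,y,z) is a direction rather than a position, it is often easier to work with angles. Here is a minimal Python sketch of that conversion. The function name direction_to_angles is made up for illustration, and the angle convention (azimuth measured in the x-y plane from +x toward +y, elevation measured from the x-y plane toward +z) is an assumption; how those axes map onto your physical array depends on the microphone positions in your config.

```python
import math

def direction_to_angles(x, y, z):
    """Convert a direction vector to (azimuth, elevation) in degrees.

    Assumed convention (not taken from ODAS itself):
      azimuth   = angle in the x-y plane, from +x toward +y
      elevation = angle above the x-y plane, toward +z
    """
    norm = math.sqrt(x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("a zero vector has no direction")
    azimuth = math.degrees(math.atan2(y, x))
    # Clamp to [-1, 1] so rounding error cannot push asin out of its domain.
    elevation = math.degrees(math.asin(max(-1.0, min(1.0, z / norm))))
    return azimuth, elevation
```

The vector is normalized inside the function, so values that are only approximately unit length after being printed to a few decimals still work.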

id is used to differentiate between tracked sources. The way ODAS works is that it instantiates a tracker when a burst of sound energy comes from the same direction. This tracker gets an id. If the energy fades for long enough, the tracker is destroyed. The next burst of energy gets a new tracker with another id (the ids increment). The default config allows up to 4 trackers at the same time, each with its own id.

Example scenario: two speakers talk in a room. You get sources 1 and 2. A third speaker enters the room and gets id 3, so now you have sources 1, 2 and 3. The first speaker stops for a while, and you have sources 2 and 3. The first speaker resumes, and you have sources 2 (speaker 2), 3 (speaker 3) and 4 (speaker 1, who resumed).
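The scenario above can be replayed with a toy model. This is not ODAS code: the class ToyTrackerPool and the speaker labels are hypothetical, and real trackers are created from sound energy and direction, not from named speakers. The sketch only illustrates that ids increment and are never reused, with at most 4 trackers alive at once.

```python
class ToyTrackerPool:
    """Toy model of incrementing tracker ids (illustration only, not ODAS)."""

    def __init__(self, max_trackers=4):
        self.max_trackers = max_trackers
        self.next_id = 1
        self.active = {}  # hypothetical speaker label -> tracker id

    def start(self, label):
        """Create a tracker for a label; return its id, or None if the pool is full."""
        if label in self.active:
            return self.active[label]
        if len(self.active) >= self.max_trackers:
            return None
        tracker_id = self.next_id
        self.next_id += 1  # ids only go up; a destroyed id is not reused
        self.active[label] = tracker_id
        return tracker_id

    def stop(self, label):
        """Destroy the tracker for a label, if it has one."""
        self.active.pop(label, None)


pool = ToyTrackerPool()
pool.start("speaker 1")   # id 1
pool.start("speaker 2")   # id 2
pool.start("speaker 3")   # id 3
pool.stop("speaker 1")    # tracker 1 is destroyed
pool.start("speaker 1")   # same person, new tracker: id 4
print(pool.active)        # {'speaker 2': 2, 'speaker 3': 3, 'speaker 1': 4}
```

The practical consequence for analysis is that an id identifies one continuous stretch of activity, not a person, so the same speaker can show up under several ids over a recording.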

Activity is an indicator from 0 to 1 computed from the sound energy coming from this source. It resembles a confidence level, where 0 means this direction is certainly not a source and 1 means it surely is. If the activity drops below a certain threshold, the tracker is destroyed, and the activity must reach a certain value before a tracker is instantiated.
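For offline analysis, the saved file can be grouped by id and filtered on activity. Below is a rough Python sketch under several assumptions that you should check against your own output file: that the file is a sequence of JSON objects written back to back (not one JSON array), that each object has a "timeStamp" field and a "src" list, and that unused tracker slots are written with id 0. Only x, y, z, id and activity are confirmed in this thread. The function names and the 0.5 cutoff are arbitrary choices for illustration.

```python
import json
from collections import defaultdict

def iter_frames(text):
    """Yield each JSON object from a string of back-to-back JSON objects."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        frame, pos = decoder.raw_decode(text, pos)
        yield frame

def group_by_id(frames, min_activity=0.5):
    """Collect (timeStamp, x, y, z) tuples per tracker id.

    Assumptions: frames carry "timeStamp" and "src"; id 0 marks an empty slot.
    min_activity is an arbitrary cutoff, not an ODAS default.
    """
    tracks = defaultdict(list)
    for frame in frames:
        for src in frame.get("src", []):
            if src["id"] == 0 or src["activity"] < min_activity:
                continue
            tracks[src["id"]].append(
                (frame.get("timeStamp"), src["x"], src["y"], src["z"])
            )
    return dict(tracks)
```

Typical use would be to read the whole file into a string and call group_by_id(iter_frames(text)). If ODAS is still writing when you read the file, the last object may be incomplete and raw_decode will raise json.JSONDecodeError, so either stop ODAS first or catch that error at the end of the file.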