LAAC-LSCP / ChildProject

Python package for the management of day-long recordings of children.
https://childproject.readthedocs.io
MIT License

Multichannel support #92

Closed: lucasgautheron closed this issue 3 years ago

lucasgautheron commented 3 years ago

This will be needed in order to handle the BabyLogger audio. We need to decide how to make it work.

Here are the functionalities that are affected:

One way would be to have one profile for each channel, and add the channel as an option of the ConversionPipeline.
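To make the per-channel idea concrete, here is a minimal sketch of how a single channel could be extracted during conversion. It only builds an ffmpeg command line using ffmpeg's `pan` audio filter. The function name, its parameters and the 16 kHz default are assumptions made for illustration; this is not the actual ConversionPipeline interface.

```python
from typing import List


def single_channel_ffmpeg_command(
    source: str,
    destination: str,
    channel: int,
    sampling_rate: int = 16000,  # assumed default, for illustration only
) -> List[str]:
    """Build an ffmpeg command that keeps one input channel as mono audio.

    Hypothetical helper: `channel` is the zero-based index of the input
    channel to keep. The command is returned, not executed.
    """
    if channel < 0:
        raise ValueError("channel must be a non-negative index")
    return [
        "ffmpeg",
        "-y",                                # overwrite the output if it exists
        "-i", source,
        "-af", f"pan=mono|c0=c{channel}",    # route input channel N to the single output channel
        "-ar", str(sampling_rate),           # resample to the target rate
        "-c:a", "pcm_s16le",                 # 16-bit PCM output
        destination,
    ]
```

With one profile per channel, each profile would call such a helper with a different `channel` value, for instance `single_channel_ffmpeg_command("rec.wav", "rec_ch2.wav", channel=2)`.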

alecristia commented 3 years ago

good point -- also, please note that we do not yet know how to integrate annotations (made by humans or machines), which could emerge from one of the channels or a combination of the channels. (for instance, perhaps humans annotate best when they hear binaurally the "front" + one other channel)

lucasgautheron commented 3 years ago

We are already facing this problem with Marvin's pilot. More discussion with the BabyLogger team is needed, but I have proposed the following solution for Marvin's classification task: for each sampled 30 s window, we retain the channel with the highest energy. If the mics are directional, this might be a good way to maximize the signal-to-noise ratio, because we expect the highest energy to be achieved by the channel directed towards the speaker. Of course there might be better combinations, but this is the least arbitrary imo. However, in the case of conversations, it might suppress one of the speakers. What do you think?
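For reference, the selection rule described above can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, audio is represented as a list of frames (one sample per channel), energy is taken as the sum of squared samples, and the recording is cut into consecutive windows rather than sampled ones. A real implementation would more likely operate on arrays.

```python
from typing import List, Sequence, Tuple


def loudest_channel_per_window(
    frames: Sequence[Sequence[float]],
    sample_rate: int,
    window_seconds: float = 30.0,
) -> Tuple[List[int], List[float]]:
    """For each window, keep the channel with the highest energy.

    `frames[i][c]` is sample i of channel c (hypothetical layout).
    Returns the chosen channel index per window and the resulting mono signal.
    """
    window = int(round(window_seconds * sample_rate))
    if window <= 0:
        raise ValueError("window must span at least one sample")

    choices: List[int] = []
    mono: List[float] = []
    for start in range(0, len(frames), window):
        chunk = frames[start:start + window]
        n_channels = len(chunk[0])
        # Energy of each channel over this window: sum of squared samples.
        energy = [sum(frame[c] ** 2 for frame in chunk) for c in range(n_channels)]
        # Ties are resolved in favour of the lowest channel index.
        best = max(range(n_channels), key=energy.__getitem__)
        choices.append(best)
        mono.extend(frame[best] for frame in chunk)
    return choices, mono
```

The returned list of chosen channels makes the caveat about conversations easy to inspect: a window in which two speakers face different mics still yields a single index, so the quieter speaker's channel is dropped for that whole window.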