LAAC-LSCP / ChildProject

Python package for the management of day-long recordings of children.
https://childproject.readthedocs.io
MIT License

Energy detection based sampler #130

Closed lucasgautheron closed 3 years ago

lucasgautheron commented 3 years ago

Description

Progress

Associated issues

109

MarvinLvn commented 3 years ago

About the multi-channel discussion :

MarvinLvn commented 3 years ago

combining several recorders: Let x_i and y_i be the fractions of windows having a higher energy than window i, for recorder x and recorder y respectively. There are several possibilities:

  1. Check the agreement between devices? (i.e. veto windows that do not pass the threshold in one of the devices). This leads to the condition (1-x_i > 1-q) and (1-y_i > 1-q).
  2. Find a way to combine the energy quantiles of each (window, recorder). These can be computed independently w/o any difficulty for recorders x and y. By definition, P(x_i < x) = x and P(y_i < y) = y (they follow U(0,1)). It follows that P(x_i > x) = 1-x and P(y_i > y) = 1-y. Obviously, x_i and y_i are expected to be strongly correlated... But if we assume they are not, we might be able to combine these p-values in some way idk...
  3. We can also average energies or something, but that sounds really wrong... unless we do some normalization first, which is cumbersome... Imo the agreement-based approach seems more appropriate.

After your first quantile threshold, 3 possibilities:

  1. No agreement at all amongst the 2 recorders: dead-end
  2. You get agreement for some windows, but not enough of them
  3. Full agreement

The situation you end up in will mainly depend on the quantity of data you have and the threshold you apply. In situations 1) and 2), you could always decide to lower the quantile threshold until you get enough windows for recorder a) and recorder b): not sure this is the ideal solution.
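To make the agreement check and the threshold-lowering fallback concrete, here is a minimal sketch. It assumes q is the energy quantile above which a window is kept, and that the per-window energies of both recorders are already aligned window by window. The function names (`agreeing_windows`, `relax_until_enough`) and the `step` parameter are hypothetical, not part of ChildProject.

```python
import numpy as np

def agreeing_windows(energy_x, energy_y, q):
    """Indices of windows whose energy exceeds the q-th quantile in BOTH recorders."""
    energy_x = np.asarray(energy_x, dtype=float)
    energy_y = np.asarray(energy_y, dtype=float)
    # Each recorder is thresholded against its own energy distribution.
    pass_x = energy_x > np.quantile(energy_x, q)
    pass_y = energy_y > np.quantile(energy_y, q)
    # A window is vetoed as soon as one recorder rejects it.
    return np.flatnonzero(pass_x & pass_y)

def relax_until_enough(energy_x, energy_y, q, n_required, step=0.05):
    """Lower q (hypothetical fallback) until at least n_required windows agree."""
    while True:
        selected = agreeing_windows(energy_x, energy_y, q)
        if len(selected) >= n_required or q <= 0:
            return selected, q
        q = max(q - step, 0.0)

# Worst case: perfectly anti-correlated recorders give no agreement at q = 0.8.
x = np.arange(1, 11)
y = x[::-1]
print(agreeing_windows(x, y, 0.8))        # empty: the "dead-end" situation
print(relax_until_enough(x, y, 0.8, 2))   # q has to drop before 2 windows agree
```

How far q has to drop depends entirely on the data, which is the data-dependence concern raised here.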

The solution that seems the easiest to implement is the one that consists in averaging energies. You can do something like:

  1. Compute E_a, E_b the list of energies for recorder a) and recorder b)
  2. Normalize E_a by: (E_a - mean(E_a)) / std(E_a), same for E_b
  3. Go through your single-channel pipeline: apply quantile threshold, etc ...

This is just one more pre-processing step compared to solutions 1) and 2). Agreement-based solutions seem scary to me as they are too data-dependent, and you'll always find situations where you won't have enough agreement between the 2 recorders. Hence why I'd "combine the 2 recorders" as soon as possible.
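The three steps above (compute energies, z-score each recorder, then run the single-channel threshold) can be sketched as follows. This is only an illustration under assumptions: the energies of both recorders are aligned window by window, the combination is a plain average of the two z-scored series, and the names `zscore` and `combined_selection` are hypothetical.

```python
import numpy as np

def zscore(energies):
    """(E - mean(E)) / std(E); undefined if all energies are equal (std = 0)."""
    energies = np.asarray(energies, dtype=float)
    return (energies - energies.mean()) / energies.std()

def combined_selection(energy_a, energy_b, q):
    """Average the normalized energies, then keep windows above the q-th quantile."""
    combined = (zscore(energy_a) + zscore(energy_b)) / 2
    return np.flatnonzero(combined > np.quantile(combined, q))

# Two recorders on very different scales; window 4 is the loudest in both.
E_a = [1, 2, 3, 4, 100]
E_b = [10, 20, 30, 40, 50]
print(combined_selection(E_a, E_b, 0.8))
```

Because each recorder is rescaled before averaging, neither one dominates just because its raw energies are larger.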

lucasgautheron commented 3 years ago

About the multi-channel discussion :

  • your approach seems fine to me.
  • I think it'd be nice to be able to run the pipeline on only one of the N (N=2 or N=4) channels. In some cases (it may be the case with the BabyLogger), we can put some prior on which mic gets the most speech.

I suggest two different options:

  1. We provide an additional 'channel' option. If specified, only this channel is used to compute the energy
  2. We provide a 'channel_weights' option. It must be a list of length equal to the number of channels. Each channel's energy is then weighted by the corresponding coefficient. One can turn off channels by setting their coefficient to 0.

I prefer the latter, though it is less efficient when the goal is just to turn off channels.
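A rough sketch of what the 'channel_weights' option could do, assuming the per-channel energies are stored as an array of shape (n_channels, n_windows) and that the weighted energies are summed across channels. The function name `weighted_energy` is hypothetical; only the option name comes from the proposal above.

```python
import numpy as np

def weighted_energy(channel_energies, channel_weights):
    """Combine per-channel window energies with one coefficient per channel."""
    channel_energies = np.asarray(channel_energies, dtype=float)
    weights = np.asarray(channel_weights, dtype=float)
    if weights.shape != (channel_energies.shape[0],):
        raise ValueError("channel_weights must have one entry per channel")
    # Weighted sum over channels; a weight of 0 turns a channel off.
    return weights @ channel_energies

energies = [[1, 2, 3], [10, 20, 30]]          # 2 channels, 3 windows
print(weighted_energy(energies, [1, 0]))      # only the first channel is used
print(weighted_energy(energies, [0.5, 0.5]))  # equal weighting
```

Setting a weight to 0 still requires the energy of that channel to be computed, which is the inefficiency mentioned above compared with a dedicated 'channel' option.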

lucasgautheron commented 3 years ago

combining several recorders: Let x_i and y_i be the fractions of windows having a higher energy than window i, for recorder x and recorder y respectively. There are several possibilities:

  1. Check the agreement between devices ? (i.e. veto windows that do not pass the threshold in one of the devices). This leads to the condition (1-x_i > 1-q) and (1-y_i > 1-q).
  2. Find a way to combine the energy quantiles of each (window, recorder). These can be computed independently w/o any difficulty for the recorders x and y. By definition, P(x_i < x) = x and P(y_i < y) = y (they follow U(0,1)). It follows that P(x_i > x) = 1-x and P(y_i > y) = 1-y. Obviously, x_i and y_i are expected to be strongly correlated... But if we assume they are not, we might be able to combine these p-values in some way idk...
  3. We can also average energies or something, but that sounds really wrong... unless we do some normalization first, which is cumbersome... Imo the agreement-based approach seems more appropriate.

After your first quantile threshold, 3 possibilities :

  1. No agreement at all amongst the 2 recorders : dead-end
  2. You get agreement for some windows, but not enough of them
  3. Full agreement

The situation you end up in will mainly depend on the quantity of data you have and the threshold you apply. In situations 1) and 2), you could always decide to lower the quantile threshold until you get enough windows for recorder a) and recorder b): not sure this is the ideal solution.

The solution that seems the easiest to implement is the one that consists in averaging energies. You can do something like:

  1. Compute E_a, E_b the list of energies for recorder a) and recorder b)
  2. Normalize E_a by : (E_a - mean(E_a)) / std(E_a), same for E_b
  3. Go through your single-channel pipeline : apply quantile threshold, etc ...

This is just one more pre-processing step compared to solutions 1) and 2). Agreement-based solutions seem scary to me as they are too data-dependent, and you'll always find situations where you won't have enough agreement between the 2 recorders. Hence why I'd "combine the 2 recorders" as soon as possible.

I see... Though, I am more optimistic than you. I would expect high correlations between the two recorders. Besides, the distribution of energies is very flat. Here is the distribution for 600 windows of 30 s each, drawn from a 20 h USB recording. The energy spans 5 orders of magnitude.

[Screenshot, 2021-02-17: distribution of window energies]
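For reference, per-window energies like the ones in this distribution could be obtained along these lines. This is a sketch under assumptions, not the package's code: the energy of a window is taken to be the sum of squared samples, windows are drawn at random without overlap, and `window_energies` and its parameters are hypothetical names.

```python
import numpy as np

def window_energies(signal, sample_rate, window_seconds, n_windows, seed=0):
    """Draw non-overlapping windows at random and return (onsets, energies)."""
    signal = np.asarray(signal, dtype=float)
    window_length = int(window_seconds * sample_rate)
    n_available = len(signal) // window_length
    rng = np.random.default_rng(seed)
    # Pick distinct window slots so that no two windows overlap.
    chosen = np.sort(rng.choice(n_available, size=min(n_windows, n_available), replace=False))
    energies = np.array([
        np.sum(signal[i * window_length:(i + 1) * window_length] ** 2)
        for i in chosen
    ])
    return chosen * window_length, energies

# Toy signal: 100 s at 10 Hz with constant amplitude 1, split into 10 s windows.
onsets, energies = window_energies(np.ones(1000), sample_rate=10, window_seconds=10, n_windows=4)
print(onsets, energies)
# Spread of the distribution in orders of magnitude (0 for this constant signal).
print(np.log10(energies.max() / energies.min()))
```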

In the end, it is not very difficult to switch from one approach to the other once we have the core pipeline ready. So from now on, I'll implement the method you suggest, but the recorder-agreement thing will live outside of the package, because that seems way too specific, though it will use the package to compute the energies.

There is just one thing I need to think about : whether this z-score normalization really fits. I might be overthinking this...