Closed lucasgautheron closed 3 years ago
About the multi-channel discussion :
- your approach seems fine to me.
- I think it'd be nice to be able to run the pipeline on only one of the N (N=2 or N=4) channels. In some cases (it may be the case with the BabyLogger), we can put some prior on which mic gets the most speech.
I suggest two different options:
I prefer the latter, though it is less efficient when the goal is just to turn off channels.
Combining several recorders: let x_i and y_i be the fractions of windows having a higher energy than window i, for recorder x and recorder y respectively. There are several possibilities:
- Check the agreement between devices (i.e. veto windows that do not pass the threshold in one of the devices). This leads to the condition (1-x_i > 1-q) and (1-y_i > 1-q), i.e. x_i < q and y_i < q.
- Find a way to combine the energy quantiles of each (window, recorder). These can be computed independently w/o any difficulty for the recorders x and y. By definition, P(x_i < x) = x and P(y_i < y) = y (they follow U(0,1)). It follows that P(x_i > x) = 1-x and P(y_i > y) = 1-y. Obviously, x_i and y_i are expected to be strongly correlated... But if we assume they are not, we might be able to combine these p-values in some way idk...
- We can also average energies or something, but that sounds really wrong... unless we do some normalization first, which is cumbersome... Imo the agreement-based approach seems more appropriate.
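To make the agreement-based option concrete, here is a minimal sketch, assuming the per-window energies of both recorders are already available as two aligned lists. The function names (`exceedance_fractions`, `agreeing_windows`) are hypothetical and not part of the package; `q` is taken to be the fraction of top-energy windows to keep.

```python
from bisect import bisect_right


def exceedance_fractions(energies):
    """For each window i, return the fraction of windows with strictly higher energy."""
    sorted_energies = sorted(energies)
    n = len(sorted_energies)
    # bisect_right counts the windows with energy <= e, so n minus that count
    # is the number of windows with strictly higher energy.
    return [(n - bisect_right(sorted_energies, e)) / n for e in energies]


def agreeing_windows(energies_x, energies_y, q):
    """Indices of windows that pass the threshold on BOTH recorders (x_i < q and y_i < q)."""
    x = exceedance_fractions(energies_x)
    y = exceedance_fractions(energies_y)
    return [i for i, (xi, yi) in enumerate(zip(x, y)) if xi < q and yi < q]


# Toy energies for 5 windows (made-up numbers, for illustration only)
energies_x = [0.1, 5.0, 0.3, 9.0, 0.2]
energies_y = [0.2, 4.0, 8.0, 7.0, 0.1]
selected = agreeing_windows(energies_x, energies_y, q=0.3)
print(selected)
```

With these toy numbers only the window that is loud on both recorders survives, which illustrates how quickly the veto can shrink the selection.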
After your first quantile threshold, 3 possibilities :
- No agreement at all amongst the 2 recorders : dead-end
- You get agreement for some windows, but not enough of them
- Full agreement
Which situation you end up in will mainly depend on the quantity of data you have and the threshold you apply. In situations 1) and 2), you could always decide to lower the quantile threshold until you get enough windows for recorder a) and recorder b): not sure this is the ideal solution.
The solution that seems easiest to implement is averaging the energies. You can do something like:
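The "relax the threshold until enough windows agree" fallback could look roughly like the sketch below. It assumes the threshold is expressed as `q`, the fraction of top-energy windows kept per recorder, so relaxing it means increasing `q`. The function names, the `step` size and the `min_windows` parameter are all hypothetical.

```python
from bisect import bisect_right


def exceedance_fractions(energies):
    """Fraction of windows with strictly higher energy, for each window."""
    sorted_energies = sorted(energies)
    n = len(sorted_energies)
    return [(n - bisect_right(sorted_energies, e)) / n for e in energies]


def relax_until_enough(energies_a, energies_b, q_start, min_windows, step=0.05):
    """Increase q from q_start until at least min_windows windows pass on both recorders.

    Returns (q, selected_indices). q is capped at 1.0, where every window passes.
    Assumes step > 0.
    """
    a = exceedance_fractions(energies_a)
    b = exceedance_fractions(energies_b)
    k = 0
    while True:
        q = min(1.0, q_start + k * step)
        selected = [i for i, (ai, bi) in enumerate(zip(a, b)) if ai < q and bi < q]
        if len(selected) >= min_windows or q >= 1.0:
            return q, selected
        k += 1


# Toy energies for 5 windows (made-up numbers, for illustration only)
energies_a = [0.1, 5.0, 0.3, 9.0, 0.2]
energies_b = [0.2, 4.0, 8.0, 7.0, 0.1]
q, selected = relax_until_enough(energies_a, energies_b, q_start=0.05, min_windows=2, step=0.1)
print(q, selected)
```

Note that the final `q` depends on the data, which is exactly the data-dependence concern raised here.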
- Compute E_a, E_b the list of energies for recorder a) and recorder b)
- Normalize E_a by : (E_a - mean(E_a)) / std(E_a), same for E_b
- Go through your single-channel pipeline : apply quantile threshold, etc ...
This is just one more pre-processing step compared to solutions 1) and 2). Agreement-based solutions seem scary to me as they are too data-dependent, and you'll always find situations where you won't have enough agreement between the 2 recorders. Hence why I'd "combine the 2 recorders" as soon as possible.
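The three steps above can be sketched as follows. This is only an illustration under assumptions: the energies are given as aligned per-window lists, the population standard deviation is used (the comment does not say which one), and the helper names (`zscore`, `combine_recorders`, `top_fraction`) are hypothetical rather than the package's API.

```python
from statistics import mean, pstdev


def zscore(values):
    """Normalize values as (v - mean) / std. Assumes the values are not all identical."""
    mu = mean(values)
    sigma = pstdev(values)  # population std; using the sample std is an equally valid choice
    return [(v - mu) / sigma for v in values]


def combine_recorders(energies_a, energies_b):
    """Average the z-scored energies of two recorders, window by window."""
    za, zb = zscore(energies_a), zscore(energies_b)
    return [(x + y) / 2 for x, y in zip(za, zb)]


def top_fraction(scores, q):
    """Single-channel step: indices of the fraction q of windows with the highest score."""
    n_keep = max(1, int(q * len(scores)))
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:n_keep])


# Toy energies for 5 windows (made-up numbers, for illustration only)
energies_a = [0.1, 5.0, 0.3, 9.0, 0.2]
energies_b = [0.2, 4.0, 8.0, 7.0, 0.1]
combined = combine_recorders(energies_a, energies_b)
selected = top_fraction(combined, q=0.4)
print(selected)
```

Unlike the agreement-based veto, this always returns the requested number of windows, since the two recorders are merged into one score before thresholding.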
I see... Though I am more optimistic than you. I would expect high correlations between the two recorders. Besides, the distribution of energies is very flat. Here is the distribution for 600 30s windows drawn from a 20h USB recording. The energy spans 5 orders of magnitude.
In the end, it is not very difficult to switch from one way to the other once we have the core pipeline ready. So from now on, I'll implement the method you suggest, but the recorder-agreement thing will lie outside of the package, because that seems way too specific, though it will use the package to compute the energies.
There is just one thing I need to think about : whether this z-score normalization really fits. I might be overthinking this...
Description
Progress
- with an optional lowpass filter. If the lowpass filter does not have any significant impact, we might as well just get rid of it, then we need no FFT... Opted for an optional band-pass filter.
- Handling saturation: this is especially important since targeting higher energies will favor saturated regions. We could discard regions with > 1 % of saturating signal. Though, what should we do in case there are > 1 channels? Discard the channels, and the whole segment if all channels are saturating? Or should we discard a segment when at least one of the channels is saturating? Marvin: let's keep it in mind, BUT no treatment for now.
- x percent of windows with the highest energy. What do you think? Discussion here.
Associated issues
109