alexis-michaud / egg

electroglottography
GNU Lesser General Public License v3.0

Dealing with 'hill pattern' glottal cycles inside a fully voiced rhyme #9

Closed MinhChauNGUYEN closed 5 years ago

MinhChauNGUYEN commented 5 years ago

[image]

Peakdet cannot detect the five glottal cycles in the middle of a rhyme where the intensity of voicing decreases abruptly, as in the figure above. The five small cycles are all contained inside the 7th detected cycle. This leads to a spurious f0 value, as if the glottis did not open and close at all during that interval. The reason, as can be seen on the DEGG signal, is that these small cycles have no well-defined peaks: the oscillations take place between two clear closing peaks. They have a 'hill pattern', like the one found at soft offset of voicing (see issue #1).

[image: Method_PeakDet_PetitPeaks_M14_1541_DEGG]

As a consequence, the f0 curve contains one spurious, extremely low value (under 30 Hz), whereas all other values are above 120 Hz. If one looked only at the results of peakdet, without referring to the audio, EGG and DEGG signals, this could easily be mistaken for a case of glottalization.

[image: Method_PeakDet_PetitPeaks_M14_1541_f0Oq]

How to deal with this? Here is a suggestion for a solution.

It is not a signal error (artefact): in view of the audio, it seems clear that something really happens in the larynx, such as a change of phonation mechanism (from m1 to m2: 'chest voice' to 'head voice'), maybe also with some movement of the whole larynx. Although there are no precise peaks for either closing or opening, the electroglottograph still captures this phenomenon: four small oscillations, which probably correspond to five glottal cycles. Therefore, PeakDet should offer a way to add cycles where necessary. The user would place them approximately, by looking at the signal.

PS Here is a link to the signals for this example: https://www.dropbox.com/s/yctplk9u9yrzq33/Method_PeakDet_PetitPeaks_M14_1541_audio.wav?dl=0

alexis-michaud commented 5 years ago

Interesting situation. What may be happening here? Rapid larynx movement, with a transition from phonation mechanism m1 to phonation mechanism m2? Rapid decrease in the amplitude of the modulation in vocal fold contact area could lead to losing the signal. It is reminiscent of signal loss (or sudden reduction in signal amplitude) in rising tones in Vietnamese data (D1, sắc) by male speakers.

Looking at the audio signal, it is clear that this is not just an artefact of EGG recording, since the audio signal, too, has much reduced amplitude at that point.

The functionality has now been added to peakdet_inter. The prompt shown to the user should be self-explanatory:

disp('If some of the glottal cycles went undetected, or extra cycles were erroneously detected:')
disp('- enter 1 (one) to change the settings for automatic detection, or')
disp('- enter 2 to make changes manually (by visual detection of cycles not detected by the script).')

The f0 value for each cycle is simply the f0 of the spurious cycle (cycle 7 in your example) multiplied by the number of cycles detected visually by the user (in your example: five cycles).
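As a rough illustration of that arithmetic, here is a sketch in Python (not the MATLAB code of peakdet_inter). The function name `split_cycle` and its arguments are made up for this example, and giving all sub-cycles the same length is an assumption that follows from assigning them the same f0.

```python
def split_cycle(t_start, t_end, n_cycles):
    """Split one over-long cycle into n_cycles equal sub-cycles.

    Times are in seconds. Returns a list of (start, end, f0) tuples,
    where f0 (Hz) is the f0 of the original cycle times n_cycles.
    Hypothetical helper, for illustration only.
    """
    if n_cycles < 1:
        raise ValueError("n_cycles must be at least 1")
    duration = t_end - t_start
    if duration <= 0:
        raise ValueError("t_end must be later than t_start")
    spurious_f0 = 1.0 / duration          # f0 of the undivided cycle
    new_f0 = spurious_f0 * n_cycles       # same value for every sub-cycle
    step = duration / n_cycles
    return [
        (t_start + i * step, t_start + (i + 1) * step, new_f0)
        for i in range(n_cycles)
    ]


# Illustrative numbers (not taken from the recording):
# a 40 ms "cycle" split into five sub-cycles.
for start, end, f0 in split_cycle(0.100, 0.140, 5):
    print(f"{start:.3f} {end:.3f} {f0:.1f}")
```

Because every sub-cycle receives the same f0, the resulting stretch of the f0 curve is flat by construction.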

This functionality is to be used carefully, of course.

The added cycles can be detected automatically because only their first 3 columns (beginning of cycle, end of cycle, and f0) contain values; the remaining columns contain zeros.
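A sketch of how such rows could be picked out when post-processing the results, written in Python with the results held as a list of rows. The function names and the sample numbers are invented for illustration, and the number and meaning of the columns after the third are not spelled out here, so the layout is an assumption.

```python
def is_added_cycle(row):
    """True if a results row has data only in its first three columns.

    Assumed layout: [cycle start, cycle end, f0, ...other measurements].
    """
    if len(row) <= 3:
        return False  # no extra columns to inspect, so we cannot tell
    return all(value == 0 for value in row[3:])


def added_cycle_indices(rows):
    """Indices (0-based) of the rows that look like manually added cycles."""
    return [i for i, row in enumerate(rows) if is_added_cycle(row)]


# Made-up rows: two detected cycles around two manually added ones.
results = [
    [0.100, 0.107, 142.9, 48.0, 51.0],
    [0.107, 0.115, 125.0, 0, 0],
    [0.115, 0.123, 125.0, 0, 0],
    [0.123, 0.130, 142.9, 47.0, 50.0],
]
print(added_cycle_indices(results))
```

One caveat: a cycle found by automatic detection whose remaining measurements all happen to be zero would be flagged too, so this check is only as reliable as that convention.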

Please test & send feedback!

MinhChauNGUYEN commented 5 years ago

Here is the output after I applied your latest version. It detected 5 cycles but put them all at the same value: 150 Hz.

[image: Method_PeakDet_PetitPeaks_M14_1541_f0_af2]

I don't know whether this is better than manually changing the f0 value of cycle 7, as follows.

[image: Method_PeakDet_PetitPeaks_M14_1541_f0_af]

alexis-michaud commented 5 years ago

A difficulty with treating cycle 7 as one cycle (just changing its f0 to a 'reasonable' rule-of-thumb value) is that the time codes for the cycles in the results file will contain a gap: the duration of cycle 7 will not match the f0. So the image of the syllable rhyme that is provided in the results file will be pretty different from what listeners hear.
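To make the mismatch concrete: for a single cycle, f0 is the reciprocal of its duration, so overriding the f0 without touching the time codes leaves the two out of step. A small Python check shows this; the function name, the tolerance and the numbers are illustrative assumptions, not part of Peakdet.

```python
def f0_matches_duration(t_start, t_end, f0, rel_tol=0.05):
    """Check that f0 (Hz) agrees with 1 / cycle duration (s) within rel_tol."""
    duration = t_end - t_start
    if duration <= 0 or f0 <= 0:
        raise ValueError("duration and f0 must be positive")
    expected_f0 = 1.0 / duration
    return abs(f0 - expected_f0) <= rel_tol * expected_f0


# A 40 ms cycle corresponds to 25 Hz. Relabelling it as 125 Hz without
# changing its start and end times is inconsistent with its duration.
print(f0_matches_duration(0.100, 0.140, 25.0))
print(f0_matches_duration(0.100, 0.140, 125.0))
```

Splitting the long cycle into several shorter ones avoids the problem, since each sub-cycle then has a duration that corresponds to its f0.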

Intuitively, having as many cycles in the results file as you can detect by eye is the intended result, and it makes sense (in a way that is to be investigated further, and explained carefully, of course). One way to double-check that this 'hackaround' matches the data (noting that the extra cycles are not on a par with the other cycles: this has to be taken into account, of course) is to look at the electroglottographic signal as divided into cycles through the 'hackaround'.

Using your example: here is what the division into cycles looks like if cycle 7 is split into 5 petit cycles (I like your label 'PetitPeaks' for this issue!)

[image]

and here is the corresponding dEGG signal, with the same time marks:

[image]

To me, this looks OK, and is better than manually setting the f0 of cycle 7 to five times the originally detected f0. The artificially flat sequence of cycles with identical f0 does not look 'natural', but keeping in mind that this method is a 'patch', and that the small cycles should be treated as different from the others in very important qualitative respects, there is no obvious benefit in trying further refinements to obtain a smoother curve. (Of course, if there are principled reasons to do further refinements, that would be good to know, and worth investigating.)

alexis-michaud commented 5 years ago

@MinhChauNGUYEN If the current version of Peakdet (i.e. after this change) works OK, please let me know so I can bump versions (making the current version into v1.0.4) and close this issue.

No hurry: feedback within 2 weeks (& bug reports if something is not working) would be fine.

MinhChauNGUYEN commented 5 years ago

It works well :thumbsup: Thanks.

alexis-michaud commented 5 years ago

OK, good! v1.0.4 it is, then.