jcvasquezc / DisVoice

feature extraction from speech signals
https://disvoice.readthedocs.io/en/latest/
MIT License

Preprocessing before feature extraction #22

Closed bwang482 closed 3 years ago

bwang482 commented 3 years ago

Hi @jcvasquezc thanks again for the great lib!

I am wondering whether I should perform any data preprocessing before feeding the audio to extract_features_file. My audio files are utterances (> 2 secs), mostly one per speaker (sometimes one contains a second speaker saying "yes" or "um"), but there is a loudness difference between the two speakers' utterances. Do you suggest I scale the audio waveforms to (-1, +1), save the audio files, and then feed them to the feature extractors?

The downstream task is classification, so I didn't want to complicate it with more advanced preprocessing. Min-max scaling seems sufficient; do you think so?

jcvasquezc commented 3 years ago

Hi @bluemonk482

The audio level normalization to the range [-1, 1] is already performed before the feature extraction, so you don't need to do it yourself.

https://github.com/jcvasquezc/DisVoice/blob/2d8b48ebd667b78e65ec48eb667e64febcfcfe44/phonation/phonation.py#L166
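For readers who want to see what this kind of normalization amounts to, here is a minimal sketch of peak normalization with NumPy. The helper name `peak_normalize` is hypothetical and is not part of DisVoice; the linked line in `phonation.py` remains the authoritative reference for what the library actually does.

```python
import numpy as np


def peak_normalize(signal):
    """Hypothetical helper: remove the DC offset, then scale the signal
    so that its largest absolute sample equals 1."""
    x = np.asarray(signal, dtype=np.float64)
    if x.size == 0:
        return x
    x = x - np.mean(x)           # remove the DC offset
    peak = np.max(np.abs(x))
    if peak == 0:                # silent signal: nothing to scale
        return x
    return x / peak


# The same waveform recorded at two different levels.
t = np.linspace(0, 1, 1000, endpoint=False)
quiet = 0.1 * np.sin(2 * np.pi * 5 * t)
loud = 8.0 * quiet

# After normalization both versions should coincide.
same = np.allclose(peak_normalize(quiet), peak_normalize(loud))
```

Note that this kind of scaling is applied to the file as a whole. It removes level differences between files, but the relative loudness of two speakers inside the same file stays unchanged.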

If the second speaker's "yes" or "um" has enough energy to be audible or to overlap with the main speaker's speech, I recommend cutting it out beforehand. Otherwise, you can just extract the features from the speech files as they are.
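Cutting an interjection out of a recording could look roughly like the sketch below. It assumes the start and end times of the unwanted segment are already known (for example from manual annotation); `remove_segment` is a hypothetical helper, not a DisVoice function.

```python
import numpy as np


def remove_segment(signal, fs, start_s, end_s):
    """Hypothetical helper: return a copy of `signal` with the samples
    between `start_s` and `end_s` (in seconds) removed.
    `fs` is the sampling rate in Hz."""
    x = np.asarray(signal)
    start = max(0, int(round(start_s * fs)))
    end = min(len(x), int(round(end_s * fs)))
    if end <= start:             # empty or invalid span: nothing to cut
        return x.copy()
    return np.concatenate([x[:start], x[end:]])


# Example: drop 0.5 s (from 1.0 s to 1.5 s) out of a 3 s signal at 16 kHz.
fs = 16000
audio = np.arange(3 * fs)
trimmed = remove_segment(audio, fs, 1.0, 1.5)
```

The trimmed array can then be saved to a new WAV file (for instance with `scipy.io.wavfile.write`) and passed to `extract_features_file` as usual. A hard cut like this can leave an audible discontinuity at the join, so it is worth listening to the result.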

bwang482 commented 3 years ago

Thanks very much @jcvasquezc !

Can I please also confirm with you that the phonological feature extractor was trained on Spanish, not English?

jcvasquezc commented 3 years ago

Yes, the phonological feature extractor was trained on Spanish data.

bwang482 commented 3 years ago

Thanks @jcvasquezc! Would it be possible to obtain a model trained on English? ...

jcvasquezc commented 3 years ago

Yes, I hope that the next update will include models for English and German.

bwang482 commented 3 years ago

Thanks a lot @jcvasquezc !

Look forward to the update! Hopefully soon 👍