Shahabks / myprosody

A Python library for measuring the acoustic features of speech (simultaneous speech, high entropy) compared to those of native speech.
https://shahabks.github.io/myprosody/
MIT License
232 stars 63 forks

Definition of Fillers and Pauses #28

Open ajeevanshgtm7 opened 2 years ago

ajeevanshgtm7 commented 2 years ago

I couldn't find a specific definition for the fillers and pauses parameter in the pipeline or in the docs. Can anyone please explain how this feature is calculated/derived? Thanks!

Shahabks commented 2 years ago

The acoustic characteristics of filled pauses include duration, F0 variation, F0 height, variability in formants F1 through F3, and overall stability. Compared with other syllables, filled pauses tend to be longer, show less F0 variation, have a lower F0, and show less F1-F3 variability. In short, filled pauses tend to be long, stable syllables pronounced at a low pitch. They are usually pronounced as a schwa ([ə]); in American English the sound may be closer to an open-mid back unrounded vowel ([ʌ]).

As far as silent pauses are concerned, an empirical study of human transcripts of the speech recordings pointed to a threshold of 250 ms: silences of about that length or longer are treated as silent pauses. @ajeevanshgtm7
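To make the 250 ms silent-pause threshold concrete, here is a minimal sketch of how such a rule could be applied to a mono waveform. It is not taken from myprosody's code: the function name `find_silent_pauses`, the 10 ms frame size, and the -35 dB silence level (relative to the loudest frame) are all assumptions chosen for illustration. Only the 250 ms minimum duration comes from the explanation above.

```python
import numpy as np


def find_silent_pauses(signal, sample_rate, frame_ms=10,
                       silence_db=-35.0, min_pause_ms=250):
    """Return (start_s, end_s) tuples for silent stretches >= min_pause_ms.

    Hypothetical helper: frame_ms and silence_db are illustrative
    assumptions; only min_pause_ms=250 reflects the thread above.
    """
    signal = np.asarray(signal, dtype=float)
    frame_len = int(sample_rate * frame_ms / 1000)
    n_frames = len(signal) // frame_len
    if n_frames == 0:
        return []

    # Split into non-overlapping frames and compute RMS energy per frame.
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))

    # Express each frame's level in dB relative to the loudest frame.
    peak = rms.max() or 1.0  # avoid dividing by zero on all-zero input
    level_db = 20 * np.log10(np.maximum(rms, 1e-12) / peak)
    silent = level_db < silence_db

    # Collect runs of consecutive silent frames that are long enough.
    pauses = []
    start = None
    for i, is_silent in enumerate(np.append(silent, False)):
        if is_silent and start is None:
            start = i
        elif not is_silent and start is not None:
            if (i - start) * frame_ms >= min_pause_ms:
                pauses.append((start * frame_ms / 1000, i * frame_ms / 1000))
            start = None
    return pauses
```

With this sketch, a 300 ms gap between two stretches of speech would be reported as a silent pause, while a 100 ms gap would be ignored. Detecting filled pauses would additionally need F0 and formant tracks, which this example does not attempt.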