Shahabks / my-voice-analysis

My-Voice Analysis is a Python library for the analysis of voice (simultaneous speech, high entropy) without the need for a transcription. It breaks utterances into segments and detects syllable boundaries, fundamental frequency contours, and formants.
https://shahabks.github.io/my-voice-analysis/
MIT License
289 stars, 90 forks

Pronunciation Scoring #9

Open dpny518 opened 4 years ago

dpny518 commented 4 years ago

How is the pronunciation scored without text and alignment?

Witt, S. M. and Young, S. J. (2000), “Phone-level pronunciation scoring and assessment for interactive language learning,” Speech Communication 30, 95-108.

The method in that paper requires the constrained phone loop.
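For context, the Goodness of Pronunciation (GOP) score from the cited paper is the absolute log-likelihood ratio between the expected phone and the best-scoring phone from a phone loop, normalized by the segment's frame count. A minimal sketch, assuming both log-likelihoods have already been produced by some recognizer; the function and argument names are hypothetical and are not part of My-Voice-Analysis:

```python
def goodness_of_pronunciation(forced_loglik, loop_loglik, n_frames):
    """GOP-style score for one phone segment.

    forced_loglik: log-likelihood of the segment given the expected phone
                   (from forced alignment against the known text).
    loop_loglik:   log-likelihood of the same segment under the best phone
                   found by the phone loop.
    n_frames:      number of acoustic frames in the segment.

    All three inputs are assumed to come from an external recognizer.
    """
    if n_frames <= 0:
        raise ValueError("n_frames must be positive")
    # |log( p(O | expected) / max_q p(O | q) )| / number of frames
    return abs(forced_loglik - loop_loglik) / n_frames
```

Both inputs depend on a transcription and an alignment, which is why the question arises for a library that works without text.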

Shahabks commented 3 years ago

@yondu22 My-Voice-Analysis and MYprosody are two encapsulated libraries extracted from one of our main projects on speech scoring. The early version of the main project employed ASR and used the Hidden Markov Model framework to train simple Gaussian acoustic models for each phoneme of each speaker in the available audio datasets, and then calculated the symmetric K-L divergence for each pair of models for each speaker. What you see in these repos is just an approximation of those models, one that focuses on fluency rather than on the accuracy of each phoneme.

For the project's machine learning model, we considered audio files of speakers who possessed an appropriate degree of pronunciation, either in general or for a specific utterance, word, or phoneme (in effect, they had been rated by expert human graders). The figure below illustrates some of the factors that the expert human graders considered when assigning an overall score.

[Figure: factors considered by expert human graders when assigning an overall score]

Witt, S. M. (2012), “Automatic error detection in pronunciation training: Where we are and where we need to go.”
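The symmetric K-L divergence step mentioned above can be illustrated with a short sketch. It assumes each phoneme model is a single Gaussian with a diagonal covariance, stored as a (mean, variance) pair; that storage format, the function names, and the phoneme labels in the usage example are assumptions for illustration and are not taken from the project's code.

```python
import numpy as np


def kl_diag_gaussian(mu_p, var_p, mu_q, var_q):
    """KL(p || q) for two Gaussians with diagonal covariances."""
    mu_p, var_p, mu_q, var_q = (
        np.asarray(a, dtype=float) for a in (mu_p, var_p, mu_q, var_q)
    )
    return 0.5 * float(
        np.sum(
            var_p / var_q
            + (mu_q - mu_p) ** 2 / var_q
            - 1.0
            + np.log(var_q / var_p)
        )
    )


def symmetric_kl(mu_p, var_p, mu_q, var_q):
    """Symmetric divergence: KL(p || q) + KL(q || p)."""
    return kl_diag_gaussian(mu_p, var_p, mu_q, var_q) + kl_diag_gaussian(
        mu_q, var_q, mu_p, var_p
    )


def pairwise_symmetric_kl(models):
    """Symmetric KL for every unordered pair of phoneme models.

    models: dict mapping a phoneme label to a (mean, variance) pair.
    Returns a dict keyed by (label_a, label_b) with label_a < label_b.
    """
    labels = sorted(models)
    result = {}
    for i, a in enumerate(labels):
        for b in labels[i + 1:]:
            result[(a, b)] = symmetric_kl(*models[a], *models[b])
    return result


# Hypothetical two-dimensional models for one speaker (made-up numbers).
speaker_models = {
    "aa": ([0.0, 1.0], [1.0, 2.0]),
    "iy": ([1.0, 1.0], [1.0, 2.0]),
    "uw": ([0.0, 3.0], [2.0, 2.0]),
}
divergences = pairwise_symmetric_kl(speaker_models)
```

A larger value means the two phoneme models are further apart for that speaker. Some authors average the two directions instead of summing them, which only rescales the result by one half.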