Open fred2nice opened 7 years ago
If I understand you correctly, it seems like you have two options. Neither really relies on any features built into pocketsphinx.
@fred2nice Speech recognition works by combining two models: an acoustic model (describing the sounds of phonemes) and a language model (describing which words can be recognized, and in which order).
You cannot get viable results without a language model. The language model is not there just to "correct" the text; it is part of how the recognizer picks words in the first place.
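To make that concrete, here is a toy sketch of how a decoder weighs the two models against each other. All the probabilities are made up for illustration and the `decode` helper is hypothetical; this is not pocketsphinx code, just the general idea of adding an acoustic log-score to a language-model log-score and picking the best candidate.

```python
import math

# Hypothetical acoustic log-likelihoods: how well the audio matches each word.
# "beers" and "bears" sound alike, so the two scores are close.
ACOUSTIC_LOGP = {"beers": math.log(0.52), "bears": math.log(0.48)}

# Hypothetical language-model log-probabilities of a word given the previous word.
LM_LOGP = {
    ("polar", "bears"): math.log(0.9),
    ("polar", "beers"): math.log(0.1),
    ("cold", "bears"): math.log(0.2),
    ("cold", "beers"): math.log(0.8),
}


def decode(previous_word, lm_weight=1.0):
    """Return the candidate word with the highest combined score."""
    def score(word):
        return ACOUSTIC_LOGP[word] + lm_weight * LM_LOGP[(previous_word, word)]
    return max(ACOUSTIC_LOGP, key=score)


print(decode("polar"))                 # language model favours "bears"
print(decode("cold"))                  # language model favours "beers"
print(decode("polar", lm_weight=0.0))  # acoustic score alone decides
```

With the language model switched off (`lm_weight=0.0`), the choice rests entirely on a thin acoustic margin, which is why similar-sounding words are hard to tell apart from the audio alone.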
If you want to use speech technology for pronunciation assessment, there are plenty of academic resources on the topic, and I would suggest taking a look at them. I can also recommend Ispikit (https://ispikit.com), which I built.
Hi, I am working on a pronunciation trainer. I would like to output the "raw" text, without correction.
I want the student to see the word that was actually spoken, not the closest matching word. Is it possible to transcribe the audio to text directly?
An example: if the student says "beers", I would like to print "beers", not "bears", as the result.