alphacep / vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Apache License 2.0

German model "vosk-model-de-tuda-0.6-900k" does not work #1208

Closed by ralf3u 1 year ago

ralf3u commented 1 year ago

English works. French works with the model vosk-model-fr-0.22. German works with the model vosk-model-de-0.21.

But German does not work with the model vosk-model-de-tuda-0.6-900k, which I downloaded from https://alphacephei.com/vosk/models. This is the command:

vosk-transcriber -m ~/Downloads/vosk-model-de-tuda-0.6-900k -i hello4.wav -o test1.1-de.txt

And this is the result:

LOG (VoskAPI:ReadDataFiles():model.cc:213) Decoding params beam=13 max-active=7000 lattice-beam=6
LOG (VoskAPI:ReadDataFiles():model.cc:216) Silence phones 1:2:3:4:5:6:7:8:9:10
LOG (VoskAPI:RemoveOrphanNodes():nnet-nnet.cc:948) Removed 1 orphan nodes.
LOG (VoskAPI:RemoveOrphanComponents():nnet-nnet.cc:847) Removing 2 orphan components.
LOG (VoskAPI:Collapse():nnet-utils.cc:1488) Added 1 components, removed 2
LOG (VoskAPI:ReadDataFiles():model.cc:248) Loading i-vector extractor from /home/t/Downloads/vosk-model-de-tuda-0.6-900k/ivector/final.ie
LOG (VoskAPI:ComputeDerivedVars():ivector-extractor.cc:183) Computing derived variables for iVector extractor
LOG (VoskAPI:ComputeDerivedVars():ivector-extractor.cc:204) Done.
LOG (VoskAPI:ReadDataFiles():model.cc:279) Loading HCLG from /home/t/Downloads/vosk-model-de-tuda-0.6-900k/graph/HCLG.fst
LOG (VoskAPI:ReadDataFiles():model.cc:294) Loading words from /home/t/Downloads/vosk-model-de-tuda-0.6-900k/graph/words.txt
LOG (VoskAPI:ReadDataFiles():model.cc:303) Loading winfo /home/t/Downloads/vosk-model-de-tuda-0.6-900k/graph/phones/word_boundary.int
LOG (VoskAPI:ReadDataFiles():model.cc:310) Loading subtract G.fst model from /home/t/Downloads/vosk-model-de-tuda-0.6-900k/rescore/G.fst
LOG (VoskAPI:ReadDataFiles():model.cc:312) Loading CARPA model from /home/t/Downloads/vosk-model-de-tuda-0.6-900k/rescore/G.carpa
LOG (VoskAPI:ReadDataFiles():model.cc:318) Loading RNNLM model from /home/t/Downloads/vosk-model-de-tuda-0.6-900k/rnnlm/final.raw
Killed

nshmyrev commented 1 year ago

It runs out of memory and gets killed by the kernel. You probably need 16 GB to run it.

ralf3u commented 1 year ago

> It runs out of memory and gets killed by the kernel. You probably need 16 GB to run it.

For an audio file that is only 2 seconds long?

nshmyrev commented 1 year ago

Memory usage doesn't depend on the file length; the model itself is big.
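To see which parts of a model account for its size, one option is to total the on-disk size of each top-level folder in the model directory. The sketch below uses only the Python standard library. The helper names `tree_size`, `folder_sizes` and `format_report` are made up for illustration and are not part of Vosk. On-disk size is only a rough proxy: the memory actually used after loading can differ.

```python
import os


def tree_size(path):
    """Total size in bytes of all regular files below path (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def folder_sizes(model_dir):
    """Map each top-level entry of model_dir to its size in bytes."""
    sizes = {}
    with os.scandir(model_dir) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                sizes[entry.name] = tree_size(entry.path)
            elif entry.is_file(follow_symlinks=False):
                sizes[entry.name] = entry.stat().st_size
    return sizes


def format_report(sizes):
    """Return report lines, largest entry first, sizes in GiB."""
    ordered = sorted(sizes.items(), key=lambda item: item[1], reverse=True)
    lines = [f"{size / 2**30:8.2f} GiB  {name}" for name, size in ordered]
    lines.append(f"{sum(sizes.values()) / 2**30:8.2f} GiB  total on disk")
    return lines
```

Calling `print("\n".join(format_report(folder_sizes(path))))` with the path of the unpacked model would list the folders (for example `graph`, `rescore`, `rnnlm`) from largest to smallest.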

ralf3u commented 1 year ago

> Memory usage doesn't depend on the file length; the model itself is big.

How much memory do you use?

nshmyrev commented 1 year ago

We usually use 128 GB servers.

ralf3u commented 1 year ago

I use 8 GB.

> We usually use 128 GB servers.

OK! And how much should a normal user use?

ralf3u commented 1 year ago

> It runs out of memory and gets killed by the kernel. You probably need 16 GB to run it.

You are right. I can see it in the process manager. The moment the swap is full, the process gets killed. My swap is 500 MB by default, so I will increase it in the next few days to see whether that has an impact.

nshmyrev commented 1 year ago

Swap will not help. You can remove the rnnlm folder from the model to make it more lightweight.
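If you want to try this without deleting anything, a reversible variant is to move the `rnnlm` folder out of the model directory and move it back later. The sketch below is a minimal standard-library example. The function names `disable_rnnlm` and `restore_rnnlm` and the idea of a separate backup directory are assumptions made for illustration, not part of Vosk or its tooling.

```python
import shutil
from pathlib import Path


def disable_rnnlm(model_dir, backup_dir):
    """Move <model_dir>/rnnlm into backup_dir.

    Returns the new location, or None if the model has no rnnlm folder.
    """
    rnnlm = Path(model_dir) / "rnnlm"
    if not rnnlm.is_dir():
        return None
    backup = Path(backup_dir)
    backup.mkdir(parents=True, exist_ok=True)
    target = backup / "rnnlm"
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    shutil.move(str(rnnlm), str(target))
    return target


def restore_rnnlm(model_dir, backup_dir):
    """Move backup_dir/rnnlm back into model_dir.

    Returns the restored location, or None if there is nothing to restore.
    """
    source = Path(backup_dir) / "rnnlm"
    target = Path(model_dir) / "rnnlm"
    if not source.is_dir():
        return None
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    shutil.move(str(source), str(target))
    return target
```

Keeping the folder in a backup location makes it easy to compare results with and without the RNNLM on the same audio file.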

ralf3u commented 1 year ago

> You can remove the rnnlm folder from the model to make it more lightweight.

True! Thank you so much for your help. And the model respects capital letters! What is the disadvantage of removing rnnlm?

nshmyrev commented 1 year ago

Recognition becomes a couple of percent less accurate.

ralf3u commented 1 year ago

> Swap will not help.

I cannot confirm that. I added the rnnlm folder back into the model and closed many windows. After that, 8 GB of RAM with 11 GB of swap worked for me. I can see the amount of RAM used in the process manager, and how useful a big swap is for this job.
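Before loading a large model on Linux, a quick way to check how much RAM and swap are free is to read `/proc/meminfo`. The sketch below is a minimal example under stated assumptions: the function names are hypothetical, the 16 GiB default comes from the rough estimate given earlier in this thread, and counting free swap as usable memory follows the report above rather than any official guidance. A run that relies heavily on swap is likely to be much slower.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo text into {field name: value}.

    Sizes are reported by the kernel in kB (kibibytes).
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def headroom_gib(info):
    """Available RAM plus free swap, in GiB."""
    kib = info.get("MemAvailable", 0) + info.get("SwapFree", 0)
    return kib / 2**20


def enough_memory(info, required_gib=16.0):
    """True if RAM plus swap headroom reaches required_gib (assumed threshold)."""
    return headroom_gib(info) >= required_gib


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the memory statistics file (Linux only)."""
    with open(path, encoding="ascii") as handle:
        return parse_meminfo(handle.read())
```

On a Linux machine, `enough_memory(read_meminfo())` gives a yes/no answer for the assumed 16 GiB threshold, and `headroom_gib(read_meminfo())` shows the figure it is based on.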