SeanNaren / deepspeech.pytorch

Speech Recognition using DeepSpeech2.

transcribe.py: Segmentation fault when loading Mozilla Deepspeech's language model #298

Closed (zenogantner closed this issue 4 years ago)

zenogantner commented 6 years ago

python3 transcribe.py --model-path models/librispeech_pretrained.pth --audio-path my-recording.wav --decoder beam --lm-path ../DeepSpeech/data/lm/lm.binary

Segmentation fault (core dumped)
(gdb) info threads
  Id   Target Id         Frame 
* 1    Thread 0x7fe921e04700 (LWP 12660) fst::ArcIterator<fst::VectorFst<fst::ArcTpl<fst::TropicalWeightTpl<float> >, fst::VectorState<fst::ArcTpl<fst::TropicalWeightTpl<float> >, std::allocator<fst::ArcTpl<fst::TropicalWeightTpl<float> > > > > >::Seek (this=<optimized out>, a=0) at /tmp/pip-8jivklgg-build/third_party/openfst-1.6.7/src/include/fst/vector-fst.h:619
  2    Thread 0x7fe918e02700 (LWP 12657) 0x00007fe8deb47b40 in ?? () from /usr/local/lib/python3.5/dist-packages/torch/lib/libgomp-c0d7b783.so.1
  3    Thread 0x7fe916601700 (LWP 12656) 0x00007fe8deb47b40 in ?? () from /usr/local/lib/python3.5/dist-packages/torch/lib/libgomp-c0d7b783.so.1
  4    Thread 0x7fe919603700 (LWP 12658) 0x00007fe8deb47b51 in ?? () from /usr/local/lib/python3.5/dist-packages/torch/lib/libgomp-c0d7b783.so.1
  5    Thread 0x7fe927eb3700 (LWP 12610) syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38

Mozilla DeepSpeech also uses KenLM language models. What am I doing wrong? If the file formats are incompatible, the failure could at least be a bit more graceful.

Let me know if you need further info.

Are there any prepared language models that I could try out?

ageojo commented 6 years ago

https://github.com/SeanNaren/deepspeech.pytorch/issues/261

I ran into the same problem. Uppercasing all n-grams in the LM's .arpa file and rebuilding the binary resolved the issue for me. Note: do not uppercase the lines that start with \ (the ARPA section markers such as \data\ and \end\).
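
In case it helps, here is a minimal sketch of that transformation. It assumes the usual ARPA layout (log probability, tab, space-separated tokens, optional tab plus back-off weight) and also leaves angle-bracket tokens such as <s>, </s>, <unk> alone to be safe; none of this is code from the repo.

    # Sketch: uppercase the tokens of every n-gram line in an ARPA file while
    # leaving the section markers (lines starting with '\') and the "ngram N=..."
    # counts untouched.
    import sys

    def uppercase_arpa(src_path, dst_path):
        with open(src_path, encoding="utf-8") as src, \
             open(dst_path, "w", encoding="utf-8") as dst:
            for line in src:
                stripped = line.strip()
                if not stripped or stripped.startswith("\\") or stripped.startswith("ngram "):
                    dst.write(line)  # structural line: keep as-is
                    continue
                parts = line.rstrip("\n").split("\t")
                if len(parts) >= 2:  # log-prob \t tokens [\t back-off]
                    parts[1] = " ".join(
                        tok if tok.startswith("<") else tok.upper()
                        for tok in parts[1].split(" ")
                    )
                dst.write("\t".join(parts) + "\n")

    if __name__ == "__main__":
        uppercase_arpa(sys.argv[1], sys.argv[2])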

miguelvr commented 6 years ago

Just train your own LM with KenLM on the training data (it is fast to train and relatively straightforward); it should work then.
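
For reference, a sketch of that, assuming a working KenLM build and a plain-text corpus corpus.txt with one transcript per line, cased the same way as the acoustic model's labels (file names are illustrative):

    lmplz -o 3 < corpus.txt > lm.arpa

The resulting lm.arpa can then be converted to KenLM's binary format with build_binary before passing it to --lm-path.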

adryyandc commented 6 years ago

You can download pretrained ones from: http://www.openslr.org/11/

Or more general models: http://www.keithv.com/software/giga/

zenogantner commented 6 years ago

Coming back to the segfault itself: even if an incompatible LM is provided, the software arguably should not segfault.
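
One possible mitigation, just as a sketch: try to open the file up front and fail with a readable error before handing it to the decoder. This assumes the kenlm Python bindings are installed, and it is not guaranteed to catch the same incompatibility that crashes inside the native decoder.

    # Sketch: pre-flight check of an LM file using the kenlm Python bindings
    # (hypothetical helper, not part of transcribe.py).
    import kenlm

    def check_lm(path):
        try:
            model = kenlm.Model(path)
        except Exception as exc:  # kenlm raises if the file cannot be read or parsed
            raise SystemExit("Could not load language model %r: %s" % (path, exc))
        print("Loaded order-%d language model from %s" % (model.order, path))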

miguelvr commented 6 years ago

KenLM doesn't guarantee compatibility with binaries built by older versions. The segfault doesn't happen in this repo's code; it most likely happens inside WarpCTC or KenLM.

If you train your own LM with LibriSpeech it will work. I get the same segfaults as you with those files.

ryanleary commented 6 years ago

@miguelvr is correct. Alternatively, if you keep your LMs in ARPA format, you can always convert them to the KenLM binary format with a matching KenLM version.
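
For example, with the build_binary tool from the same KenLM checkout that the decoder was compiled against (file names are illustrative):

    build_binary lm.arpa lm.binary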

miguelvr commented 6 years ago

@ryanleary I'm not too sure about that. Why does it get a segfault when you load an ARPA LM file, then?

zenogantner commented 6 years ago

Would it make sense to link the language models from https://github.com/SeanNaren/deepspeech.pytorch/releases? I guess this would be useful for others as well, not just me.

bjtommychen commented 6 years ago

FYI: using the language model from Mozilla DeepSpeech release 0.2.0, the WER may drop from 10.2 to 7.0.

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
