mozilla / DeepSpeech

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
Mozilla Public License 2.0

Can't run DeepSpeech in SageMaker with Persistent Environment #3470

Closed: b-zhang93 closed this issue 3 years ago

b-zhang93 commented 3 years ago

Hey all,

I have been running inference with my own trained model on Google Colab for a while. I now need to move everything into a SageMaker notebook, so I created a fresh persistent conda environment (Miniconda) with Python 3.7.

Here are the steps I took:

  1. Installed TensorFlow 2.3.0 into the conda env.
  2. Installed DeepSpeech into the conda env with %pip install deepspeech.
  3. When I ran the script (vad_transcriber), it said there was no module called "deepspeech".
  4. So I installed DeepSpeech outside of the env with !pip install deepspeech.
  5. Now the script starts, but it fails with the error below.
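Steps 2 to 4 suggest that the notebook kernel and the shell may be using different Python interpreters. A minimal sketch for checking this from inside the notebook follows; the helper name describe_environment is hypothetical and not part of DeepSpeech or SageMaker.

```python
import importlib.util
import sys


def describe_environment(module_name="deepspeech"):
    """Report which interpreter is running and whether it can find the module.

    describe_environment is a hypothetical helper written for this example.
    """
    spec = importlib.util.find_spec(module_name)
    return {
        # Path of the Python binary that executes this code (the kernel's Python).
        "interpreter": sys.executable,
        # True if the module can be located by this interpreter.
        "module_found": spec is not None,
        # File the module would be loaded from, if it was found.
        "module_location": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If the interpreter path printed here is not inside the conda env, the kernel is not running from that env. In that case %pip install, which targets the running kernel's environment, and !pip install, which runs whichever pip comes first on the shell PATH, can install into two different places.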

Pasting the error here also:

DEBUG:root:Transcribing audio file @ first_6.wav
DEBUG:root:Found Model: speech_model/tedlium_checkpoint.pbmm
DEBUG:root:Found scorer: speech_model/tedlium_model.scorer
TensorFlow: v2.3.0-6-g23ad988
DeepSpeech: v0.9.3-0-gf2e9c85
2020-12-16 23:18:57.797780: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
DEBUG:root:Loaded model in 0.015s.
terminate called after throwing an instance of 'lm::FormatLoadException'
  what():  native_client/kenlm/lm/binary_format.cc:160 in void* lm::ngram::BinaryFormat::LoadBinary(std::size_t) threw FormatLoadException because `file_size != util::kBadSize && file_size < total_map'.

Binary file has size 14680064 but the headers say it should be at least 941209108
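The reported size of 14680064 bytes is exactly 14 MiB (14 × 1024 × 1024), far below the 941209108 bytes the header expects. That would be consistent with a file that was only partly copied or uploaded, though this is an assumption. The sketch below compares the size on disk with that minimum; it assumes the file in question is the scorer named in the log, and check_file_size is a hypothetical helper, not a DeepSpeech API.

```python
import os


def check_file_size(path, expected_min_bytes):
    """Return (actual_size, is_large_enough) for the file at path.

    check_file_size is a hypothetical helper written for this example.
    """
    actual = os.path.getsize(path)
    return actual, actual >= expected_min_bytes


if __name__ == "__main__":
    # Path taken from the log above. Assumption: the scorer is the file
    # that the KenLM loader is complaining about.
    scorer_path = "speech_model/tedlium_model.scorer"
    # Minimum size taken from the error message.
    expected_min = 941209108

    if os.path.exists(scorer_path):
        size, ok = check_file_size(scorer_path, expected_min)
        status = "looks complete" if ok else "looks truncated"
        print(f"{scorer_path}: {size} bytes ({status})")
    else:
        print(f"{scorer_path} not found")
```

If the file turns out to be smaller than expected, comparing it with the copy that worked on Colab (by size or checksum) and transferring it again would be a reasonable next step.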

I am quite new to DeepSpeech, so any insight would be much appreciated.

lissyx commented 3 years ago

This is not a bug. Please reach out for support on Discourse.