mozilla / DeepSpeech

DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
Mozilla Public License 2.0

Error: Trie file version mismatch (4 instead of expected 3). Update your trie file. terminate called after throwing an instance of 'int' #2274

Closed vibhashahani closed 5 years ago

vibhashahani commented 5 years ago

Hi,

I am trying to run DeepSpeech on a small data set. I am using DeepSpeech version 0.5.1.

Steps I followed:

  1. Cloned 0.5.1 version from github repository

  2. pip3 install -r requirements.txt

  3. python util/taskcluster.py --arch gpu --target native_client

Then for creating language model:

  1. git clone https://github.com/kpu/kenlm.git

  2. cd kenlm/

  3. mkdir build

  4. cd build

  5. cmake ..

  6. make -j 4

  7. vim alphabet.txt (containing all the letters of the English alphabet)

  8. vim some.txt (corpus)

  9. ../kenlm/build/bin/lmplz -o 5 lm.arpa

  10. ../kenlm/build/bin/build_binary lm.arpa lm.binary

  11. ../DeepSpeech/native_client/generate_trie alphabet.txt lm.binary trie

Then ran the following script:

python -u DeepSpeech.py \
  --train_files "/home/dev_ds/deepspeech_dir_2/corpus/corpus-train.csv" \
  --dev_files "/home/dev_ds/deepspeech_dir_2/corpus/corpus-dev.csv" \
  --test_files "/home/dev_ds/deepspeech_dir_2/corpus/corpus-test.csv" \
  --alphabet_config_path "/home/dev_ds/deepspeech_dir_2/my-model/alphabet.txt" \
  --lm_binary_path "/home/dev_ds/deepspeech_dir_2/my-model/lm.binary" \
  --lm_trie_path "/home/dev_ds/deepspeech_dir_2/my-model/trie" \
  --learning_rate 0.001 \
  --dropout_rate 0.05 \
  --word_count_weight 3.5 \
  --log_level 1 \
  --display_step 1 \
  --epoch 75 \
  --export_dir "/home/dev_ds/deepspeech_dir_2/my-model"

I am getting the following error:

I Restored variables from most recent checkpoint at /home/dev_ds/.local/share/deepspeech/checkpoints/train-6300, step 6300
I STARTING Optimization
Epoch 0 | Training | Elapsed Time: 0:12:18 | Steps: 31 | Loss: 178.990274
Epoch 0 | Validation | Elapsed Time: 0:00:18 | Steps: 15 | Loss: 167.763600 | Dataset: /home/dev_ds/deepspeech_dir_2/corpus/corpus-dev.csv
I Saved new best validating model with loss 167.763600 to: /home/dev_ds/.local/share/deepspeech/checkpoints/best_dev-6331
Epoch 1 | Training | Elapsed Time: 0:12:19 | Steps: 31 | Loss: 178.690032
Epoch 1 | Validation | Elapsed Time: 0:00:18 | Steps: 15 | Loss: 167.403382 | Dataset: /home/dev_ds/deepspeech_dir_2/corpus/corpus-dev.csv
WARNING:tensorflow:From /home/dev_ds/deepspeech_dir_2/DeepSpeech-0.5.1/venv2/lib/python3.6/site-packages/tensorflow/python/training/saver.py:966: remove_checkpoint (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to delete files with this prefix.
I Saved new best validating model with loss 167.403382 to: /home/dev_ds/.local/share/deepspeech/checkpoints/best_dev-6362
Epoch 2 | Training | Elapsed Time: 0:12:19 | Steps: 31 | Loss: 178.588967
Epoch 2 | Validation | Elapsed Time: 0:00:18 | Steps: 15 | Loss: 167.700894 | Dataset: /home/dev_ds/deepspeech_dir_2/corpus/corpus-dev.csv
Epoch 3 | Training | Elapsed Time: 0:12:10 | Steps: 31 | Loss: 178.937192
Epoch 3 | Validation | Elapsed Time: 0:00:18 | Steps: 15 | Loss: 167.505259 | Dataset: /home/dev_ds/deepspeech_dir_2/corpus/corpus-dev.csv
I Early stop triggered as (for last 4 steps) validation loss: 167.505259 with standard deviation: 0.157128 and mean: 167.622626
I FINISHED optimization in 0:50:25.282585
Error: Trie file version mismatch (4 instead of expected 3). Update your trie file.
terminate called after throwing an instance of 'int'

Kindly help me figure out the error.
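The language-model steps listed above can be sketched as a small Python helper that only assembles the three command lines. This is a sketch under assumptions, not the project's own tooling: `build_lm_commands` and `run_commands` are hypothetical names, and the sketch passes the corpus to `lmplz` through its `--text` and `--arpa` options rather than through shell redirection. The `generate_trie` argument order is the one shown in step 11.

```python
import subprocess
from pathlib import Path
from typing import List


def build_lm_commands(corpus: str, alphabet: str, out_dir: str,
                      kenlm_bin: str, generate_trie: str,
                      order: int = 5) -> List[List[str]]:
    """Return argv lists for the ARPA, binary LM and trie steps, in order."""
    out = Path(out_dir)
    arpa = str(out / "lm.arpa")
    binary = str(out / "lm.binary")
    trie = str(out / "trie")
    kenlm = Path(kenlm_bin)
    return [
        # Estimate an n-gram model from the corpus and write it as ARPA.
        [str(kenlm / "lmplz"), "-o", str(order), "--text", corpus, "--arpa", arpa],
        # Convert the ARPA file into KenLM's binary format.
        [str(kenlm / "build_binary"), arpa, binary],
        # Build the trie; this binary must match the DeepSpeech version in use.
        [generate_trie, alphabet, binary, trie],
    ]


def run_commands(commands: List[List[str]]) -> None:
    """Run each command in turn and stop at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

Keeping the command construction separate from execution makes it easy to print and inspect the exact paths before anything is run.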

lissyx commented 5 years ago

You don't need to rebuild generate_trie; use the packaged one. Also pass --branch v0.5.1 to taskcluster.py.
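The error message says the trie was written in format version 4 while the 0.5.1 decoder expects version 3, which is consistent with a generate_trie binary that came from a newer build than the training code. The sketch below shows one way to inspect a trie file before training. It rests on an assumption that should be checked against the native client sources: that the file starts with two little-endian 32-bit integers, a magic number followed by the format version. `read_trie_header` and `check_trie_version` are hypothetical helpers.

```python
import struct
from typing import Optional, Tuple


def read_trie_header(path: str) -> Tuple[int, int]:
    """Return (magic, version) from the first 8 bytes of a trie file.

    ASSUMPTION: the header is two little-endian 32-bit signed integers.
    """
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8:
        raise ValueError(f"{path}: too short to contain a trie header")
    magic, version = struct.unpack("<ii", header)
    return magic, version


def check_trie_version(path: str, expected: int) -> Optional[str]:
    """Return None if the version matches, otherwise a description."""
    _, version = read_trie_header(path)
    if version == expected:
        return None
    return (f"Trie file version mismatch ({version} instead of expected "
            f"{expected}); regenerate it with the matching generate_trie")
```

If the reported version differs from what the installed release expects, regenerating the trie with the generate_trie binary from the matching release is the fix suggested in this thread.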

vibhashahani commented 5 years ago

Where is the packaged trie file located? This is my DeepSpeech directory. Screenshot_2019-07-24_11-17-01

lissyx commented 5 years ago

Where is the packaged trie file located? This is my DeepSpeech directory. Screenshot_2019-07-24_11-17-01

Please avoid screenshots. As I said, it's in native_client.tar.xz
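For readers looking for the packaged binary, here is a minimal sketch of pulling generate_trie out of native_client.tar.xz with Python's standard library. It assumes the archive contains a regular file whose base name is `generate_trie`; `extract_generate_trie` is a hypothetical helper, and the archive path is whatever location taskcluster.py downloaded it to.

```python
import os
import shutil
import tarfile


def extract_generate_trie(archive_path: str, dest_dir: str) -> str:
    """Copy generate_trie out of the archive and return its new path."""
    os.makedirs(dest_dir, exist_ok=True)
    target = os.path.join(dest_dir, "generate_trie")
    with tarfile.open(archive_path, "r:xz") as tar:
        for member in tar.getmembers():
            # Match on the base name so "./generate_trie" is found as well.
            if member.isfile() and os.path.basename(member.name) == "generate_trie":
                source = tar.extractfile(member)
                with open(target, "wb") as out:
                    shutil.copyfileobj(source, out)
                os.chmod(target, 0o755)  # make the binary executable
                return target
    raise FileNotFoundError(f"generate_trie not found in {archive_path}")
```

Copying the member by hand, instead of extracting the whole archive, leaves the other files in the checkout untouched.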

smalissa commented 5 years ago

Hi, please help me. I did the same as you, vibhashahani: I am trying to run DeepSpeech on my small data set, but I don't understand the step that creates the trie. Even when I use the generate_trie from native_client.tar.xz, I can't succeed. Can you please explain this step in detail?

kdavis-mozilla commented 5 years ago

@suhad999 Could you indicate the steps you have already taken?

smalissa commented 5 years ago

I am trying to run DeepSpeech on a small data set in a language other than English. I have a MacBook Air.

Steps I followed:

Cloned deepspeech from github repository

pip3 install -r requirements.txt

python util/taskcluster.py --arch gpu --target native_client

Prepare the data: train (6 WAV files), dev (3 WAV files), test (2 WAV files). All WAV files are mono, and the CSV files (file name, text) look like this: /Desktop/suhad/s112/a1/test/112_1_3475683015.wav,قل هو الله أحد

Then, to create the language model: git clone https://github.com/kpu/kenlm.git (cmake and make)

alphabet.txt (containing all the Arabic letters), with every word on its own line, like this: قل هو أحد الله الصمد لم يلد ولم يولد يكن له كفوا

some.txt (a corpus that covers all the words in the WAV files)

../kenlm/build/bin/lmplz -o 5 lm.arpa

../kenlm/build/bin/build_binary lm.arpa lm.binary

../DeepSpeech/native_client/generate_trie alphabet.txt lm.binary trie

Then ran the following script:

#!/bin/sh

set -xe
if [ ! -f DeepSpeech.py ]; then
    echo "Please make sure you run this from DeepSpeech's top level directory."
    exit 1
fi;

python -u DeepSpeech.py \
  --train_files "/Desktop/suhad/s112/a1/train/train.cvs" \
  --dev_files "/Desktop/suhad/s112/a1/dev/dev.cvs" \
  --test_files "/Desktop/suhad/s112/a1/test/test.cvs" \
  --alphabet_config_path "/Desktop/arabic_models/alphabet.txt" \
  --lm_binary_path "/Desktop/arabic_models/lm.binary" \
  --lm_trie_path "/Desktop/arabic_models/trie" \
  --decoder_library_path "/Desktop/mozilla_deepspeech/DeepSpeech/libdeepspeech.so" \
  --train_batch_size 80 \
  --dev_batch_size 80 \
  --test_batch_size 40 \
  --n_hidden 375 \
  --epoch 33 \
  --validation_step 1 \
  --early_stop True \
  --earlystop_nsteps 6 \
  --estop_mean_thresh 0.1 \
  --estop_std_thresh 0.1 \
  --dropout_rate 0.22 \
  --learning_rate 0.00095 \
  --report_count 100 \
  --use_seq_length False \
  --export_dir "/Desktop/arabic_models/results" "$@"

I am getting the following error:

Traceback (most recent call last):
File "/Users/suhadalissa/Desktop/mozilla_deepspeech/deepspeech_env/lib/python3.6/site-packages/absl/flags/_flagvalues.py", line 528, in _assert_validators
validator.verify(self)
File "/Users/suhadalissa/Desktop/mozilla_deepspeech/deepspeech_env/lib/python3.6/site-packages/absl/flags/_validators.py", line 82, in verify
raise _exceptions.ValidationError(self.message)
absl.flags._exceptions.ValidationError: The file pointed to by --lm_binary_path must exist and be readable.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "DeepSpeech.py", line 844, in <module>
tfv1.app.run(main)
File "/Users/suhadalissa/Desktop/mozilla_deepspeech/deepspeech_env/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "/Users/suhadalissa/Desktop/mozilla_deepspeech/deepspeech_env/lib/python3.6/site-packages/absl/app.py", line 294, in run
flags_parser,
File "/Users/suhadalissa/Desktop/mozilla_deepspeech/deepspeech_env/lib/python3.6/site-packages/absl/app.py", line 351, in _run_init
flags_parser=flags_parser,
File "/Users/suhadalissa/Desktop/mozilla_deepspeech/deepspeech_env/lib/python3.6/site-packages/absl/app.py", line 213, in _register_and_parse_flags_with_usage
args_to_main = flags_parser(original_argv)
File "/Users/suhadalissa/Desktop/mozilla_deepspeech/deepspeech_env/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 31, in _parse_flags_tolerate_undef
return flags.FLAGS(_sys.argv if argv is None else argv, known_only=True)
File "/Users/suhadalissa/Desktop/mozilla_deepspeech/deepspeech_env/lib/python3.6/site-packages/tensorflow/python/platform/flags.py", line 112, in __call__
return self.__dict__['__wrapped'].__call__(*args, **kwargs)
File "/Users/suhadalissa/Desktop/mozilla_deepspeech/deepspeech_env/lib/python3.6/site-packages/absl/flags/_flagvalues.py", line 636, in __call__
self._assert_all_validators()
File "/Users/suhadalissa/Desktop/mozilla_deepspeech/deepspeech_env/lib/python3.6/site-packages/absl/flags/_flagvalues.py", line 510, in _assert_all_validators
self._assert_validators(all_validators)
File "/Users/suhadalissa/Desktop/mozilla_deepspeech/deepspeech_env/lib/python3.6/site-packages/absl/flags/_flagvalues.py", line 531, in _assert_validators
raise _exceptions.IllegalFlagValueError('%s: %s' % (message, str(e)))
absl.flags._exceptions.IllegalFlagValueError: flag --lm_binary_path=/Desktop/arabic_models/lm.binary: The file pointed to by --lm_binary_path must exist and be readable.
(deepspeech_env) (base) suhads-MacBook-Air:DeepSpeech suhadalissa$ ./run.sh


Kindly help me figure out the error.
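The training CSVs described above list a file name and a transcript per row. As a hedged sketch: the DeepSpeech importers of this era, as far as I recall, write three columns named `wav_filename`, `wav_filesize` and `transcript`; treat those names as an assumption and compare them with a CSV produced by one of the bundled importers. `write_deepspeech_csv` is a hypothetical helper.

```python
import csv
import os
from typing import Iterable, Tuple

# ASSUMPTION: column names as produced by the bundled importers.
FIELDNAMES = ["wav_filename", "wav_filesize", "transcript"]


def write_deepspeech_csv(rows: Iterable[Tuple[str, str]], csv_path: str) -> int:
    """Write (wav_path, transcript) pairs to a UTF-8 CSV; return the row count."""
    count = 0
    with open(csv_path, "w", encoding="utf-8", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
        writer.writeheader()
        for wav_path, transcript in rows:
            writer.writerow({
                "wav_filename": os.path.abspath(wav_path),
                # getsize raises if the WAV file is missing, which surfaces
                # bad paths before training starts.
                "wav_filesize": os.path.getsize(wav_path),
                "transcript": transcript,
            })
            count += 1
    return count
```

Writing with an explicit UTF-8 encoding matters for Arabic transcripts, since the traceback later in the thread shows the CSVs being read with `encoding='utf-8'`.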

smalissa commented 5 years ago

I tried again and got this error:

Traceback (most recent call last):
File "DeepSpeech.py", line 844, in <module>
tfv1.app.run(main)
File "//anaconda3/lib/python3.7/site-packages/tensorflow/python/platform/app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "//anaconda3/lib/python3.7/site-packages/absl/app.py", line 300, in run
_run_main(main, args)
File "//anaconda3/lib/python3.7/site-packages/absl/app.py", line 251, in _run_main
sys.exit(main(argv))
File "DeepSpeech.py", line 828, in main
train()
File "DeepSpeech.py", line 409, in train
cache_path=FLAGS.feature_cache)
File "/Users/suhadalissa/Desktop/mozilla_deepspeech/DeepSpeech/util/feeding.py", line 68, in create_dataset
df = read_csvs(csvs)
File "/Users/suhadalissa/Desktop/mozilla_deepspeech/DeepSpeech/util/feeding.py", line 22, in read_csvs
file = pandas.read_csv(csv, encoding='utf-8', na_filter=False)
File "//anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 702, in parser_f
return _read(filepath_or_buffer, kwds)
File "//anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 429, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "//anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 895, in __init__
self._make_engine(self.engine)
File "//anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 1122, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "//anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 1853, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 387, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 705, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] File b'/Users/suhadalissa/Desktop/suhad/s112/a1/train/train.cvs' does not exist: b'/Users/suhadalissa/Desktop/suhad/s112/a1/train/train.cvs'

kdavis-mozilla commented 5 years ago

These are both very different errors and the error log tells you what the problems are:

  1. The file pointed to by --lm_binary_path must exist and be readable.
  2. '/Users/suhadalissa/Desktop/suhad/s112/a1/train/train.cvs' does not exist

As this is not related to the original issue, could you move this discussion to discourse please.
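Both errors come down to a path that does not resolve to a readable file. A minimal pre-flight check, run before launching training, could look like the sketch below. `find_unreadable` is a hypothetical helper and the flag names are simply the ones used in the script above.

```python
import os
from typing import Dict, List


def find_unreadable(paths: Dict[str, str]) -> List[str]:
    """Return one message per flag whose path is missing or unreadable."""
    problems = []
    for flag, path in paths.items():
        # {path!r} prints the repr, so stray non-printing characters that
        # were pasted into a path show up as escapes instead of staying hidden.
        if not os.path.isfile(path):
            problems.append(f"--{flag}: {path!r} does not exist or is not a file")
        elif not os.access(path, os.R_OK):
            problems.append(f"--{flag}: {path!r} is not readable")
    return problems
```

Called with a dictionary such as `{"lm_binary_path": "...", "train_files": "..."}`, an empty result means every file was found; anything else names the flag to fix.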

smalissa commented 5 years ago

No, I retried the steps and got past the previous issue; I had a problem generating the binary files. I have now done that step correctly, re-run the script, and get this error:

Traceback (most recent call last):
File "DeepSpeech.py", line 844, in <module>
tfv1.app.run(main)
File "//anaconda3/lib/python3.7/site-packages/tensorflow/python/platform/app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "//anaconda3/lib/python3.7/site-packages/absl/app.py", line 300, in run
_run_main(main, args)
File "//anaconda3/lib/python3.7/site-packages/absl/app.py", line 251, in _run_main
sys.exit(main(argv))
File "DeepSpeech.py", line 828, in main
train()
File "DeepSpeech.py", line 409, in train
cache_path=FLAGS.feature_cache)
File "/Users/suhadalissa/Desktop/mozilla_deepspeech/DeepSpeech/util/feeding.py", line 68, in create_dataset
df = read_csvs(csvs)
File "/Users/suhadalissa/Desktop/mozilla_deepspeech/DeepSpeech/util/feeding.py", line 22, in read_csvs
file = pandas.read_csv(csv, encoding='utf-8', na_filter=False)
File "//anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 702, in parser_f
return _read(filepath_or_buffer, kwds)
File "//anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 429, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "//anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 895, in __init__
self._make_engine(self.engine)
File "//anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 1122, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "//anaconda3/lib/python3.7/site-packages/pandas/io/parsers.py", line 1853, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 387, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 705, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] File b'/Users/suhadalissa/Desktop/suhad/s112/a1/train/train.cvs' does not exist: b'/Users/suhadalissa/Desktop/suhad/s112/a1/train/train.cvs'

Kindly help me figure out the error. Is the problem the TensorFlow version? I have version 1.14.0.

lissyx commented 5 years ago

FileNotFoundError: [Errno 2] File b'/Users/suhadalissa/Desktop/suhad/s112/a1/train/train.cvs' does not exist: b'/Users/suhadalissa/Desktop/suhad/s112/a1/train/train.cvs' Kindly help me figure out the error ? is the problem in tensorflow version , i have version =1.14.0

@suhad999 Do you read the error messages? It says your file does not exist...
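When a FileNotFoundError names a path that looks right, one way to investigate is to list similarly named files in the same directory. The sketch below uses `difflib` from the standard library; `suggest_existing` is a hypothetical helper, and it makes no claim about what the file in this thread is actually called.

```python
import difflib
import os
from typing import List


def suggest_existing(path: str, cutoff: float = 0.6) -> List[str]:
    """Return up to three existing paths whose names resemble the missing one."""
    directory, name = os.path.split(path)
    directory = directory or "."
    if not os.path.isdir(directory):
        # The directory itself is wrong, so there is nothing to compare with.
        return []
    candidates = os.listdir(directory)
    matches = difflib.get_close_matches(name, candidates, n=3, cutoff=cutoff)
    return [os.path.join(directory, match) for match in matches]
```

An empty result for a path whose directory does not exist points at the directory part of the path rather than the file name.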

lissyx commented 5 years ago

@vibhashahani Is your error fixed? Can we close this?

vibhashahani commented 5 years ago

Yes, it got fixed. I used v0.5.1 to fix the issue. Thanks a ton.

lissyx commented 5 years ago

Thanks!

smalissa commented 5 years ago

Yes, I fixed the trie file issue, but now I have another error. When I run my script, training completes, but testing fails: it cannot start or finish the test phase and gives me this error: fatal python error. I tried to solve it but did not succeed. Please help me.


lissyx commented 5 years ago

Yes, I fixed the trie file issue, but now I have another error. When I run my script, training completes, but testing fails: it cannot start or finish the test phase and gives me this error: fatal python error. I tried to solve it but did not succeed. Please help me.

As I said, it's not a bug. Please use Discourse for getting help. And please share more information than "fatal python error" ....

smalissa commented 5 years ago

OK, I will do that. As for sharing more information about the error, it looks like this: [Image]



lissyx commented 5 years ago

As for sharing more information about the error, it looks like this: [Image]

Thanks for illustrating perfectly why you should never share screenshots but actual plain text content.

smalissa commented 5 years ago

OK, I will do that.


lock[bot] commented 5 years ago

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.