Closed: testdeepv closed this issue 5 years ago
I trained a French model on a small French dataset, and when I try to run inference using the exported model like this: python3.6 deepspeech --model ~/results/model_export/output_graph.pb --alphabet ~/Deepspeech/data/alphabet.txt --lm ~/DeepSpeech/data/lm/lm.binary --trie ~/DeepSpeech/data/lm/trie --audio test.wav -t I get this error: SyntaxError: Non-UTF-8 code starting with '\x83' in file deepspeech on line 2, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details. Any suggestions to resolve this, please?
This is produced by Python itself, and clearly not something I can reproduce on my French system. Can you make sure your pip install is up to date ?
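For reference, checking and (if needed) upgrading pip for a specific interpreter is usually just this (a generic sketch, nothing DeepSpeech-specific):

python3.6 -m pip --version
python3.6 -m pip install --upgrade pip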
When I run this command: python3.6 -m pip --version I get: pip 18.1 from /usr/local/lib/python3.6/site-packages/pip (python 3.6). Do I have to upgrade it?
Strange that your pip is in /usr/local, can you make sure your setup is straight ?
What do you mean by "setup is straight" ?
Well, /usr/local feels like a non-default distro setup. The deepspeech file is being generated at install time.
I'm running the deepspeech file from the Deepspeech native client
Well, you mention python3.6 above, so I suspect you did pip install?
alex@portable-alex:~/tmp/deepspeech/issue2164$ source venv/bin/activate
(venv) alex@portable-alex:~/tmp/deepspeech/issue2164$ pip install deepspeech==0.5.0a11
Collecting deepspeech==0.5.0a11
Downloading https://files.pythonhosted.org/packages/b2/fd/bdcb51eae62e6df60a252e8395d49ef145fa101139b530b8e81448ca336e/deepspeech-0.5.0a11-cp37-cp37m-manylinux1_x86_64.whl (15.6MB)
|████████████████████████████████| 15.6MB 4.9MB/s
Collecting numpy>=1.14.5 (from deepspeech==0.5.0a11)
Downloading https://files.pythonhosted.org/packages/fc/d1/45be1144b03b6b1e24f9a924f23f66b4ad030d834ad31fb9e5581bd328af/numpy-1.16.4-cp37-cp37m-manylinux1_x86_64.whl (17.3MB)
|████████████████████████████████| 17.3MB 65.5MB/s
Installing collected packages: numpy, deepspeech
Successfully installed deepspeech-0.5.0a11 numpy-1.16.4
(venv) alex@portable-alex:~/tmp/deepspeech/issue2164$ deepspeech
usage: deepspeech [-h] --model MODEL --alphabet ALPHABET [--lm [LM]]
[--trie [TRIE]] --audio AUDIO [--version] [--extended]
deepspeech: error: the following arguments are required: --model, --alphabet, --audio
(venv) alex@portable-alex:~/tmp/deepspeech/issue2164$ which deepspeech
/home/alex/tmp/deepspeech/issue2164/venv/bin/deepspeech
@testdeepv The file /home/alex/tmp/deepspeech/issue2164/venv/bin/deepspeech is being generated at pip install time. According to your error, it's the one with bogus UTF-8. But we don't control it.
I installed python3.6 because in the VM I'm using, python3.5 is the default python3. I git cloned DeepSpeech and Mozilla's TensorFlow, built both of them, generated the binaries and trained a French model. I didn't pip install deepspeech; I have it in the DeepSpeech native client directory after the build.
Well then please document exactly what you did.
I git cloned DeepSpeech and Mozilla's TensorFlow, built both of them, generated the binaries and trained a French model.
Why did you do this ? We have prebuilt binaries, you don't have to do that.
I installed python3.6 because in the VM I'm using, python3.5 is the default python3.
Python 3.5 should work as well.
trained a French model.
Also, could you please join efforts ? https://github.com/Common-Voice/commonvoice-fr/pull/44 https://github.com/Common-Voice/commonvoice-fr https://discourse.mozilla.org/c/voice/fr
Why did you do this ? We have prebuilt binaries, you don't have to do that.
I changed alphabet.txt to add the French characters and then created the lm.binary and trie files.
Still, you don't need to rebuild just to change the alphabet.
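As a side note, a SyntaxError: Non-UTF-8 code starting with '\x83' is what Python typically prints when it is asked to execute a non-text file, for example a compiled binary such as the native client deepspeech executable, rather than a text script. A quick way to tell the two apart (a sketch; paths depend on your layout):

file ./deepspeech
file $(which deepspeech)

A compiled client will be reported as an ELF executable, while the launcher that pip generates is a small Python text script.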
So I have to pip install deepspeech and not use the deepspeech I have in the native client folder?
Yes, there is no good reason in your case to rebuild everything. Also, sorry to insist, but it's really important that you join efforts to help produce a French model ...
I sure will do it :)
I did this: sudo python3.6 -m pip install deepspeech==0.5.0a11 and when running this command: python3.6 deepspeech --model ~/results/model_export/output_graph.pb --alphabet ~/Deepspeech/data/alphabet.txt --lm ~/DeepSpeech/data/lm/lm.binary --trie ~/DeepSpeech/data/lm/trie --audio test.wav -t I still get an error: python3.6: can't open file 'deepspeech': [Errno 2] No such file or directory
sudo python3.6 -m pip install deepspeech==0.5.0a11
You should really follow the docs and use virtualenv; installing as root is not a good practice.
python3.6: can't open file 'deepspeech': [Errno 2] No such file or directory
That's another issue now. What do which deepspeech and ls -hal $(which deepspeech) give?
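Also note that python3.6 deepspeech ... asks Python to run a file literally named deepspeech in the current directory, hence the 'No such file or directory'. The pip package installs a deepspeech console script on your PATH, so (as in the session above) it is normally invoked directly, roughly like this (paths taken from your own command):

deepspeech --model ~/results/model_export/output_graph.pb --alphabet ~/Deepspeech/data/alphabet.txt --lm ~/DeepSpeech/data/lm/lm.binary --trie ~/DeepSpeech/data/lm/trie --audio test.wav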
OK. which deepspeech outputs: /usr/local/bin/deepspeech and ls -hal $(which deepspeech) gives: -rwxr-xr-x 1 root root 228 Jun 11 10:09 /usr/local/bin/deepspeech
Can you paste its content ?
#!/usr/local/bin/python3.6
# -*- coding: utf-8 -*-
import re
import sys
from deepspeech.client import main
if __name__ == '__main__':
sys.argv[0] = re.sub(r'(-script\.pyw?|\.exe)?$', '', sys.argv[0])
sys.exit(main())
@testdeepv Strange. Can you please properly uninstall and then reinstall, using your distro's Python/PIP and a virtualenv as we document ?
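For reference, the clean-up and reinstall could look roughly like this (a minimal sketch assuming python3.6 and the 0.5.0a11 wheel used above; the virtualenv name is arbitrary):

sudo python3.6 -m pip uninstall deepspeech
python3.6 -m venv deepspeech-venv
source deepspeech-venv/bin/activate
pip install deepspeech==0.5.0a11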
Maybe I have to install the GPU version of deepspeech?
No, it's unrelated.
When I did this and specified the deepspeech path:
python3.6 /usr/local/bin/deepspeech --model ~/results/model_export/output_graph.pb --alphabet ~/Deepspeech/data/alphabet.txt --lm ~/DeepSpeech/data/lm/lm.binary --trie ~/DeepSpeech/data/lm/trie --audio test.wav -t
I get this :
Loading model from file ~/results/model_export/output_graph.pb
TensorFlow: v1.13.1-10-g3e0cc53
DeepSpeech: v0.5.0-alpha.11-0-g1201739
Warning: reading entire model file into memory. Transform model file into an mmapped graph to reduce heap usage.
2019-06-11 10:31:48.501980: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-06-11 10:31:48.511050: E tensorflow/stream_executor/cuda/cuda_driver.cc:300] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2019-06-11 10:31:48.511184: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:148] kernel driver does not appear to be running on this host (instance-2): /proc/driver/nvidia/version does not exist
2019-06-11 10:31:48.570438: E tensorflow/core/framework/op_kernel.cc:1325] OpKernel ('op: "UnwrapDatasetVariant" device_type: "CPU"') for unknown op: UnwrapDatasetVariant
2019-06-11 10:31:48.570541: E tensorflow/core/framework/op_kernel.cc:1325] OpKernel ('op: "WrapDatasetVariant" device_type: "GPU" host_memory_arg: "input_handle" host_memory_arg: "output_handle"') for unknown op: WrapDatasetVariant
2019-06-11 10:31:48.570554: E tensorflow/core/framework/op_kernel.cc:1325] OpKernel ('op: "WrapDatasetVariant" device_type: "CPU"') for unknown op: WrapDatasetVariant
2019-06-11 10:31:48.570798: E tensorflow/core/framework/op_kernel.cc:1325] OpKernel ('op: "UnwrapDatasetVariant" device_type: "GPU" host_memory_arg: "input_handle" host_memory_arg: "output_handle"') for unknown op: UnwrapDatasetVariant
Loaded model in 0.0733s.
Loading language model from files ~/DeepSpeech/data/lm/lm.binary ~/DeepSpeech/data/lm/trie
Loaded language model in 0.0135s.
Running inference.
Inference took 2.695s for 7.160s audio file.
I can't understand all these TensorFlow warnings :(
@testdeepv The warnings are harmless. It seems to work.
But I didn't get any inferences :(
Do I have to convert my output_graph.pb like this or not? $ convert_graphdef_memmapped_format --in_graph=output_graph.pb --out_graph=output_graph.pbmm
That just means your training was not enough. Hence why I insist on contributing to the French model effort: you are not the first one to train and get empty inferences because of not enough data, not enough training, or improper parameters.
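On the conversion question itself: the mmap transform is optional. As the warning in your log says, it only reduces heap usage, and skipping it is not what causes empty inferences. If you do want it, the command you quoted plus pointing --model at the resulting .pbmm should be enough, roughly:

convert_graphdef_memmapped_format --in_graph=output_graph.pb --out_graph=output_graph.pbmm
deepspeech --model output_graph.pbmm --alphabet ~/DeepSpeech/data/alphabet.txt --lm ~/DeepSpeech/data/lm/lm.binary --trie ~/DeepSpeech/data/lm/trie --audio test.wav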
@testdeepv If you need a model that works right now, there's no better solution than to train on top of the (not yet released) English 0.5.0 model and with other datasets, as documented in the WIP PR https://github.com/Common-Voice/commonvoice-fr/pull/44 as well as https://discourse.mozilla.org/t/un-premier-modele-francais/41100/7
For the wav files in the test set I get inferences, but when I test the exported model I don't get anything.
Please avoid posting screenshots, they are very hard to work with. Again, that's expected behavior. Without the full training log it's hard to be definitive, but it's really not surprising ...
@testdeepv Should we close ?
Just a question concerning the amount of data needed to get good inferences: 50 hours isn't enough? Is getting more data the best solution to avoid getting empty inferences?
Yes, 50 hours is way way way not enough.
Is getting more data the best solution to avoid getting empty inferences?
That, and ensuring proper training. Since you have not shared your parameters, I can't tell whether they also play a role in your case.
The command I used to train with DeepSpeech.py:
python3.6 -u DeepSpeech.py \
  --train_files ~/deepspeech_dataset/clips/train.csv \
  --dev_files ~/deepspeech_dataset/clips/dev.csv \
  --test_files ~/deepspeech_dataset/clips/test.csv \
  --train_batch_size 80 \
  --dev_batch_size 80 \
  --test_batch_size 40 \
  --n_hidden 1024 \
  --epoch 50 \
  --use_seq_length False \
  --report_count 100 \
  --remove_export True \
  --checkpoint_dir ~/results/checkpoints/ \
  --export_dir ~/results/model_export/ \
  --alphabet_config_path ~/DeepSpeech/data/alphabet.txt \
  --lm_binary_path ~/DeepSpeech/data/lm/lm.binary \
  --lm_trie_path ~/DeepSpeech/data/lm/trie
and the training stops after 10 epochs with this message: I Early stop triggered as (for last 4 steps) validation loss: 68.492657 with standard deviation: 0.667485 and mean: 67.559747, followed by I FINISHED optimization. After that I get the inferences for test.csv (and they were not empty).
Are you saying when you run the client on the same audio files that are in your test CSV file, it gives different results than the training code?
My wav file was tested during the training process and it gives a result, but when I tried to test the same file with my exported model it gives me empty inferences...
That should not happen. 50 hours and 10 epochs is obviously not enough, but if you test a file from the test set, you should get the same result.
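A direct way to check that, roughly (assuming the usual DeepSpeech CSV layout where the first column is wav_filename, and reusing the alphabet/lm/trie paths from your training command):

head -n 2 ~/deepspeech_dataset/clips/test.csv
deepspeech --model ~/results/model_export/output_graph.pb --alphabet ~/DeepSpeech/data/alphabet.txt --lm ~/DeepSpeech/data/lm/lm.binary --trie ~/DeepSpeech/data/lm/trie --audio <one wav path from that CSV>

The transcript printed by the client should roughly match the res: line the test epoch printed for the same file.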
I set epochs = 50 but the training process stopped after 10 epochs, and I'm not getting the same result for the same file tested.
Could you share the full training log ?
tf.py_function, which takes a python function which manipulates tf eager
tensors instead of numpy arrays. It's easy to convert a tf eager tensor to
an ndarray (just call tensor.numpy()) but having access to eager tensors
means `tf.py_function`s can use accelerators such as GPUs as well as
being differentiable using a gradient tape.
WARNING:tensorflow:From /usr/local/lib/python3.6/site-packages/tensorflow/python/data/ops/iterator_ops.py:358: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /usr/local/lib/python3.6/site-packages/tensorflow/contrib/rnn/python/ops/lstm_ops.py:696: to_int64 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/site-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
I Restored variables from most recent checkpoint at ~/results/checkpoints/train-158, step 158
I STARTING Optimization
Epoch 0 | Training | Elapsed Time: 1:02:16 | Steps: 159 | Loss: 132.824389 WARNING:tensorflow:From /usr/local/lib/python3.6/site-packages/tensorflow/python/training/saver.py:966: remove_checkpoint (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Then I have the steps for each epoch, and at the end of each epoch I had this:
I Saved new best validating model with loss x to: ~/results/checkpoints/best_dev-325
after 10 epochs I get this message :
I Early stop triggered as (for last 4 steps) validation loss: 68.492657 with standard deviation: 0.667485 and mean: 67.559747
I FINISHED optimization in 14:35:47.888453
I Restored variables from best validation checkpoint at ~/results/checkpoints/best_dev-1494, step 1494
Testing model on ~/deepspeech_dataset/clips/test.csv
Test epoch | Steps: 158 | Elapsed Time: 0:24:45
Test on ~/deepspeech_dataset/clips/test.csv - WER: 0.709969, CER: 0.413470, loss: 68.295815
WER: 1.500000, CER: 0.333333, loss: 28.491713
- src: "en substitution"
- res: "on se situation"
--------------------------------------------------------------------------------
WER: 1.500000, CER: 0.647059, loss: 29.111736
- src: "vingtdeux maisons"
- res: "va de mal"
--------------------------------------------------------------------------------
WER: 1.500000, CER: 0.833333, loss: 29.264719
- src: "depuis quand"
- res: "deux plus fort"
--------------------------------------------------------------------------------
WER: 1.500000, CER: 0.583333, loss: 30.396891
- src: "où habitestu"
- res: "ou vite que"
--------------------------------------------------------------------------------
WER: 1.500000, CER: 0.375000, loss: 30.682699
- src: "avis défavorable"
- res: "avenue de favorable"
--------------------------------------------------------------------------------
WER: 1.500000, CER: 0.470588, loss: 30.730188
- src: "rendeznous jospin"
- res: "route nous juste"
--------------------------------------------------------------------------------
WER: 1.500000, CER: 0.588235, loss: 31.368996
- src: "habillezvous vite"
- res: "aviez ou les"
--------------------------------------------------------------------------------
I Exporting the model...
I Removing old export
WARNING:tensorflow:From /usr/local/lib/python3.6/site-packages/tensorflow/python/tools/freeze_graph.py:232: convert_variables_to_constants (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.graph_util.convert_variables_to_constants
WARNING:tensorflow:From /usr/local/lib/python3.6/site-packages/tensorflow/python/framework/graph_util_impl.py:245: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.graph_util.extract_sub_graph
I Models exported at ~/results/model_export/
@testdeepv Can you ensure (sha1 fingerprint) that the alphabets are the same ?
I didn't get what you mean by "(sha1 fingerprint)" :/
sha1sum alphabet.txt
this command gives : 17d3fd2c19e31be7fdda16f1355053f1b8ca4612 alphabet.txt
Can you quadruple check you are absolutely using the same and the correct alphabet, lm.binary and trie files? 99.99% of the "empty inferences", outside of improper training, were related to that.
I put the French characters in alphabet.txt. I generated lm.binary like this:
kenlm/build/bin/./lmplz --text ~/DeepSpeech/data/vocabulary.txt --arpa ~/DeepSpeech/data/words.arpa --o 5
kenlm/build/bin/./build_binary -T -s ~/DeepSpeech/data/words.arpa ~/DeepSpeech/data/lm/lm.binary
and generated trie like this :
~/tensorflow/bazel-bin/native_client/generate_trie ~/DeepSpeech/data/alphabet.txt ~/DeepSpeech/data/lm.binary ~/DeepSpeech/data/trie
how can I check this ?
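One way to check is to fingerprint every copy that the different commands point at and make sure they match, for example (paths taken from the commands in this thread, so adjust to your layout; note that the inference command uses ~/Deepspeech/data/alphabet.txt while training and generate_trie use ~/DeepSpeech/data/...):

sha1sum ~/DeepSpeech/data/alphabet.txt ~/Deepspeech/data/alphabet.txt
sha1sum ~/DeepSpeech/data/lm.binary ~/DeepSpeech/data/lm/lm.binary
sha1sum ~/DeepSpeech/data/trie ~/DeepSpeech/data/lm/trie

Every copy that is actually used (by training, generate_trie and the client) should hash identically.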