DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.
Mozilla Public License 2.0, 25.43k stars, 3.98k forks
Alphabet size mismatch with model output shape | (alphabet.GetSize()+1) == (class_dim) #3801
Using the CTC Beam Search Decoder of DeepSpeech I get the following error:
[ctc_beam_search_decoder.cpp:279] FATAL: "(alphabet.GetSize()+1) == (class_dim)" check failed. Number of output classes in acoustic model does not match number of labels in the alphabet file. Alphabet file must be the same one that was used to train the acoustic model.
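I have checked the alphabet and it has a size of 1023, even though I built it with 1024 characters. The output shape of the model is 1025. Since the check adds 1 to the alphabet size, I would expect 1024 + 1 = 1025 to match, but with 1023 labels the decoder gets 1024, so the alphabet appears to be one character short. I thought of a blank or unk token, but I am not sure whether that is the cause of the error.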
Did you ever encounter this?
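To narrow down where the missing label goes, a small script can count the labels in the alphabet file and compare the result with the model's output dimension. The following is a minimal sketch, not DeepSpeech code: the helper names `read_labels` and `check_alphabet` are hypothetical, and it assumes the alphabet file holds one label per line, that lines starting with `#` are comments, and that `\#` stands for a literal `#`. Under those assumptions, a `#` character written without the escape, or a label that appears twice, would be possible ways to end up one label short.

```python
from collections import Counter


def read_labels(lines):
    """Parse alphabet lines into labels.

    Assumed format: one label per line, '#' starts a comment line,
    and a line starting with a backslash followed by '#' is a literal '#'.
    """
    labels = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("\\#"):
            labels.append("#")   # escaped literal '#'
        elif line.startswith("#"):
            continue             # comment line, not a label
        else:
            labels.append(line)
    return labels


def check_alphabet(labels, class_dim):
    """Compare the label count against the model's output dimension.

    The +1 mirrors the decoder check (alphabet.GetSize()+1) == (class_dim).
    """
    counts = Counter(labels)
    expected = len(labels) + 1
    return {
        "num_labels": len(labels),
        "expected_class_dim": expected,
        "matches": expected == class_dim,
        "duplicates": sorted(label for label, n in counts.items() if n > 1),
        "empty_labels": counts.get("", 0),
    }
```

With a file path of your own (here the hypothetical `alphabet.txt`), calling `check_alphabet(read_labels(open("alphabet.txt", encoding="utf-8")), 1025)` reports whether the counts agree and lists any duplicated or empty labels worth inspecting.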