Closed FrontierDK closed 2 years ago
This looks like a size mismatch between the alphabet you supplied and the alphabet the model was trained with. Are you sure you transfer-trained on Danish and used the same (Danish) alphabet everywhere? You should either do that or "normalize" your Danish commands so they only use the English alphabet (e.g. use "a" instead of "æ").
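A minimal sketch of the "normalize" option, in case it helps someone reading later. The mapping below (æ to ae, ø to oe, å to aa) is an assumption based on the conventional ASCII spellings, not something Coqui STT prescribes, and the function name is hypothetical:

```python
# Hypothetical mapping: conventional ASCII spellings of the Danish letters.
DANISH_TO_ASCII = {"æ": "ae", "ø": "oe", "å": "aa"}


def normalize_transcript(text: str) -> str:
    """Lowercase a transcript and replace Danish letters with ASCII spellings."""
    text = text.lower()
    for danish, ascii_form in DANISH_TO_ASCII.items():
        text = text.replace(danish, ascii_form)
    return text


print(normalize_transcript("Rødgrød med fløde på"))
```

The same function would have to be applied to every transcript (training, dev, test, and the scorer text) so that all of them share one alphabet.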
Omitting the Danish letters isn't an option, as the SR will be used by random web visitors.
I have tried removing the alphabet.txt and using this, I get a new error:
I enter python -m coqui_stt_training.util.lm_optimize --scorer_path ~/kenlm.scorer --auto_input_dataset ~/talefiler/sample.csv --checkpoint_dir ~/coqui-stt-1.4.0-checkpoint --n_trials 6 --n_hidden 256 --lm_alpha_max 1 --lm_beta_max 2
I get ValueError: Cannot feed value of shape (32,) for Tensor 'layer_6/bias/Initializer/zeros:0', which has shape '(21,)'
The long log...
(coqui-stt-train-venv) bruger@kubuntudk1:~$ python -m coqui_stt_training.util.lm_optimize --scorer_path ~/kenlm.scorer --auto_input_dataset ~/talefiler/sample.csv --checkpoint_dir ~/coqui-stt-1.4.0-checkpoint --n_trials 6 --n_hidden 256 --lm_alpha_max 1 --lm_beta_max 2 > result.txt
[I 2022-07-06 09:58:59,594] A new study created in memory with name: no-name-557bbd71-6245-4c19-bf4a-e11f0f9da1f5
[W 2022-07-06 09:59:01,692] Trial 0 failed because of the following error: ValueError("Cannot feed value of shape (32,) for Tensor 'layer_6/bias/Initializer/zeros:0', which has shape '(21,)'")
Traceback (most recent call last):
File "/home/bruger/Python-3.7.6/coqui-stt-train-venv/lib/python3.7/site-packages/optuna/study/_optimize.py", line 213, in _run_trial
value_or_values = func(trial)
File "/home/bruger/Python-3.7.6/STT/training/coqui_stt_training/util/lm_optimize.py", line 39, in objective
current_samples = evaluate([test_file], create_model)
File "/home/bruger/Python-3.7.6/STT/training/coqui_stt_training/evaluate.py", line 99, in evaluate
load_graph_for_evaluation(session)
File "/home/bruger/Python-3.7.6/STT/training/coqui_stt_training/util/checkpoints.py", line 233, in load_graph_for_evaluation
_load_or_init_impl(session, methods, allow_drop_layers=False, silent=silent)
File "/home/bruger/Python-3.7.6/STT/training/coqui_stt_training/util/checkpoints.py", line 171, in _load_or_init_impl
silent=silent,
File "/home/bruger/Python-3.7.6/STT/training/coqui_stt_training/util/checkpoints.py", line 127, in _load_checkpoint
load_cudnn=Config.load_cudnn,
File "/home/bruger/Python-3.7.6/STT/training/coqui_stt_training/util/checkpoints.py", line 91, in _load_checkpoint_impl
v.load(ckpt.get_tensor(v.op.name), session=session)
File "/home/bruger/Python-3.7.6/coqui-stt-train-venv/lib/python3.7/site-packages/tensorflow_core/python/util/deprecation.py", line 324, in new_func
return func(*args, **kwargs)
File "/home/bruger/Python-3.7.6/coqui-stt-train-venv/lib/python3.7/site-packages/tensorflow_core/python/ops/variables.py", line 1033, in load
session.run(self.initializer, {self.initializer.inputs[1]: value})
File "/home/bruger/Python-3.7.6/coqui-stt-train-venv/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 956, in run
run_metadata_ptr)
File "/home/bruger/Python-3.7.6/coqui-stt-train-venv/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1156, in _run
(np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (32,) for Tensor 'layer_6/bias/Initializer/zeros:0', which has shape '(21,)'
Traceback (most recent call last):
File "/usr/local/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/local/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/bruger/Python-3.7.6/STT/training/coqui_stt_training/util/lm_optimize.py", line 97, in <module>
main()
File "/home/bruger/Python-3.7.6/STT/training/coqui_stt_training/util/lm_optimize.py", line 86, in main
results = compute_lm_optimization()
File "/home/bruger/Python-3.7.6/STT/training/coqui_stt_training/util/lm_optimize.py", line 59, in compute_lm_optimization
study.optimize(objective, n_jobs=1, n_trials=Config.n_trials)
File "/home/bruger/Python-3.7.6/coqui-stt-train-venv/lib/python3.7/site-packages/optuna/study/study.py", line 409, in optimize
show_progress_bar=show_progress_bar,
File "/home/bruger/Python-3.7.6/coqui-stt-train-venv/lib/python3.7/site-packages/optuna/study/_optimize.py", line 76, in _optimize
progress_bar=progress_bar,
File "/home/bruger/Python-3.7.6/coqui-stt-train-venv/lib/python3.7/site-packages/optuna/study/_optimize.py", line 163, in _optimize_sequential
trial = _run_trial(study, func, catch)
File "/home/bruger/Python-3.7.6/coqui-stt-train-venv/lib/python3.7/site-packages/optuna/study/_optimize.py", line 264, in _run_trial
raise func_err
File "/home/bruger/Python-3.7.6/coqui-stt-train-venv/lib/python3.7/site-packages/optuna/study/_optimize.py", line 213, in _run_trial
value_or_values = func(trial)
File "/home/bruger/Python-3.7.6/STT/training/coqui_stt_training/util/lm_optimize.py", line 39, in objective
current_samples = evaluate([test_file], create_model)
File "/home/bruger/Python-3.7.6/STT/training/coqui_stt_training/evaluate.py", line 99, in evaluate
load_graph_for_evaluation(session)
File "/home/bruger/Python-3.7.6/STT/training/coqui_stt_training/util/checkpoints.py", line 233, in load_graph_for_evaluation
_load_or_init_impl(session, methods, allow_drop_layers=False, silent=silent)
File "/home/bruger/Python-3.7.6/STT/training/coqui_stt_training/util/checkpoints.py", line 171, in _load_or_init_impl
silent=silent,
File "/home/bruger/Python-3.7.6/STT/training/coqui_stt_training/util/checkpoints.py", line 127, in _load_checkpoint
load_cudnn=Config.load_cudnn,
File "/home/bruger/Python-3.7.6/STT/training/coqui_stt_training/util/checkpoints.py", line 91, in _load_checkpoint_impl
v.load(ckpt.get_tensor(v.op.name), session=session)
File "/home/bruger/Python-3.7.6/coqui-stt-train-venv/lib/python3.7/site-packages/tensorflow_core/python/util/deprecation.py", line 324, in new_func
return func(*args, **kwargs)
File "/home/bruger/Python-3.7.6/coqui-stt-train-venv/lib/python3.7/site-packages/tensorflow_core/python/ops/variables.py", line 1033, in load
session.run(self.initializer, {self.initializer.inputs[1]: value})
File "/home/bruger/Python-3.7.6/coqui-stt-train-venv/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 956, in run
run_metadata_ptr)
File "/home/bruger/Python-3.7.6/coqui-stt-train-venv/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1156, in _run
(np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (32,) for Tensor 'layer_6/bias/Initializer/zeros:0', which has shape '(21,)'
Please read this: https://stt.readthedocs.io/en/latest/TRANSFER_LEARNING.html
Since this is my first Danish model, I can't use transfer learning.
Is there a way to have the bug fixed? I get that it's hard to fix a bug where one can't test, so here is my very simple dataset: training
Sorry, but this is not a bug, this is how deep learning models work. As indicated in the aforementioned resource, the alphabet is the last layer, your result, and (repeating from that resource) it is "crucial". Think of a model which is trained on English "yes" and "no" and you say "सेब" ("apple" in Hindi)...
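To make the "alphabet is the last layer" point concrete, here is a small sketch that relates an alphabet file to the shapes in the error message. It assumes the output layer has one unit per alphabet label plus one extra unit for the CTC blank, and that the alphabet file holds one label per line with "#" lines as comments. The helper names are hypothetical:

```python
def count_labels(alphabet_text: str) -> int:
    """Count labels in an alphabet file: one label per line, '#' lines are comments."""
    labels = []
    for line in alphabet_text.split("\n"):
        if line.startswith("#") or line == "":
            continue  # skip comment lines and the empty tail after the last newline
        labels.append(line)  # a line holding a single space is the space label
    return len(labels)


def output_layer_size(alphabet_text: str) -> int:
    """Assumed size of the final layer: one unit per label plus the CTC blank."""
    return count_labels(alphabet_text) + 1


letters = "abcdefghijklmnopqrstuvwxyz"
english = "# English alphabet\n \n" + "\n".join(letters) + "\n'\n"
danish = "# Danish alphabet\n \n" + "\n".join(letters + "æøå") + "\n'\n"

print(output_layer_size(english))  # 29
print(output_layer_size(danish))   # 32
```

Under those assumptions, a (32,) bias corresponds to a 31-label alphabet and a (29,) bias to the 28-label English one, so the two sides of the error were built from different alphabets.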
You cannot expect an English model to understand every language. For example, the whole Common Voice project is dedicated to this.
Every character (or group of characters) in the Latin alphabet is pronounced differently in other languages. Compare "ei" vs "ie" in English and German, for example. Therefore, for a new language, you either generate a model or find an already generated one.
I urge you to generate a Danish model...
Since this is my first Danish model, I can't use transfer learning.
No, you will transfer-learn from the available English model using, e.g., the Danish dataset in Common Voice. Read the documents, then come to Matrix; everybody will help you.
I urge you to generate a Danish model...
But how do I create the first model? It seems like a chicken-and-egg problem... I need a Danish model to train a Danish model. Somehow, someone created the first...
Transfer learning in voice AI (STT) is defined as "transferring the knowledge gained from another language to the language at hand". As explained in the document I shared, if you have the same alphabet, you "fine-tune"; otherwise, you drop the last 2-3 layers from the model and teach it what is different in your new language.
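The decision described here can be sketched as a tiny helper. This is only an illustration of the rule of thumb above; the function name and the returned labels are hypothetical and not part of Coqui STT:

```python
def transfer_plan(source_alphabet, target_alphabet):
    """Return ("fine-tune", []) for identical alphabets, else ("drop-layers", new_chars)."""
    # Label order matters, because each position maps to one output unit.
    if list(source_alphabet) == list(target_alphabet):
        return "fine-tune", []
    new_chars = sorted(set(target_alphabet) - set(source_alphabet))
    return "drop-layers", new_chars


english = list(" abcdefghijklmnopqrstuvwxyz'")
danish = english + ["æ", "ø", "å"]

print(transfer_plan(english, english))
print(transfer_plan(english, danish))
```

For English to Danish the alphabets differ, so the output layers are dropped and retrained while the lower layers keep what they learned from English audio.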
Resources for you:
Also this one of course: https://github.com/coqui-ai/STT/blob/main/notebooks/easy_transfer_learning.ipynb
HarikalarKutusu, you have made a very good video - thank you for sharing it :)
Specifying different loading and saving folders kind of works, but I get this warning: WARNING: You specified different values for --load_checkpoint_dir and --save_checkpoint_dir, but you are running training and testing in a single invocation. The testing phase has been disable to prevent unexpected behavior of testing on the base checkpoint rather than the trained one. You should train and evaluate in two separate commands, specifying the correct --load_checkpoint_dir in both cases.
but you are running training and testing in a single invocation.
Yes, there is such an issue in transfer learning. Therefore you need to first train, then evaluate separately. Please check the notebook I shared (third point above) for .train and .evaluate calls.
Basically, if you don't give the test set to .train, it only trains. That way the warning goes away...
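A sketch of the same two-step idea on the command line, written as a Python helper that only builds the two argument lists. The flags are the ones mentioned in this thread plus --drop_source_layers and --dev_files from the transfer learning documentation. The save folder and the train/dev file names are hypothetical, and the value 1 for dropped layers is just a placeholder:

```python
import os


def build_commands(load_dir, save_dir, alphabet, train_csv, dev_csv, test_csv,
                   drop_layers=1):
    """Build a train command (no test set) and a separate evaluate command."""
    train_cmd = [
        "python3", "-m", "coqui_stt_training.train",
        "--load_checkpoint_dir", load_dir,
        "--save_checkpoint_dir", save_dir,
        "--drop_source_layers", str(drop_layers),
        "--alphabet_config_path", alphabet,
        "--train_files", train_csv,
        "--dev_files", dev_csv,
    ]
    # Evaluate the newly trained checkpoint, not the base one.
    eval_cmd = [
        "python3", "-m", "coqui_stt_training.evaluate",
        "--checkpoint_dir", save_dir,
        "--alphabet_config_path", alphabet,
        "--test_files", test_csv,
    ]
    return train_cmd, eval_cmd


home = os.path.expanduser("~")
train_cmd, eval_cmd = build_commands(
    load_dir=os.path.join(home, "coqui-stt-1.4.0-checkpoint"),
    save_dir=os.path.join(home, "danish", "checkpoints"),  # hypothetical folder
    alphabet=os.path.join(home, "danish", "alphabet.txt"),
    train_csv=os.path.join(home, "danish", "train.csv"),   # hypothetical file
    dev_csv=os.path.join(home, "danish", "dev.csv"),       # hypothetical file
    test_csv=os.path.join(home, "danish", "test.csv"),
)
print(" ".join(train_cmd))
print(" ".join(eval_cmd))
```

Each list could then be passed to subprocess.run(cmd, check=True), first the training one and then the evaluation one. Any model-size flag used for training (such as --n_hidden) would need to be repeated for evaluation.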
It seems this kinda has the same bug. I can do the training (epoch 300), but when I run the evaluate process...I get the same error:
I enter python3 -m coqui_stt_training.evaluate --show_progressbar true --train_cudnn false --test_files ~/danish/test.csv --checkpoint_dir ~/coqui-stt-1.4.0-checkpoint --alphabet_config_path ~/danish/alphabet.txt
I get ValueError: Cannot feed value of shape (1024,) for Tensor 'cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/bias/Initializer/Const:0', which has shape '(8192,)'
What was n_hidden during training? Please read:
It was 2048 - like the original minimalistic checkpoint from Coqui. If I try entering any other value, I get this warning: W WARNING: --n_hidden value (256) is different from value found in checkpoint (2048).
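One way to read the shapes in that ValueError, assuming the LSTM bias stores four gate vectors of n_hidden values each (the usual layout for an LSTM cell). The helper name is hypothetical:

```python
def n_hidden_from_lstm_bias(bias_length: int) -> int:
    """Infer n_hidden from an LSTM bias length, assuming 4 gates per cell."""
    if bias_length % 4 != 0:
        raise ValueError("LSTM bias length should be a multiple of 4")
    return bias_length // 4


print(n_hidden_from_lstm_bias(1024))  # 256
print(n_hidden_from_lstm_bias(8192))  # 2048
```

In the traceback the value being fed comes from the checkpoint and the tensor belongs to the freshly built graph. Under the assumption above, that would suggest the checkpoint on disk was saved with n_hidden 256 while the evaluation graph was built with 2048, so passing the same --n_hidden to both training and evaluation is worth checking.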
I can export and use the model for recognition, albeit with a high WER (more than 20%).
albeit with a high WER (more than 20%).
This seems normal to me. It depends on the amount of the data in the dataset and its quality.
Glad that you got it working. Once you have solved the problems, perhaps remove the "bug" tag and close this?
HarikalarKutusu, I'll do that. I have worked around the bug by letting Coqui create an alphabet.txt on its own and then using that. A small dataset doesn't use all the letters, and using a full alphabet seems to crash Coqui. But by working around it, I have been able to create working scorers and models without even using transfer learning. This, plus using a low n_hidden value, got me down to a 0.5% error rate.
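For anyone who wants to see which characters a small dataset actually uses, here is a rough sketch that collects them from the transcript column of a training CSV. It assumes the usual wav_filename, wav_filesize, transcript columns. The sample rows are made up, and this is not the code Coqui itself uses to generate the alphabet:

```python
import csv
import io


def alphabet_from_csv(csv_file) -> list:
    """Collect the sorted set of characters that occur in the transcript column."""
    chars = set()
    for row in csv.DictReader(csv_file):
        chars.update(row["transcript"])
    return sorted(chars)


# Made-up two-line dataset, standing in for a real train.csv.
sample = io.StringIO(
    "wav_filename,wav_filesize,transcript\n"
    "a.wav,1000,tænd lyset\n"
    "b.wav,1200,sluk lyset\n"
)
alphabet = alphabet_from_csv(sample)
print(alphabet)
print(len(alphabet) + 1)  # labels plus one CTC blank, under the usual assumption
```

A 15-line dataset will typically yield fewer labels than the full Danish alphabet, which would explain why a checkpoint trained on the generated alphabet and a graph built from a full alphabet.txt disagree about the output layer size.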
Thank you very much for your help, your video is very well made and I appreciate your help.
Hi all :)
I am using commands which do work on a US English dataset, and I am now trying them on a very small dataset (only 15 lines/wave files), which has Danish letters like æ, ø and å. I synced with GitHub about a week ago.
When running this command: python3 ~/Python-3.7.6/STT/lm_optimizer.py --alphabet_config_path ~/Python-3.7.6/STT/data/alphabet.txt --scorer_path ~/kenlm.scorer --test_files ~/talefiler/train.csv ~/talefiler/dev.csv --checkpoint_dir ~/coqui-stt-1.4.0-checkpoint --n_trials 6 --n_hidden 256 --lm_alpha_max 5 --lm_beta_max 5
I get this error: ValueError: Cannot feed value of shape (32,) for Tensor 'layer_6/bias/Initializer/zeros:0', which has shape '(29,)'
Longer log here: