Alphabet issue: "ValueError: Cannot feed value of shape (29,) for Tensor ‘layer_6/bias/Initializer/zeros:0’, which has shape ‘(35,)’"

dungeonstudent1 commented 3 years ago

Input:

Continuing training from the DeepSpeech 0.9.3 model checkpoints with a Russian alphabet and Russian train/test .wav files
OS: Ubuntu 20.04
Tensorflow: tensorflow:1.15.2-gpu-py3
Python: 3.6.9.

According to the documentation, setting --drop_source_layers to a value from 1 to 5 should avoid the error caused by an alphabet that does not match the checkpoint's geometry. Quoting: "All dropped layers will be reinitialized, and (crucially) the output layer will be defined to match your supplied target alphabet."

However, it looks as if the Russian alphabet cannot be used to continue training from the English DeepSpeech 0.9.3 checkpoints. Is this a bug? Or is it impossible to continue training with Russian files, so that I need to initialize the weights randomly instead?

Thank you for your time. Error logs:

I FINISHED optimization in 2:05:52.762198
I Loading best validating checkpoint from /data/DeepSpeech/load_check/deepspeech-0.9.3-checkpoint/best_dev-1466475
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/bias
I Loading variable from checkpoint: cudnn_lstm/rnn/multi_rnn_cell/cell_0/cudnn_compatible_lstm_cell/kernel
I Loading variable from checkpoint: global_step
I Loading variable from checkpoint: layer_1/bias
I Loading variable from checkpoint: layer_1/weights
I Loading variable from checkpoint: layer_2/bias
I Loading variable from checkpoint: layer_2/weights
I Loading variable from checkpoint: layer_3/bias
I Loading variable from checkpoint: layer_3/weights
I Loading variable from checkpoint: layer_5/bias
I Loading variable from checkpoint: layer_5/weights
I Loading variable from checkpoint: layer_6/bias
Traceback (most recent call last):
  File "DeepSpeech.py", line 12, in <module>
    ds_train.run_script()
  File "/data/DeepSpeech/training/deepspeech_training/train.py", line 982, in run_script
    absl.app.run(main)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "/data/DeepSpeech/training/deepspeech_training/train.py", line 958, in main
    test()
  File "/data/DeepSpeech/training/deepspeech_training/train.py", line 682, in test
    samples = evaluate(FLAGS.test_files.split(','), create_model)
  File "/data/DeepSpeech/training/deepspeech_training/evaluate.py", line 87, in evaluate
    load_graph_for_evaluation(session)
  File "/data/DeepSpeech/training/deepspeech_training/util/checkpoints.py", line 151, in load_graph_for_evaluation
    _load_or_init_impl(session, methods, allow_drop_layers=False)
  File "/data/DeepSpeech/training/deepspeech_training/util/checkpoints.py", line 98, in _load_or_init_impl
    return _load_checkpoint(session, ckpt_path, allow_drop_layers, allow_lr_init=allow_lr_init)
  File "/data/DeepSpeech/training/deepspeech_training/util/checkpoints.py", line 71, in _load_checkpoint
    v.load(ckpt.get_tensor(v.op.name), session=session)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/util/deprecation.py", line 324, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/variables.py", line 1033, in load
    session.run(self.initializer, {self.initializer.inputs[1]: value})
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 956, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1156, in _run
    (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (29,) for Tensor 'layer_6/bias/Initializer/zeros:0', which has shape '(35,)'
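
The two shapes in the error are consistent with two different alphabets. A minimal sketch, assuming the output layer (layer_6) has one unit per alphabet label plus one for the CTC blank: the English alphabet shipped with DeepSpeech (a to z, space, apostrophe) gives 29, and a Russian alphabet of 33 letters plus the space gives 35. The Russian label list below is a hypothetical stand-in; the alphabet file actually used in this run is not shown in the thread.

```python
import string


def output_layer_size(labels):
    """One output unit per distinct label, plus one for the CTC blank (assumed)."""
    return len(set(labels)) + 1


# English labels: 26 letters, the space, and the apostrophe.
english = list(string.ascii_lowercase) + [" ", "'"]

# Hypothetical Russian labels: 32 letters from U+0430 to U+044F, plus "ё" and the space.
russian = [chr(c) for c in range(ord("а"), ord("я") + 1)] + ["ё", " "]

print(output_layer_size(english))  # 29, the shape stored in the checkpoint
print(output_layer_size(russian))  # 35, the shape of the newly built graph
```

Read this way, the traceback says that the test phase built a 35-unit output layer for the Russian alphabet and then tried to fill it from the original English checkpoint under load_check/, with allow_drop_layers=False.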

ftyers commented 3 years ago

Thanks for the report. This is a known issue when the test phase runs directly after training with transfer learning. You can stop the test and rerun it separately from the checkpoint, and it should work.
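
One way to split the run as described is sketched below. It is not the command posted in this thread. It assumes that DeepSpeech.py only runs the test phase when --test_files is given, and that the fine-tuned checkpoints were saved to a separate directory. All paths and CSV names are hypothetical placeholders; any other flags from the original training run would stay as they were.

```python
import shlex

# Hypothetical paths; replace them with your own.
ALPHABET = "data/alphabet_ru.txt"
SOURCE_CKPT = "load_check/deepspeech-0.9.3-checkpoint"  # English release checkpoint
FINETUNED_CKPT = "checkpoints/ru"                       # where fine-tuning saves


def train_command():
    """Fine-tuning run. No --test_files, so no test phase follows training."""
    return [
        "python", "DeepSpeech.py",
        "--train_files", "data/ru/train.csv",
        "--dev_files", "data/ru/dev.csv",
        "--alphabet_config_path", ALPHABET,
        "--load_checkpoint_dir", SOURCE_CKPT,
        "--save_checkpoint_dir", FINETUNED_CKPT,
        "--drop_source_layers", "1",
    ]


def test_command():
    """Separate test run. No --train_files; loads the fine-tuned checkpoint."""
    return [
        "python", "DeepSpeech.py",
        "--test_files", "data/ru/test.csv",
        "--alphabet_config_path", ALPHABET,
        "--checkpoint_dir", FINETUNED_CKPT,
    ]


# Print both commands in a form that can be pasted into a shell.
for argv in (train_command(), test_command()):
    print(" ".join(shlex.quote(arg) for arg in argv))
```

The second command reads the checkpoint written by fine-tuning, whose output layer already matches the Russian alphabet, so no layers need to be dropped at test time.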

dungeonstudent1 commented 3 years ago

@ftyers Dear Mr. Tyers, thank you for a swift reply! Could you be so kind as to clarify the steps for stopping the test and rerunning it independently? Currently, the testing begins after training ends with "FINISHED OPTIMIZATION" automatically. Thank you for a moment of your time.

ftyers commented 3 years ago

Here is the command I use. But for further assistance, please get in touch on Matrix or Discourse.

dungeonstudent1 commented 3 years ago

@ftyers Dear Mr. Tyers, thank you very much for the script, your attention and your time!

mozilla / DeepSpeech

Alphabet issue: "ValueError: Cannot feed value of shape (29,) for Tensor ‘layer_6/bias/Initializer/zeros:0’, which has shape ‘(35,)’" #3665