TensorSpeech / TensorFlowTTS

TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supports English, French, Korean, Chinese, and German; easy to adapt to other languages)
https://tensorspeech.github.io/TensorFlowTTS/
Apache License 2.0

Error when training multi-speaker model on LibriTTS #751

Closed Tian14267 closed 2 years ago

Tian14267 commented 2 years ago

I get this error when training on the LibriTTS (multi-speaker) data:

2022-03-16 14:28:11.050591: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:175] Filling up shuffle buffer (this may take a while): 784 of 3001
2022-03-16 14:28:21.066931: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:175] Filling up shuffle buffer (this may take a while): 1644 of 3001
2022-03-16 14:28:31.064652: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:175] Filling up shuffle buffer (this may take a while): 2501 of 3001
2022-03-16 14:28:37.091093: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:228] Shuffle buffer filled.
2022-03-16 14:28:59.205111: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1979] Converted 1185/11324 nodes to float16 precision using 125 cast(s) to float16 (excluding Const and Variable casts)
2022-03-16 14:29:01.199869: W tensorflow/core/grappler/costs/op_level_cost_estimator.cc:689] Error in PredictCost() for the op: op: "Softmax" attr { key: "T" value { type: DT_HALF } } inputs { dtype: DT_HALF shape { unknown_rank: true } } device { type: "GPU" vendor: "NVIDIA" model: "Tesla V100-PCIE-32GB" frequency: 1380 num_cores: 80 environment { key: "architecture" value: "7.0" } environment { key: "cuda" value: "11020" } environment { key: "cudnn" value: "8100" } num_registers: 65536 l1_cache_size: 24576 l2_cache_size: 6291456 shared_memory_size_per_multiprocessor: 98304 memory_size: 32503234560 bandwidth: 898048000 } outputs { dtype: DT_HALF shape { unknown_rank: true } }
2022-03-16 14:29:01.204394: W tensorflow/core/grappler/costs/op_level_cost_estimator.cc:689] Error in PredictCost() for the op: op: "Softmax" attr { key: "T" value { type: DT_HALF } } inputs { dtype: DT_HALF shape { unknown_rank: true } } device { type: "GPU" vendor: "NVIDIA" model: "Tesla V100-PCIE-32GB" frequency: 1380 num_cores: 80 environment { key: "architecture" value: "7.0" } environment { key: "cuda" value: "11020" } environment { key: "cudnn" value: "8100" } num_registers: 65536 l1_cache_size: 24576 l2_cache_size: 6291456 shared_memory_size_per_multiprocessor: 98304 memory_size: 32503234560 bandwidth: 898048000 } outputs { dtype: DT_HALF shape { unknown_rank: true } }
2022-03-16 14:29:01.208638: W tensorflow/core/grappler/costs/op_level_cost_estimator.cc:689] Error in PredictCost() for the op: op: "Softmax" attr { key: "T" value { type: DT_HALF } } inputs { dtype: DT_HALF shape { unknown_rank: true } } device { type: "GPU" vendor: "NVIDIA" model: "Tesla V100-PCIE-32GB" frequency: 1380 num_cores: 80 environment { key: "architecture" value: "7.0" } environment { key: "cuda" value: "11020" } environment { key: "cudnn" value: "8100" } num_registers: 65536 l1_cache_size: 24576 l2_cache_size: 6291456 shared_memory_size_per_multiprocessor: 98304 memory_size: 32503234560 bandwidth: 898048000 } outputs { dtype: DT_HALF shape { unknown_rank: true } }
2022-03-16 14:29:01.212858: W tensorflow/core/grappler/costs/op_level_cost_estimator.cc:689] Error in PredictCost() for the op: op: "Softmax" attr { key: "T" value { type: DT_HALF } } inputs { dtype: DT_HALF shape { unknown_rank: true } } device { type: "GPU" vendor: "NVIDIA" model: "Tesla V100-PCIE-32GB" frequency: 1380 num_cores: 80 environment { key: "architecture" value: "7.0" } environment { key: "cuda" value: "11020" } environment { key: "cudnn" value: "8100" } num_registers: 65536 l1_cache_size: 24576 l2_cache_size: 6291456 shared_memory_size_per_multiprocessor: 98304 memory_size: 32503234560 bandwidth: 898048000 } outputs { dtype: DT_HALF shape { unknown_rank: true } }
2022-03-16 14:29:03.176784: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1979] Converted 0/9388 nodes to float16 precision using 0 cast(s) to float16 (excluding Const and Variable casts)
2022-03-16 14:29:05.830962: W tensorflow/core/framework/op_kernel.cc:1680] Invalid argument: required broadcastable shapes
Traceback (most recent call last):
  File "train_fastspeech2_libritts.py", line 498, in <module>
    main()
  File "train_fastspeech2_libritts.py", line 490, in main
    resume=args.resume,
  File "/ultra/fffan/0_TTS/TensorFlowTTS/TensorFlowTTS/tensorflow_tts/trainers/base_trainer.py", line 1010, in fit
    self.run()
  File "/ultra/fffan/0_TTS/TensorFlowTTS/TensorFlowTTS/tensorflow_tts/trainers/base_trainer.py", line 104, in run
    self._train_epoch()
  File "/ultra/fffan/0_TTS/TensorFlowTTS/TensorFlowTTS/tensorflow_tts/trainers/base_trainer.py", line 126, in _train_epoch
    self._train_step(batch)
  File "/ultra/fffan/0_TTS/TensorFlowTTS/TensorFlowTTS/tensorflow_tts/trainers/base_trainer.py", line 782, in _train_step
    self.one_step_forward(batch)
  File "/root/anaconda3/envs/tf_tts/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 885, in __call__
    result = self._call(*args, **kwds)
  File "/root/anaconda3/envs/tf_tts/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 950, in _call
    return self._stateless_fn(*args, **kwds)
  File "/root/anaconda3/envs/tf_tts/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 3040, in __call__
    filtered_flat_args, captured_inputs=graph_function.captured_inputs)  # pylint: disable=protected-access
  File "/root/anaconda3/envs/tf_tts/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 1964, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "/root/anaconda3/envs/tf_tts/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 596, in call
    ctx=ctx)
  File "/root/anaconda3/envs/tf_tts/lib/python3.7/site-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument:  required broadcastable shapes
     [[node tf_fast_speech2/add_1 (defined at /ultra/fffan/0_TTS/TensorFlowTTS/TensorFlowTTS/tensorflow_tts/models/fastspeech2.py:185) ]]
     [[tf_fast_speech2/length_regulator/while/loop_body_control/_117/_139]]
  (1) Invalid argument:  required broadcastable shapes
     [[node tf_fast_speech2/add_1 (defined at /ultra/fffan/0_TTS/TensorFlowTTS/TensorFlowTTS/tensorflow_tts/models/fastspeech2.py:185) ]]
0 successful operations.
0 derived errors ignored. [Op:__inference__one_step_forward_33805]

Errors may have originated from an input operation.
Input Source operations connected to node tf_fast_speech2/add_1:
 tf_fast_speech2/encoder/layer_._3/mul (defined at /ultra/fffan/0_TTS/TensorFlowTTS/TensorFlowTTS/tensorflow_tts/models/fastspeech.py:413)

Input Source operations connected to node tf_fast_speech2/add_1:
 tf_fast_speech2/encoder/layer_._3/mul (defined at /ultra/fffan/0_TTS/TensorFlowTTS/TensorFlowTTS/tensorflow_tts/models/fastspeech.py:413)

Function call stack:
_one_step_forward -> _one_step_forward

[train]:   0%|          | 0/150000 [01:05<?, ?it/s]

Does anybody know how to solve this?
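
For reference, "required broadcastable shapes" means that an elementwise op (here the node `tf_fast_speech2/add_1`) received two tensors whose shapes cannot be broadcast against each other. The log does not show the actual shapes, so the sketch below only illustrates the broadcasting rule itself in plain Python. The helper name `broadcastable` and all shapes are hypothetical and are not taken from the TensorFlowTTS code.

```python
from itertools import zip_longest


def broadcastable(shape_a, shape_b):
    """Return True if two static shapes satisfy NumPy-style broadcasting."""
    # Compare dimensions from the trailing end; a missing dimension counts as 1.
    pairs = zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1)
    for dim_a, dim_b in pairs:
        # Two dimensions are compatible if they are equal or one of them is 1.
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            return False
    return True


# Hypothetical [batch, length, hidden] shapes, chosen only for illustration.
encoder_out = (16, 87, 384)
mismatched = (16, 91, 384)   # length axis differs: cannot be added
per_utterance = (16, 1, 384)  # size-1 length axis: broadcasts over length

print(broadcastable(encoder_out, mismatched))     # False
print(broadcastable(encoder_out, per_utterance))  # True
```

Printing the shapes of the tensors that feed the failing add (for example with `tf.print(tf.shape(x))`) and comparing them along each axis in this way would show which axis disagrees.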