Closed Tian14267 closed 2 years ago
I get this error when training on the LibriTTS (multi-speaker) dataset:
2022-03-16 14:28:11.050591: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:175] Filling up shuffle buffer (this may take a while): 784 of 3001
2022-03-16 14:28:21.066931: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:175] Filling up shuffle buffer (this may take a while): 1644 of 3001
2022-03-16 14:28:31.064652: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:175] Filling up shuffle buffer (this may take a while): 2501 of 3001
2022-03-16 14:28:37.091093: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:228] Shuffle buffer filled.
2022-03-16 14:28:59.205111: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1979] Converted 1185/11324 nodes to float16 precision using 125 cast(s) to float16 (excluding Const and Variable casts)
2022-03-16 14:29:01.199869: W tensorflow/core/grappler/costs/op_level_cost_estimator.cc:689] Error in PredictCost() for the op: op: "Softmax" attr { key: "T" value { type: DT_HALF } } inputs { dtype: DT_HALF shape { unknown_rank: true } } device { type: "GPU" vendor: "NVIDIA" model: "Tesla V100-PCIE-32GB" frequency: 1380 num_cores: 80 environment { key: "architecture" value: "7.0" } environment { key: "cuda" value: "11020" } environment { key: "cudnn" value: "8100" } num_registers: 65536 l1_cache_size: 24576 l2_cache_size: 6291456 shared_memory_size_per_multiprocessor: 98304 memory_size: 32503234560 bandwidth: 898048000 } outputs { dtype: DT_HALF shape { unknown_rank: true } }
2022-03-16 14:29:01.204394: W tensorflow/core/grappler/costs/op_level_cost_estimator.cc:689] Error in PredictCost() for the op: op: "Softmax" attr { key: "T" value { type: DT_HALF } } inputs { dtype: DT_HALF shape { unknown_rank: true } } device { type: "GPU" vendor: "NVIDIA" model: "Tesla V100-PCIE-32GB" frequency: 1380 num_cores: 80 environment { key: "architecture" value: "7.0" } environment { key: "cuda" value: "11020" } environment { key: "cudnn" value: "8100" } num_registers: 65536 l1_cache_size: 24576 l2_cache_size: 6291456 shared_memory_size_per_multiprocessor: 98304 memory_size: 32503234560 bandwidth: 898048000 } outputs { dtype: DT_HALF shape { unknown_rank: true } }
2022-03-16 14:29:01.208638: W tensorflow/core/grappler/costs/op_level_cost_estimator.cc:689] Error in PredictCost() for the op: op: "Softmax" attr { key: "T" value { type: DT_HALF } } inputs { dtype: DT_HALF shape { unknown_rank: true } } device { type: "GPU" vendor: "NVIDIA" model: "Tesla V100-PCIE-32GB" frequency: 1380 num_cores: 80 environment { key: "architecture" value: "7.0" } environment { key: "cuda" value: "11020" } environment { key: "cudnn" value: "8100" } num_registers: 65536 l1_cache_size: 24576 l2_cache_size: 6291456 shared_memory_size_per_multiprocessor: 98304 memory_size: 32503234560 bandwidth: 898048000 } outputs { dtype: DT_HALF shape { unknown_rank: true } }
2022-03-16 14:29:01.212858: W tensorflow/core/grappler/costs/op_level_cost_estimator.cc:689] Error in PredictCost() for the op: op: "Softmax" attr { key: "T" value { type: DT_HALF } } inputs { dtype: DT_HALF shape { unknown_rank: true } } device { type: "GPU" vendor: "NVIDIA" model: "Tesla V100-PCIE-32GB" frequency: 1380 num_cores: 80 environment { key: "architecture" value: "7.0" } environment { key: "cuda" value: "11020" } environment { key: "cudnn" value: "8100" } num_registers: 65536 l1_cache_size: 24576 l2_cache_size: 6291456 shared_memory_size_per_multiprocessor: 98304 memory_size: 32503234560 bandwidth: 898048000 } outputs { dtype: DT_HALF shape { unknown_rank: true } }
2022-03-16 14:29:03.176784: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1979] Converted 0/9388 nodes to float16 precision using 0 cast(s) to float16 (excluding Const and Variable casts)
2022-03-16 14:29:05.830962: W tensorflow/core/framework/op_kernel.cc:1680] Invalid argument: required broadcastable shapes
Traceback (most recent call last):
  File "train_fastspeech2_libritts.py", line 498, in <module>
    main()
  File "train_fastspeech2_libritts.py", line 490, in main
    resume=args.resume,
  File "/ultra/fffan/0_TTS/TensorFlowTTS/TensorFlowTTS/tensorflow_tts/trainers/base_trainer.py", line 1010, in fit
    self.run()
  File "/ultra/fffan/0_TTS/TensorFlowTTS/TensorFlowTTS/tensorflow_tts/trainers/base_trainer.py", line 104, in run
    self._train_epoch()
  File "/ultra/fffan/0_TTS/TensorFlowTTS/TensorFlowTTS/tensorflow_tts/trainers/base_trainer.py", line 126, in _train_epoch
    self._train_step(batch)
  File "/ultra/fffan/0_TTS/TensorFlowTTS/TensorFlowTTS/tensorflow_tts/trainers/base_trainer.py", line 782, in _train_step
    self.one_step_forward(batch)
  File "/root/anaconda3/envs/tf_tts/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 885, in __call__
    result = self._call(*args, **kwds)
  File "/root/anaconda3/envs/tf_tts/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 950, in _call
    return self._stateless_fn(*args, **kwds)
  File "/root/anaconda3/envs/tf_tts/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 3040, in __call__
    filtered_flat_args, captured_inputs=graph_function.captured_inputs)  # pylint: disable=protected-access
  File "/root/anaconda3/envs/tf_tts/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 1964, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "/root/anaconda3/envs/tf_tts/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 596, in call
    ctx=ctx)
  File "/root/anaconda3/envs/tf_tts/lib/python3.7/site-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument: required broadcastable shapes
    [[node tf_fast_speech2/add_1 (defined at /ultra/fffan/0_TTS/TensorFlowTTS/TensorFlowTTS/tensorflow_tts/models/fastspeech2.py:185) ]]
    [[tf_fast_speech2/length_regulator/while/loop_body_control/_117/_139]]
  (1) Invalid argument: required broadcastable shapes
    [[node tf_fast_speech2/add_1 (defined at /ultra/fffan/0_TTS/TensorFlowTTS/TensorFlowTTS/tensorflow_tts/models/fastspeech2.py:185) ]]
0 successful operations.
0 derived errors ignored. [Op:__inference__one_step_forward_33805]
Errors may have originated from an input operation.
Input Source operations connected to node tf_fast_speech2/add_1:
  tf_fast_speech2/encoder/layer_._3/mul (defined at /ultra/fffan/0_TTS/TensorFlowTTS/TensorFlowTTS/tensorflow_tts/models/fastspeech.py:413)
Input Source operations connected to node tf_fast_speech2/add_1:
  tf_fast_speech2/encoder/layer_._3/mul (defined at /ultra/fffan/0_TTS/TensorFlowTTS/TensorFlowTTS/tensorflow_tts/models/fastspeech.py:413)
Function call stack:
_one_step_forward -> _one_step_forward
[train]: 0%| | 0/150000 [01:05<?, ?it/s]
Does anybody know how to solve this?
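For context on the message itself: `required broadcastable shapes` is what TensorFlow reports when an element-wise op, such as the add at `tf_fast_speech2/add_1`, receives two tensors whose dimensions neither match nor equal 1. Below is a minimal sketch of that rule in plain Python. The function name `broadcastable` and the example shapes are hypothetical, chosen only for illustration; they are not taken from the log or from the TensorFlowTTS code, and the sketch does not identify which inputs mismatch in this run.

```python
from itertools import zip_longest


def broadcastable(shape_a, shape_b):
    """Return True if two static shapes satisfy the usual broadcasting rule.

    Shapes are compared from the trailing dimension backwards; each pair of
    dimensions must be equal, or one of them must be 1. Missing leading
    dimensions are treated as 1.
    """
    pairs = zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1)
    return all(a == b or a == 1 or b == 1 for a, b in pairs)


# Hypothetical (batch, sequence_length, hidden_size) shapes.
encoder_states = (16, 87, 384)
matching_feature = (16, 87, 384)
shorter_feature = (16, 85, 384)   # sequence length differs: 85 vs 87

print(broadcastable(encoder_states, matching_feature))
print(broadcastable(encoder_states, shorter_feature))
```

If the two operands of the failing add differ along one axis in this way, printing their shapes just before the add in eager mode would show which axis is involved.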