Closed godspirit00 closed 2 years ago
This bug comes from the data, not from the model itself. I think you need to check which samples cause the problem, using the code below:
for data in tqdm(train_dataloader):
    o = fastspeech(**data, training=True)
Note that you need to set batch_size = 1 so you can easily see which samples are the real problem :D
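One way to run this check without stopping at the first failure is to wrap the forward pass in a try/except and record the utt_ids of every batch that raises. This is a minimal sketch, not code from the repository. `find_failing_batches`, `forward_fn`, and the toy batches are hypothetical names made up for illustration, and it assumes each batch is a dict containing a `utt_ids` entry.

```python
def find_failing_batches(batches, forward_fn):
    """Run forward_fn on every batch and collect the ones that raise.

    `batches` is any iterable of dicts (e.g. a dataloader with batch_size = 1).
    `forward_fn` is a hypothetical callable wrapping the model call, e.g.
    lambda data: fastspeech(**data, training=True).
    """
    failures = []
    for index, data in enumerate(batches):
        try:
            forward_fn(data)
        except Exception as err:  # TensorFlow op errors are Exception subclasses
            failures.append((index, data.get("utt_ids"), str(err)))
    return failures


# Toy stand-ins for the real dataloader and model (assumption, illustration only).
toy_batches = [
    {"utt_ids": ["ok_1"], "n_tokens": 5, "n_f0": 5},
    {"utt_ids": ["bad_1"], "n_tokens": 7, "n_f0": 4},
]


def toy_forward(data):
    # Mimic a shape mismatch between the token axis and the f0 axis.
    if data["n_tokens"] != data["n_f0"]:
        raise ValueError("required broadcastable shapes")


failures = find_failing_batches(toy_batches, toy_forward)
for index, utt_ids, message in failures:
    print(index, utt_ids, message)
```

With batch_size = 1, each recorded entry corresponds to exactly one utterance, so the list of utt_ids can be used to inspect or exclude those samples.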
@dathudeptrai Sorry for the late reply.
I inserted the code you provided into train_fastspeech2.py after line 366, and added print(o). The output is as follows:
{'utt_ids': <tf.Tensor: shape=(16,), dtype=string, numpy=
array([b'HIPS-0848-08', b'SLAT297-004-03', b'SCIENCE-15354-01',
b'NYT031-040-01', b'TIM_918', b'TIM_663', b'HIPS-0544-01',
b'TIM_960', b'WAOPF-0323-03', b'ARC_087', b'WKRWTA-0320-00',
b'RURAL-04942', b'TIM_366', b'YOYT-0106-01', b'SCIENCE-14201',
b'SLAT059-004-04'], dtype=object)>, 'input_ids': <tf.Tensor: shape=(16, 129), dtype=int32, numpy=
array([[60, 46, 57, ..., 0, 0, 0],
[60, 42, 11, ..., 0, 0, 0],
[56, 52, 11, ..., 0, 0, 0],
...,
[52, 49, 41, ..., 0, 0, 0],
[57, 52, 11, ..., 0, 0, 0],
[57, 45, 42, ..., 0, 0, 0]], dtype=int32)>, 'speaker_ids': <tf.Tensor: shape=(16,), dtype=int32, numpy=array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=int32)>, 'duration_gts': <tf.Tensor: shape=(16, 92), dtype=int32, numpy=
array([[15, 7, 4, ..., 0, 0, 0],
[14, 6, 5, ..., 0, 0, 0],
[ 3, 16, 4, ..., 0, 0, 0],
...,
[10, 8, 5, ..., 0, 0, 0],
[ 8, 7, 3, ..., 0, 0, 0],
[ 9, 5, 6, ..., 0, 0, 0]], dtype=int32)>, 'f0_gts': <tf.Tensor: shape=(16, 92), dtype=float32, numpy=
array([[ 0. , 0.8246144 , 1.2876985 , ..., 0. ,
0. , 0. ],
[-0.5796331 , 0. , 1.1341187 , ..., 0. ,
0. , 0. ],
[ 0. , 1.0941867 , 0. , ..., 0. ,
0. , 0. ],
...,
[ 0. , 0.449585 , 0.8970453 , ..., 0. ,
0. , 0. ],
[-0.34129396, -0.614922 , -0.27131552, ..., 0. ,
0. , 0. ],
[ 0.45157978, 0. , 0.76424146, ..., 0. ,
0. , 0. ]], dtype=float32)>, 'energy_gts': <tf.Tensor: shape=(16, 92), dtype=float32, numpy=
array([[-1.2123926 , -0.6892038 , 0.8126763 , ..., 0. ,
0. , 0. ],
[-0.354727 , 0.01240734, 1.5514941 , ..., 0. ,
0. , 0. ],
[-1.2133728 , 0.10915235, -0.7754631 , ..., 0. ,
0. , 0. ],
...,
[-1.2078284 , 0.4109656 , 1.9860119 , ..., 0. ,
0. , 0. ],
[-1.0766432 , 0.08155485, 0.65240884, ..., 0. ,
0. , 0. ],
[-0.27268156, -0.24169774, 2.7491963 , ..., 0. ,
0. , 0. ]], dtype=float32)>, 'mel_gts': <tf.Tensor: shape=(16, 633, 80), dtype=float32, numpy=
array([[[-1.2746664 , -1.7843498 , -2.2586834 , ..., -1.2931284 ,
-1.3889788 , -1.3783925 ],
[-1.0166491 , -1.6397812 , -2.181965 , ..., -1.204765 ,
-1.1647713 , -1.1934797 ],
[-0.86630285, -1.4328362 , -1.5542951 , ..., -1.2119982 ,
-1.2231145 , -1.2497594 ],
...,
[ 0. , 0. , 0. , ..., 0. ,
0. , 0. ],
[ 0. , 0. , 0. , ..., 0. ,
0. , 0. ],
[ 0. , 0. , 0. , ..., 0. ,
0. , 0. ]],
[[-2.3355176 , -2.289797 , -2.6863523 , ..., -1.8296297 ,
-2.0163133 , -1.9705946 ],
[-2.2332492 , -2.1360862 , -2.3810003 , ..., -1.7833395 ,
-1.8442103 , -1.9748284 ],
[-0.48657155, -0.52131754, -0.72493863, ..., -1.6943804 ,
-1.8081826 , -1.9736167 ],
...,
[ 0. , 0. , 0. , ..., 0. ,
0. , 0. ],
[ 0. , 0. , 0. , ..., 0. ,
0. , 0. ],
[ 0. , 0. , 0. , ..., 0. ,
0. , 0. ]],
[[-1.8014201 , -2.0903645 , -2.1711755 , ..., -1.5921981 ,
-1.4874189 , -1.3082155 ],
[-1.6379825 , -1.8543022 , -2.1112792 , ..., -1.4420213 ,
-1.4051898 , -1.2607079 ],
[-1.2559214 , -1.7089903 , -1.9510512 , ..., -1.334813 ,
-1.3849427 , -1.3038315 ],
...,
[ 0. , 0. , 0. , ..., 0. ,
0. , 0. ],
[ 0. , 0. , 0. , ..., 0. ,
0. , 0. ],
[ 0. , 0. , 0. , ..., 0. ,
0. , 0. ]],
...,
[[-1.4703877 , -1.5131235 , -1.4239932 , ..., -1.3939542 ,
-1.400334 , -1.6912216 ],
[-1.3440789 , -1.6951889 , -1.3633583 , ..., -1.4433973 ,
-1.4994305 , -1.5762591 ],
[-1.5572135 , -1.6220838 , -1.5402904 , ..., -1.513843 ,
-1.589731 , -1.5659026 ],
...,
[ 0. , 0. , 0. , ..., 0. ,
0. , 0. ],
[ 0. , 0. , 0. , ..., 0. ,
0. , 0. ],
[ 0. , 0. , 0. , ..., 0. ,
0. , 0. ]],
[[-2.2110486 , -2.533403 , -2.4404142 , ..., -2.0414464 ,
-1.8681637 , -1.7730412 ],
[-2.250254 , -2.4695659 , -2.6156392 , ..., -1.9038972 ,
-1.6976963 , -1.8304156 ],
[-1.721149 , -1.5887836 , -1.5241948 , ..., -1.3508788 ,
-1.4653685 , -1.406599 ],
...,
[ 0. , 0. , 0. , ..., 0. ,
0. , 0. ],
[ 0. , 0. , 0. , ..., 0. ,
0. , 0. ],
[ 0. , 0. , 0. , ..., 0. ,
0. , 0. ]],
[[-2.6205416 , -2.9442687 , -2.5861807 , ..., -1.725173 ,
-2.025198 , -1.8375293 ],
[-0.2048404 , -0.3904849 , -0.5025313 , ..., 0.43135187,
0.47107655, -0.25049213],
[ 0.48289043, 0.17719266, 0.39419013, ..., 1.0222603 ,
1.0808944 , 0.29906785],
...,
[ 0. , 0. , 0. , ..., 0. ,
0. , 0. ],
[ 0. , 0. , 0. , ..., 0. ,
0. , 0. ],
[ 0. , 0. , 0. , ..., 0. ,
0. , 0. ]]], dtype=float32)>, 'mel_lengths': <tf.Tensor: shape=(16,), dtype=int32, numpy=
array([477, 163, 223, 597, 141, 203, 633, 345, 283, 227, 489, 555, 283,
261, 553, 439], dtype=int32)>}
2022-01-10 19:05:12.660309: W tensorflow/core/framework/op_kernel.cc:1680] Invalid argument: required broadcastable shapes
0it [00:52, ?it/s]
Traceback (most recent call last):
File "examples/fastspeech2/train_fastspeech2.py", line 424, in <module>
main()
File "examples/fastspeech2/train_fastspeech2.py", line 370, in main
o = fastspeech(**data, training=True)
File "/home/tony/.virtualenvs/tftts/lib/python3.8/site-packages/keras/engine/base_layer.py", line 1037, in __call__
outputs = call_fn(inputs, *args, **kwargs)
File "/home/tony/Documents/TensorFlowTTS/tensorflow_tts/models/fastspeech2.py", line 185, in call
last_encoder_hidden_states += f0_embedding + energy_embedding
File "/home/tony/.virtualenvs/tftts/lib/python3.8/site-packages/tensorflow/python/ops/math_ops.py", line 1367, in binary_op_wrapper
return func(x, y, name=name)
File "/home/tony/.virtualenvs/tftts/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py", line 206, in wrapper
return target(*args, **kwargs)
File "/home/tony/.virtualenvs/tftts/lib/python3.8/site-packages/tensorflow/python/ops/math_ops.py", line 1700, in _add_dispatch
return gen_math_ops.add_v2(x, y, name=name)
File "/home/tony/.virtualenvs/tftts/lib/python3.8/site-packages/tensorflow/python/ops/gen_math_ops.py", line 455, in add_v2
_ops.raise_from_not_ok_status(e, name)
File "/home/tony/.virtualenvs/tftts/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 6941, in raise_from_not_ok_status
six.raise_from(core._status_to_exception(e.code, message), None)
File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: required broadcastable shapes [Op:AddV2]
So how do I solve the problem?
Thanks a lot!
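In the printed batch, input_ids has shape (16, 129), while duration_gts, f0_gts, and energy_gts all have shape (16, 92). The traceback fails where f0_embedding + energy_embedding is added to last_encoder_hidden_states, so a mismatch between the token axis of input_ids and that of the f0/energy features is a plausible cause, though nothing in this thread confirms it. The sketch below compares those axis lengths for one batch. `time_axis_length` and `check_token_axis` are hypothetical helpers, and the toy batch only mimics the shapes from the dump.

```python
def time_axis_length(x):
    """Length of axis 1, read from .shape (tensors/arrays) or from a nested list."""
    shape = getattr(x, "shape", None)
    if shape is not None:
        return int(shape[1])
    return len(x[0])


def check_token_axis(batch, keys=("input_ids", "duration_gts", "f0_gts", "energy_gts")):
    """Return the per-key axis-1 lengths and whether they all agree."""
    lengths = {key: time_axis_length(batch[key]) for key in keys}
    all_equal = len(set(lengths.values())) == 1
    return lengths, all_equal


# Toy batch with the same axis-1 lengths as the dump (assumption: one row is
# enough to illustrate; the real batch has 16 rows).
toy_batch = {
    "input_ids": [[0] * 129],
    "duration_gts": [[0] * 92],
    "f0_gts": [[0.0] * 92],
    "energy_gts": [[0.0] * 92],
}

lengths, all_equal = check_token_axis(toy_batch)
print(lengths, all_equal)
```

If the lengths disagree on real data, a likely place to look is whether the durations, f0, and energy were extracted against the same token sequence that produces input_ids.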
@godspirit00 Hello, did you solve this problem? I get the same error, but I don't know how to solve it.
@Tian14267 Not yet. I'm still waiting for @dathudeptrai's reply.
Have you solved this problem? @godspirit00 @dathudeptrai
I was trying to train FastSpeech 2 on the Nancy Corpus. I extracted the durations with MFA and did the preprocessing as described in the README. But when I started training, I got the following error:
This is exactly the same issue as #672. That issue was labeled as "Bug", and it looks like it was never solved?
So what can I do to get the training started?
Thanks a lot!