TensorSpeech / TensorFlowTTS

:stuck_out_tongue_closed_eyes: TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supported including English, French, Korean, Chinese, German and Easy to adapt for other languages)
https://tensorspeech.github.io/TensorFlowTTS/
Apache License 2.0

Shape error in training of fastspeech2 on new dataset #720

Closed MostafaAlaviyan closed 2 years ago

MostafaAlaviyan commented 2 years ago

Hi, thanks for your valuable repo! I want to train fastspeech2 on a Persian dataset, with durations extracted using MFA. After preprocessing, when I started training, the error below occurred:

Traceback (most recent call last):
  File "examples/fastspeech2/train_fastspeech2.py", line 417, in <module>
    main()
  File "examples/fastspeech2/train_fastspeech2.py", line 409, in main
    resume=args.resume,
  File "/root/anaconda3/envs/tts/lib/python3.6/site-packages/tensorflow_tts/trainers/base_trainer.py", line 999, in fit
    self.run()
  File "/root/anaconda3/envs/tts/lib/python3.6/site-packages/tensorflow_tts/trainers/base_trainer.py", line 103, in run
    self._train_epoch()
  File "/root/anaconda3/envs/tts/lib/python3.6/site-packages/tensorflow_tts/trainers/base_trainer.py", line 125, in _train_epoch
    self._train_step(batch)
  File "/root/anaconda3/envs/tts/lib/python3.6/site-packages/tensorflow_tts/trainers/base_trainer.py", line 777, in _train_step
    self.one_step_forward(batch)
  File "/root/anaconda3/envs/tts/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 780, in __call__
    result = self._call(*args, **kwds)
  File "/root/anaconda3/envs/tts/lib/python3.6/site-packages/tensorflow/python/eager/def_function.py", line 840, in _call
    return self._stateless_fn(*args, **kwds)
  File "/root/anaconda3/envs/tts/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 2829, in __call__
    return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
  File "/root/anaconda3/envs/tts/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 1848, in _filtered_call
    cancellation_manager=cancellation_manager)
  File "/root/anaconda3/envs/tts/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 1924, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "/root/anaconda3/envs/tts/lib/python3.6/site-packages/tensorflow/python/eager/function.py", line 550, in call
    ctx=ctx)
  File "/root/anaconda3/envs/tts/lib/python3.6/site-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError:  Incompatible shapes: [16,153,384] vs. [16,104,384]
     [[node gradients/tf_fast_speech2/add_1_grad/BroadcastGradientArgs (defined at /root/anaconda3/envs/tts/lib/python3.6/site-packages/tensorflow_tts/trainers/base_trainer.py:810) ]] [Op:__inference__one_step_forward_33274]

Function call stack:
_one_step_forward

[train]:   0%|            

How can I solve this error?

MostafaAlaviyan commented 2 years ago

I checked the preprocessed items for one utterance and found the following:

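import numpy as np

# Compare the lengths of the preprocessed features for one utterance.
ids = np.load("./ids/Excel1-sheet7-08-ids.npy")
print(len(ids))  # number of input ids
d = np.sum(np.load("./durations/Excel1-sheet7-08-durations.npy"))
print(d)  # total number of frames implied by the durations
energy = np.load("./raw-energies/Excel1-sheet7-08-raw-energy.npy")
print(len(energy))  # number of energy frames
f0 = np.load("./Excel1-sheet7-08-raw-f0.npy")
print(len(f0))  # number of f0 frames
norm = np.load("./Excel1-sheet7-08-norm-feats.npy")
print(len(norm))  # number of mel frames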

The results, in the order printed, are as follows:

132
799
799
799
799
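
The check above compares the summed durations with the frame counts, but it does not compare the number of duration entries with the number of ids. A minimal sketch of a fuller per-utterance check follows. The function name `check_alignment` is hypothetical, it assumes f0 and energy are stored per frame as in the files loaded above, and whether a mismatch of this kind is what triggers the reported error is an assumption, not something confirmed in this thread.

```python
def check_alignment(ids, durations, f0, energy, mel):
    """Return a list of problems found for one utterance (empty if none)."""
    problems = []
    # One duration entry is expected per input id.
    if len(durations) != len(ids):
        problems.append(f"{len(ids)} ids but {len(durations)} durations")
    # The durations should sum to the number of frames in each feature.
    total_frames = int(sum(durations))
    for name, feature in (("f0", f0), ("energy", energy), ("mel", mel)):
        if len(feature) != total_frames:
            problems.append(
                f"{name} has {len(feature)} frames, durations sum to {total_frames}"
            )
    return problems


# Toy data: three ids but only two duration entries, five frames in total.
print(check_alignment([1, 2, 3], [2, 3], [0.0] * 5, [0.0] * 5, [0.0] * 5))
```

The same function accepts the arrays returned by `np.load`, so it can be run over every utterance in the dataset to list the files that are inconsistent.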

dathudeptrai commented 2 years ago

@MostafaAlaviyan you need to feed every sample through the FastSpeech2 model to detect which samples cause this problem, using the code below:

for d in dataset:
    outputs = fastspeech2(**d, training=True)
    print("PRINT WHAT YOU WANT HERE IF THERE IS A BUG")

Note that you should set batch_size to 1.
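
One way to turn the loop above into a report of failing samples is to catch the exception for each sample instead of stopping at the first one. This is only a sketch: `find_bad_samples` and `forward` are hypothetical names, and `forward` stands in for whatever call runs the model on a single sample.

```python
def find_bad_samples(samples, forward):
    """Run forward() on each sample and collect (index, error message) for failures."""
    bad = []
    for index, sample in enumerate(samples):
        try:
            forward(sample)
        except Exception as err:  # broad on purpose: any failure marks the sample
            bad.append((index, str(err)))
    return bad


# Toy stand-in for the model: fails when two lengths in the sample disagree.
def toy_forward(sample):
    if sample["n_ids"] != sample["n_durations"]:
        raise ValueError("incompatible shapes")


toy_samples = [
    {"n_ids": 3, "n_durations": 3},
    {"n_ids": 5, "n_durations": 4},
]
print(find_bad_samples(toy_samples, toy_forward))
```

With the loop from this thread, the call would look like `find_bad_samples(dataset, lambda d: fastspeech2(**d, training=True))`, with the dataset built using a batch size of 1 so that each reported index maps to a single utterance.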