psalajka closed this issue 2 years ago
Update: The debugging tool

```python
from tqdm import tqdm

while True:
    for data in tqdm(train_dataset):
        # debug 1 step forward
        try:
            outputs = fastspeech2(**data)
        except Exception:
            print(data["utt_ids"])
    for data in tqdm(valid_dataset):
        # debug 1 step forward
        try:
            outputs = fastspeech2(**data)
        except Exception:
            print(data["utt_ids"])
```
generated some results after several (many) iterations. I have already identified 3 samples that appeared at least twice in the output, and removing them improved training stability: I managed to run more than 6000 steps.
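A sketch of how the offending samples could be tallied from the printed `utt_ids` (the `failed_ids` list and the threshold of 2 are assumptions for illustration, not part of the original script):

```python
from collections import Counter

# failed_ids would be accumulated by the debug loop, e.g. by appending
# data["utt_ids"] inside the except branch; these values are illustrative.
failed_ids = ["LJ001-0001", "LJ002-0042", "LJ001-0001"]

counts = Counter(failed_ids)
# Keep only samples that failed at least twice before removing them
# from the dataset.
suspects = [utt for utt, n in counts.items() if n >= 2]
print(suspects)  # prints ['LJ001-0001']
```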
Unfortunately, I still don't understand what's happening. I'm now trying to fix the random seed to achieve reproducibility.
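A minimal sketch of seeding the RNGs a TensorFlow training run typically touches (the helper name and seed value are arbitrary; the TensorFlow import is guarded so the snippet also runs without TF installed):

```python
import os
import random

import numpy as np


def set_seed(seed: int = 1234) -> None:
    """Seed every RNG the training loop may touch."""
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)      # Python's own RNG
    np.random.seed(seed)   # NumPy, used by data preprocessing
    try:
        import tensorflow as tf
        tf.random.set_seed(seed)  # TF op-level seed
    except ImportError:
        pass  # TensorFlow not available in this environment
```

Note that full reproducibility on GPU may additionally require deterministic op settings; seeding alone only fixes the sampling order.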
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
Hello, I'm trying to train a phoneme-based FS2 model. Our dataset has the same structure (and settings) as the LJ Speech Dataset, therefore I use its configs. We're generating our own durations, so I triple-checked everything.
This issue is probably closely related to https://github.com/TensorSpeech/TensorFlowTTS/issues/518 and https://github.com/TensorSpeech/TensorFlowTTS/issues/512, so I tried everything mentioned there.
Checking the data with the code above didn't print any error (neither for `validation_dataset`). I put that code at `examples/fastspeech2/train_fastspeech2.py:400` (before `# start training`).

My main problem is that training runs for many steps before it dies (3627 steps last time), which means several epochs complete. I attached details of the error in error.txt. Please note that some line numbers are slightly shifted (due to my debug prints), e.g. line 188 there corresponds to line 185. The difference is always very small, a couple of lines as above.
I already found the failing line: the `last_encoder_hidden_states` shape is different from the `f0_em...` and `energy_em...` shapes (which are equal to each other). Please, do you have any idea what could be wrong? I'm running out of ideas... Thanks