Open SimonDemarty opened 6 months ago
I've fixed this in the following commit: https://github.com/martinambrus/StyleTTS2/commit/bf7ea7172d83db9c0e5b414fdf5e304aa3f4b848
Feel free to update those 4 files if you need this fix as well. The repo owner is currently inactive and pull requests don't seem to be reviewed anymore, so I didn't bother creating one.
Hello there,
First of all, thank you for the great model! I noticed something strange while finetuning: resuming a finetuning run actually restarts one epoch before the one specified.
Replicate
I finetuned the model using:
accelerate launch --mixed_precision=fp16 --num_process=1 train_finetune_accelerate.py --config_path <path/to/config.yml>
So far, everything went well until the finetuning crashed (this was to be expected with the parameters I chose). The last epoch logged before the crash was [11/100]:
At this point, since it was working on the 11th epoch, the last completed one was the 10th, saved as `epoch_2nd_00009.pth`.
Then I went to modify the config.yml and set the following parameters:
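The exact values aren't shown above, but a resume configuration along these lines is what I mean (the keys below follow the StyleTTS2 finetuning config; the path and values are illustrative, not my actual ones):

```yaml
# Illustrative sketch only -- adjust paths/values to your setup.
pretrained_model: "Models/LJSpeech/epoch_2nd_00009.pth"  # last completed checkpoint
load_only_params: false  # also restore optimizer/epoch state so training resumes
epochs: 100
```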
Back in the terminal, I reran the command:
accelerate launch --mixed_precision=fp16 --num_process=1 train_finetune_accelerate.py --config_path <path/to/config.yml>
What is now displayed is:
I waited a bit and saw that no new epoch checkpoint appeared; I am assuming it overwrote `epoch_2nd_00009.pth`.

Conclusion
This means that resuming finetuning probably loads the right checkpoint (the one referenced in the config.yml) but resumes under the wrong epoch number (i.e. 10 instead of 11). It might therefore also use the wrong stage parameters (e.g. with `diff_epoch = 10`, the diffusion settings would apply during epoch [11/100], if I understand correctly).
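The suspected off-by-one can be sketched in a few lines (hypothetical function names; the real logic lives in `train_finetune_accelerate.py` and the checkpoint-loading helper, which I have not reproduced here):

```python
# Minimal sketch of the suspected off-by-one when resuming finetuning.
# A checkpoint named epoch_2nd_00009.pth holds the state AFTER the
# 10th epoch (0-indexed 9) finished.

def resume_epoch_buggy(checkpoint_epoch: int) -> int:
    # Suspected current behavior: restart AT the stored epoch index,
    # re-running an epoch that already completed.
    return checkpoint_epoch

def resume_epoch_fixed(checkpoint_epoch: int) -> int:
    # Expected behavior: continue with the NEXT epoch.
    return checkpoint_epoch + 1

saved = 9  # parsed from epoch_2nd_00009.pth
print(resume_epoch_buggy(saved))  # 9  -> redoes epoch [10/100]
print(resume_epoch_fixed(saved))  # 10 -> continues with epoch [11/100]
```

The same shift would matter for stage thresholds such as `diff_epoch`: comparing them against the wrong epoch counter means the wrong settings are active for one full epoch after every resume.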