Open · max-padgett opened this issue 3 years ago
Thanks for your interest. How many steps do you want to train the model for? In our experiments it took about two weeks to train the model to 2.5M steps on two V100s. I would suggest two ideas. 1) The model can reach sufficient quality well before the target number of training steps, so it is a good idea to check the output quality repeatedly along the way. 2) We provide discriminator weights for the universal model; it would be a good idea to do transfer learning from those weights.
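The second suggestion, initializing the discriminator from the released weights instead of from scratch, can be sketched as follows. This is a minimal illustration, not the HiFi-GAN repo's actual code: `TinyDiscriminator` is a hypothetical stand-in for the real multi-period/multi-scale discriminators, and the checkpoint path is illustrative.

```python
import tempfile, os
import torch
import torch.nn as nn

# Hypothetical minimal discriminator standing in for HiFi-GAN's
# actual discriminator modules; the structure is illustrative only.
class TinyDiscriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv1d(1, 8, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(x)

# Stand-in for the released universal-model discriminator checkpoint
# (in practice you would download it and use its real path).
pretrained = TinyDiscriminator()
ckpt_path = os.path.join(tempfile.gettempdir(), "disc_pretrained.pt")
torch.save(pretrained.state_dict(), ckpt_path)

# Transfer learning: load the pretrained weights into a fresh
# discriminator, then continue training on the new data from there.
model = TinyDiscriminator()
model.load_state_dict(torch.load(ckpt_path))
```

The real checkpoint bundles more state than a bare `state_dict` (e.g. multiple discriminators and optimizer states), so in practice you would index into the loaded dict before calling `load_state_dict`.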
Thank you for the quick reply. I'm working on a project that modifies HiFi-GAN. Unfortunately I'm strapped for time and didn't take the training time into account, so I have two questions.
It may vary depending on how different the original training data is from your transfer-learning data, so it would be best to check experimentally. For reference, in our experiments a model fine-tuned for 100k steps synthesized good-quality audio. Since that was fine-tuning on the same speaker, your experimental conditions may give different results.
I'm trying to just run train.py on Google Colab Pro (a single V100 16GB GPU). Based on the time per epoch, training should take roughly 11 days, which isn't feasible on Colab. Does that sound about right for this model, or is there a major bottleneck somewhere? Sorry for the basic question; I've been trying to speed it up and just want a point of comparison.