jik876 / hifi-gan

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
MIT License

Training times v1 vs. v2 vs. v3? #21

Closed serg06 closed 3 years ago

serg06 commented 3 years ago

Hello, I see it took you this amount of time to train v1:

"It took about 13-14 days to train the model up to 2,500k steps with two V100 GPUs."

Since v2 and v3 have faster inference, does that mean training them would be faster too?

jik876 commented 3 years ago

Thanks for your interest! Yes, V2 and V3 train faster than V1, but the difference is much smaller than the difference in inference speed, because the discriminator takes up a large share of the training time. Training V3 takes approximately 13-14% less time than training V1.
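
As a rough illustration of what those figures imply (an estimate, not a measurement reported in this thread), the sketch below applies the 13-14% reduction to the 13-14 day V1 figure quoted above. It assumes the same hardware (two V100 GPUs) and the same 2,500k-step budget for V3, which the reply does not state; the function name `estimate_days` is made up for this example.

```python
def estimate_days(baseline_days: float, reduction: float) -> float:
    """Scale a baseline training time by a fractional reduction."""
    return baseline_days * (1.0 - reduction)


# Figures quoted in the thread: V1 took 13-14 days; V3 is ~13-14% faster to train.
v1_days = (13.0, 14.0)
v3_reduction = (0.13, 0.14)

# Assumption: same GPUs and same step count for V3 as for V1.
low = estimate_days(v1_days[0], v3_reduction[1])   # shortest V1 run, largest saving
high = estimate_days(v1_days[1], v3_reduction[0])  # longest V1 run, smallest saving

print(f"Estimated V3 training time: {low:.1f} to {high:.1f} days")
```

Under those assumptions the saving is on the order of two days, which is small compared with the inference-speed gap between the variants.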

serg06 commented 3 years ago

Awesome, thank you @jik876!