Why doesn't the duration adversarial loss update?

p0p4k / vits2_pytorch

unofficial vits2-TTS implementation in pytorch

https://arxiv.org/abs/2307.16430

MIT License

471 stars, 84 forks

Why doesn't the duration adversarial loss update? #3

Closed by jiahong3837 1 year ago

jiahong3837 commented 1 year ago

Why doesn't the duration adversarial loss update? The discriminator is already complete.

p0p4k commented 1 year ago

I am still deciding whether an LSTM discriminator is right or whether a transformer-based discriminator is the way to go. It shouldn't matter much, but I am waiting on some feedback. If you think this LSTM discriminator is OK, please add it to models.py, change the relevant lines in train.py, and send a PR. It would be really helpful!

elch10 commented 1 year ago

In the original paper, the adversarial loss is enabled only after the model has been trained for 800k steps. This adversarial loss then updates the duration predictor for 30k steps.

p0p4k commented 1 year ago

@elch10 I think the duration predictor is trained for the first 30k steps and then frozen. The overall waveform generation model is trained continuously for 800k steps.

elch10 commented 1 year ago

I found this explanation in the paper: "Our proposed duration predictor and training mechanism allow for a learning duration in short steps, and the duration predictor is separately trained as the last training step, which reduces the overall computation time for training."

p0p4k commented 1 year ago

You are correct. Section 3 says: "The networks to generate waveforms and the duration predictor were trained up to 800k and 30k steps, respectively." So they train the adversarial duration predictor in the final 30k steps. That means we can train this model with the original stochastic duration predictor right away.
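The schedule described in this comment can be sketched as a simple step gate. This is a minimal illustration of the reading reached in the thread (adversarial duration training only in the final 30k of 800k steps), not code from the repository. The function names `duration_adv_enabled` and `generator_loss` are hypothetical, and steps are assumed to be counted from 0.

```python
def duration_adv_enabled(step, total_steps=800_000, adv_steps=30_000):
    """Return True when `step` falls within the final `adv_steps` of training.

    The defaults are the step counts quoted from the paper; the helper itself
    is a hypothetical illustration, not part of vits2_pytorch.
    """
    if not 0 <= adv_steps <= total_steps:
        raise ValueError("adv_steps must be between 0 and total_steps")
    return total_steps - adv_steps <= step < total_steps


def generator_loss(base_loss, dur_adv_loss, step, **schedule):
    """Add the duration adversarial term only while the gate is open."""
    if duration_adv_enabled(step, **schedule):
        return base_loss + dur_adv_loss
    return base_loss
```

With the defaults, the gate opens at step 770,000 and stays open through step 799,999. Before that, the adversarial term is simply left out of the generator loss, so the rest of the model trains as usual.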

p0p4k commented 1 year ago

I have added the duration predictor so that it trains all the way through training. I will fix it to follow the paper's style soon.

JohnHerry commented 9 months ago

Quoting p0p4k: "I have added the duration predictor so that it trains all the way through training. I will fix it to follow the paper's style soon."

In the last 30k steps, do we train only the duration predictor with the DD (duration discriminator), or do we train the whole model with the DD?
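The two options in this question differ only in which parameters receive updates during the final phase. The sketch below makes that distinction concrete without saying which option the paper intends, since the thread does not settle it. The function name, the mode strings, and the `dp.` prefix for duration predictor parameters are all assumptions made for illustration.

```python
def trainable_param_names(param_names, mode, dp_prefix="dp."):
    """Select which parameters would be updated in the final phase.

    mode="duration_only": only names starting with `dp_prefix` (assumed to
    mark the duration predictor) stay trainable.
    mode="whole_model": every parameter stays trainable.
    """
    if mode == "duration_only":
        return [name for name in param_names if name.startswith(dp_prefix)]
    if mode == "whole_model":
        return list(param_names)
    raise ValueError(f"unknown mode: {mode!r}")


# Hypothetical parameter names, for illustration only.
names = ["enc_p.emb.weight", "dp.proj.weight", "dec.conv_pre.weight"]
duration_only = trainable_param_names(names, "duration_only")
whole_model = trainable_param_names(names, "whole_model")
```

In a PyTorch model, the same selection could be applied by iterating over `model.named_parameters()` and setting `requires_grad` to `False` on every parameter that is not selected.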

jiahong3837 commented 9 months ago

Hello, this is Qiu Jiahong. I have received your email, thank you.