GANtastic3 / MaskCycleGAN-VC

Implementation of Kaneko et al.'s MaskCycleGAN-VC model for non-parallel voice conversion.
MIT License
111 stars 31 forks

decay_after and identity_loss stopping parameters are different #5

terbed closed 3 years ago

terbed commented 3 years ago

The original article says:

We used L_id only for the first 1e4 iterations to guide the learning direction. 
We kept the same learning rate for the first 2e5 iterations and linearly decay over the next 2e5 iterations.

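But in the code, decay_after = 1e4, and the same parameter also controls when the identity loss is removed. The learning rate decay and the identity loss cutoff are therefore tied to one value, whereas the paper uses 2e5 and 1e4 respectively.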

I corrected this by adding a new argument, stop_identity_after.
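A minimal sketch of the decoupling described above, assuming two independent settings. Only the names decay_after and stop_identity_after come from this thread. The helper functions, the lambda_identity and base_lr defaults, and decay_iters are hypothetical and are not taken from the repository's code.

```python
def identity_weight(iteration, lambda_identity=5.0, stop_identity_after=10_000):
    """Weight applied to L_id at a given iteration.

    The identity loss is active only for the first `stop_identity_after`
    iterations (1e4 in the paper quote). `lambda_identity` is an assumed default.
    """
    return lambda_identity if iteration < stop_identity_after else 0.0


def learning_rate(iteration, base_lr=2e-4, decay_after=200_000, decay_iters=200_000):
    """Learning rate at a given iteration.

    Constant for the first `decay_after` iterations (2e5 in the paper quote),
    then linearly decayed to zero over the next `decay_iters` iterations.
    `base_lr` is an assumed default.
    """
    if iteration < decay_after:
        return base_lr
    remaining = 1.0 - (iteration - decay_after) / decay_iters
    return base_lr * max(remaining, 0.0)


if __name__ == "__main__":
    # Print both schedules at a few checkpoints to show they are independent.
    for it in (0, 10_000, 200_000, 300_000, 400_000):
        print(it, identity_weight(it), learning_rate(it))
```

With separate parameters, the identity loss can be switched off at 1e4 iterations while the learning rate stays constant until 2e5.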

There are also minor updates with no functional effect, plus an experimental change that adds spectral normalization to the discriminator model, which should be discarded.

hikaruhotta commented 3 years ago

Hi @terbed, thank you for making a PR! Please give us a couple of days to run training on your branch and review it. You raised an issue regarding the quality of the outputs. Did adjusting the decay_after and identity loss parameters make a difference in output quality for the same pair of speakers?

terbed commented 3 years ago

I did a comparative run over the weekend, but unfortunately the modifications did not result in much improvement in that case. The changes might have only minor effects.