How to make use of GRL for speaker

gtonkov commented 1 year ago

Hi,

How can I use the SpeakerClassifier in vits/modules_grl.py?

It was added with this commit, but I cannot see it being used anywhere during training.

Thanks!

MaxMax2016 commented 1 year ago

The SpeakerClassifier in https://github.com/PlayVoice/so-vits-svc-5.0/blob/main/vits/modules_grl.py will be used to fine-tune the model after the base model has been trained. I am still training the base model right now.

This code will be added for fine-tuning:

    x = self.enc(x * x_mask, x_mask)

    # Speaker classifier
    speaker_posteriors = self.speaker_classifier(x)

    # Loss
    self.nll_loss = nn.NLLLoss()
    adv_loss = self.nll_loss(speaker_posteriors, speaker_targets[index])

However, the speaker vector will be used for the loss.

More information on GRL can be found here: https://github.com/ubisoft/ubisoft-laforge-daft-exprt/blob/master/src/daft_exprt/model.py
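
For readers who have not seen a gradient reversal layer (GRL) before, here is a minimal PyTorch sketch of the general idea. It is not the code from vits/modules_grl.py. `GradientReversal`, `ToySpeakerClassifier`, the tensor shapes and the speaker ids are all made up for illustration. It uses integer speaker ids as NLL targets, as in the snippet above, whereas the planned fine-tuning loss is said to use the speaker vector instead.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GradientReversal(torch.autograd.Function):
    """Identity in the forward pass; scales the gradient by -lambda_ in the backward pass."""

    @staticmethod
    def forward(ctx, x, lambda_):
        ctx.lambda_ = lambda_
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # One return value per forward input; lambda_ is a plain float, so it gets None.
        return -ctx.lambda_ * grad_output, None


class ToySpeakerClassifier(nn.Module):
    """Hypothetical stand-in for SpeakerClassifier (not the repository's class)."""

    def __init__(self, in_channels, n_speakers, lambda_=1.0):
        super().__init__()
        self.lambda_ = lambda_
        self.proj = nn.Linear(in_channels, n_speakers)

    def forward(self, x):
        # x: [batch, channels, time]; average over time, then reverse gradients.
        pooled = x.mean(dim=2)
        reversed_feats = GradientReversal.apply(pooled, self.lambda_)
        # nn.NLLLoss expects log-probabilities, hence log_softmax.
        return F.log_softmax(self.proj(reversed_feats), dim=1)


torch.manual_seed(0)
x = torch.randn(4, 8, 10, requires_grad=True)  # stand-in for the encoder output
speaker_targets = torch.tensor([0, 1, 2, 1])   # made-up speaker ids

classifier = ToySpeakerClassifier(in_channels=8, n_speakers=3)
speaker_posteriors = classifier(x)
adv_loss = nn.NLLLoss()(speaker_posteriors, speaker_targets)
adv_loss.backward()  # x.grad now holds the sign-flipped classifier gradient
```

The classifier itself is trained normally to recognise the speaker, while everything upstream of the GRL receives the negated gradient and is therefore pushed to remove speaker information from `x`.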

MaxMax2016 commented 1 year ago

If you want to try GRL now, note that the PosteriorEncoder should be frozen at the fine-tuning stage.
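
One common way to freeze a submodule in PyTorch is sketched below. This is only an assumption about how it could be done, not the repository's training code. The `freeze` helper, the toy model and the key `enc_q` (standing in for a posterior encoder) are hypothetical.

```python
import torch
import torch.nn as nn


def freeze(module: nn.Module) -> None:
    """Stop gradient updates for every parameter of `module`."""
    for p in module.parameters():
        p.requires_grad_(False)
    module.eval()  # also fixes dropout / normalization behaviour, if any


# Hypothetical toy model; `enc_q` merely stands in for a posterior encoder.
model = nn.ModuleDict({
    "enc_q": nn.Conv1d(8, 8, kernel_size=1),
    "speaker_classifier": nn.Linear(8, 3),
})
freeze(model["enc_q"])

# Hand only the still-trainable parameters to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-4)
```

Calling `model.train()` later switches the frozen submodule back to training mode (its parameters stay frozen), so `eval()` may need to be reapplied to it after that call.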

gtonkov commented 1 year ago

great, thanks :)

PlayVoice / whisper-vits-svc

How to make use of GRL for speaker #19