Closed: MuruganR96 closed this issue 11 months ago
What works in this project is PPG perturbation. The GRL model here is too small to be useful.
Thank you @MaxMax2016
One follow-up question:
Why did you use a MIX encoder with HuBERT SSL + Whisper PPG? Is Whisper alone not good enough?
What are your suggestions on using Whisper PPG for cross-lingual voice conversion?
Whisper PPG alone is not so good for cross-lingual voice conversion, so a MIX encoder with HuBERT SSL + Whisper PPG is used.
@MaxMax2016 I want to introduce GRL speaker classification loss in training.
Should I introduce it from the initial training phase, or after some interval (like 100k)?
What is your suggestion? Please guide me, @MaxMax2016 :)
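For readers following the timing question above: one way to frame "from the start or after some interval" is as a schedule on the gradient reversal weight, so that both options become settings of the same function. The sketch below is not taken from the So-VITS-SVC code. The function name `grl_weight` and all of its parameters and defaults are hypothetical, and the smooth ramp is the one commonly used in domain-adversarial training (DANN), swapped in here only for illustration.

```python
import math


def grl_weight(step, start_step=100_000, ramp_steps=100_000,
               gamma=10.0, max_weight=1.0):
    """Hypothetical schedule for the gradient reversal weight.

    Returns 0.0 before `start_step` (adversarial speaker loss has no
    effect on the encoder), then ramps smoothly toward `max_weight`
    over `ramp_steps` using the DANN-style curve 2 / (1 + exp(-gamma * p)) - 1.
    All default values are illustrative assumptions, not tuned settings.
    """
    if step < start_step:
        return 0.0
    # Progress through the ramp, clipped to [0, 1].
    p = min(1.0, (step - start_step) / ramp_steps)
    return max_weight * (2.0 / (1.0 + math.exp(-gamma * p)) - 1.0)


# Example: inspect the weight at a few training steps.
for s in (0, 100_000, 150_000, 200_000):
    print(s, round(grl_weight(s), 4))
```

Setting `start_step=0` corresponds to introducing the GRL loss from the initial phase, while `start_step=100_000` corresponds to the delayed option. In a training loop, the returned value would scale the reversed gradient flowing from the speaker classifier back into the content encoder.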
Hi @MaxMax2016, thank you so much for this wonderful project.
I need your guidance on improving speaker similarity on So-VITS-SVC 4.1 stable branch.
I saw that you introduced many approaches to reduce timbre in order to improve speaker similarity.
Which one works best for improving similarity?
Please help me, @MaxMax2016.