PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech
The current implementation is not trained in a semi-supervised way due to the small dataset size. But it can be easily activated by specifying target speakers and passing no emotion ID with no emotion classifier loss.
This is the note you wrote in the README.
Could you explain what "specifying target speakers and passing no emotion ID with no emotion classifier loss" means in practice, and how to do it in your code?
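My rough guess of what that note means is something like the sketch below: target-speaker utterances get no emotion label, and the emotion-classifier cross-entropy simply skips them. The names here (`NUM_EMOTIONS`, `UNLABELED`, `emotion_classifier_loss`) are my own assumptions for illustration, not identifiers from your code.

```python
import torch
import torch.nn.functional as F

NUM_EMOTIONS = 5
UNLABELED = -100  # placeholder label for utterances with no emotion ID


def emotion_classifier_loss(logits, emotion_ids):
    """Cross-entropy over emotion classes, skipping unlabeled samples.

    logits:      (batch, NUM_EMOTIONS) emotion classifier outputs
    emotion_ids: (batch,) emotion label per utterance, or UNLABELED for
                 target speakers that only have neutral recordings
    """
    # ignore_index drops the unlabeled samples from the loss, so target
    # speakers contribute no emotion classifier loss at all.
    return F.cross_entropy(logits, emotion_ids, ignore_index=UNLABELED)


if __name__ == "__main__":
    logits = torch.randn(4, NUM_EMOTIONS, requires_grad=True)
    # First two utterances come from labeled source speakers; the last two
    # come from target speakers and carry no emotion ID.
    emotion_ids = torch.tensor([2, 0, UNLABELED, UNLABELED])
    loss = emotion_classifier_loss(logits, emotion_ids)
    loss.backward()
    print("emotion classifier loss (labeled samples only):", loss.item())
```

Is this roughly the idea, or does "specifying target speakers" require changes elsewhere (e.g., in the dataset filelists or the training config)?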