Labmem-Zhouyx / CDFSE_FastSpeech2

The Official Implementation of “Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synthesis”
MIT License
81 stars 12 forks

After training, PhnCls Loss and SpkCls Loss do not decrease, and the timbre at inference does not match the target speaker #3

Open Summerxu86 opened 2 years ago

Summerxu86 commented 2 years ago

After training, PhnCls Loss and SpkCls Loss are both still very high and have barely decreased. At inference, I provide the target speaker's voice as the reference, but the synthesized timbre does not match it. I have made sure that the weight is 1 in config/model.yaml and that use_spkcls: True is set. Is the classifier not working? What should I do about it? I am very distressed and urgently need your help.
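One way to put a number on "very high" is to compare the logged classifier losses with the loss of a uniform guess. The sketch below assumes that PhnCls Loss and SpkCls Loss are plain softmax cross-entropy values in nats, which this thread does not confirm; the function names and the speaker count are hypothetical and are not taken from the repository.

```python
import math


def chance_level_cross_entropy(num_classes: int) -> float:
    """Cross-entropy (in nats) of a uniform prediction over num_classes classes."""
    if num_classes < 2:
        raise ValueError("need at least two classes")
    return math.log(num_classes)


def is_near_chance(loss: float, num_classes: int, tolerance: float = 0.1) -> bool:
    """Return True if loss is at or above (1 - tolerance) times the chance-level value.

    Assumption: `loss` is an unweighted mean cross-entropy, comparable to ln(num_classes).
    """
    baseline = chance_level_cross_entropy(num_classes)
    return loss >= baseline * (1.0 - tolerance)


if __name__ == "__main__":
    num_speakers = 100   # hypothetical speaker count, not from the issue
    logged_loss = 4.5    # hypothetical value read off a training log
    print(f"chance level: {chance_level_cross_entropy(num_speakers):.3f}")
    print(f"near chance:  {is_near_chance(logged_loss, num_speakers)}")
```

If the logged values stay close to the chance level for the whole run, the classifiers are probably not learning anything useful from the embeddings. If they are well below it but flat, the issue may lie elsewhere.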

liubc-ai commented 10 months ago

I ran into the same problem when reproducing this. Is there a solution yet?