Closed by liuhuang31 11 months ago
Which SLM model are you using?
microsoft/wavlm-base-plus
Yes it won't work because it's an English model. See #70
@liuhuang31 Hi, how did you handle this SLM issue? Also, the StyleTTS2 audio you shared in #139 sounds good. Is anything different there compared with the setup here?
@jarred1989 Hi, jarred1989. With the original slmadv code, which uses differentiable duration modeling, I couldn't get a good result; it didn't seem to help in my case. So I stopped using it and went back to the normal duration modeling used before.
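For readers unfamiliar with the distinction, here is a minimal sketch of what "normal" (hard, non-differentiable) duration modeling typically looks like: predicted durations are rounded to whole frames and each token is repeated for that many frames. This is an illustration only, not the StyleTTS2 code; the function name `hard_alignment` and the one-frame minimum are assumptions made for the example.

```python
def hard_alignment(durations):
    """Build a 0/1 token-to-frame alignment from predicted durations.

    durations: list of non-negative floats, one per input token (in frames).
    Returns a list of rows (one per token), each of length total_frames.
    Hypothetical helper for illustration; not part of any library.
    """
    # Round to whole frames; keep at least one frame per token (an assumption).
    frames = [max(1, int(round(d))) for d in durations]
    total = sum(frames)

    alignment = []
    start = 0
    for n in frames:
        row = [0] * total
        # Mark the contiguous block of frames assigned to this token.
        for t in range(start, start + n):
            row[t] = 1
        alignment.append(row)
        start += n
    return alignment


if __name__ == "__main__":
    for row in hard_alignment([1.2, 2.6, 0.1]):
        print(row)
```

Because the rounding step has zero gradient almost everywhere, a loss computed on the synthesized audio cannot update the duration predictor through this path. Differentiable duration modeling replaces the hard 0/1 blocks with a soft alignment so that gradients can flow, at the cost of a harder optimization problem.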
got it! thx
I train on Chinese data with the PL-BERT module removed. Training is normal until the stage 2 joint training, which trains slmadv with differentiable duration modeling. At that point the model collapses and the synthesized audio has pronunciation problems.