Open rishikksh20 opened 4 months ago
Hi @rishikksh20, I am still facing the problem and working on a fix. Once it is fixed, I will upload the samples.
Hi, what kind of problem are you facing? I might be able to help you.
Thank you very much! I am finding it difficult to generate comprehensible speech from the transducer model. I have confirmed that the T2S model works normally, so I suspect the transducer model has not converged well. Whether I use k2.rnnt_loss_simple or k2.rnnt_loss_smoothed, the model converges slowly, and even after 3 or 4 days of training the average loss per sample is still high (around 100). I'm not sure whether this is normal. I haven't trained a transducer before, so it would be great if you could give some advice on training transducer models!
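A per-sample figure like "around 100" is hard to judge on its own, because a summed transducer loss grows with sequence length. One way to make it comparable across runs is to divide the summed loss by the total number of target tokens. The sketch below is only an illustration of that normalization: the helper name `normalized_loss` and the example numbers are hypothetical, and it assumes you already have one loss value and one target length per sample. It does not call k2 itself.

```python
def normalized_loss(sample_losses, target_lengths):
    """Return the summed loss divided by the total number of target tokens.

    sample_losses:  one summed loss value per sample (hypothetical input)
    target_lengths: number of target tokens for each of those samples
    """
    if len(sample_losses) != len(target_lengths):
        raise ValueError("need one target length per sample loss")
    total_tokens = sum(target_lengths)
    if total_tokens == 0:
        raise ValueError("total number of target tokens must be positive")
    return sum(sample_losses) / total_tokens


# Made-up numbers: three samples with a loss of roughly 100 each
losses = [96.0, 104.0, 100.0]
lengths = [48, 52, 50]
print(normalized_loss(losses, lengths))  # 300.0 / 150 -> 2.0
```

Tracking this per-token value over training steps may show more clearly whether the model is still improving than the raw per-sample sum does.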
Hi @scutcsq, I saw your repo and have been tracking your training. Have you been able to generate good-quality speech? Please share some samples or a pretrained model if possible. Thanks for the code.