aijianiula0601 closed this issue 2 years ago
Thank you for your interest in our work!
We're sorry, but we are not planning to upload trained models. The focus of our paper is not on achieving SOTA quality but on comparing our method, SQ-VAE, with the conventional VQ-VAE. For that comparison, we used a simple architecture rather than a sophisticated one in our experiments. In any case, the fidelity of the speech generated by the implemented model is not satisfactory.
Thank you for your great work!
After I finished training your speech model, it could not generate speech. I trained your speech model to produce mel spectrograms and trained HiFi-GAN as the vocoder to convert mel to wav. HiFi-GAN generates audio without problems. Could you please provide the trained speech model? Thanks!