sony / sqvae

PyTorch implementation of stochastically quantized variational autoencoder (SQ-VAE)
Apache License 2.0

Unable to generate speech #3

Closed by aijianiula0601 2 years ago

aijianiula0601 commented 2 years ago

Thank you for your great work!

I cannot generate speech after training your speech model. I trained your speech model to produce mel spectrograms and trained HiFi-GAN as a vocoder to convert mel to wav. The HiFi-GAN vocoder generates audio without problems. Could you please provide the trained speech model? Thanks!

TakashiShibuyaSony commented 2 years ago

Thank you for having an interest in our work!

We're sorry, but we are not planning to upload trained models. The focus of our paper is not on achieving SOTA quality but on comparing our method, SQ-VAE, with the conventional VQ-VAE. For that comparison, we used a simple architecture rather than a sophisticated one in our experiments. In any case, the fidelity of speech generated by the implemented model is not satisfactory.