declare-lab / tango

A family of diffusion models for text-to-audio generation.
https://tango2-web.github.io/

The vae decoder cannot recover original audio with the extracted latent code #30

Open ikm565 opened 1 year ago

ikm565 commented 1 year ago

Hi! Thank you for making this amazing project public! I would like some guidance on a problem I ran into while tuning this code. Why can't the VAE decoder recover the original speech waveform when I feed it the latent code extracted by the provided encoder directly?

0417keito commented 1 year ago

The VAE decoder only reconstructs the mel spectrogram; it is the vocoder that converts that mel spectrogram back into a waveform.
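
To make the two stages concrete, here is a rough shape-level sketch of the round trip. It is not the Tango API: the function names (`vae_encode`, `vae_decode`, `vocoder`) and every size constant are hypothetical stand-ins chosen for illustration. The functions only track array shapes, so the point is just that the decoder output is still a 2-D mel spectrogram and a separate vocoder step is needed to get 1-D audio samples.

```python
# Toy, shape-only sketch of a latent -> mel -> waveform pipeline.
# All names and sizes are illustrative assumptions, not Tango's real configuration.

N_MELS = 64           # assumed number of mel bins per frame
HOP_LENGTH = 160      # assumed waveform samples per mel frame
DOWNSAMPLE = 4        # assumed VAE compression factor along time and frequency
LATENT_CHANNELS = 8   # assumed number of latent channels


def vae_encode(mel_shape):
    """Hypothetical encoder: (frames, n_mels) -> (channels, frames/4, n_mels/4)."""
    frames, n_mels = mel_shape
    return (LATENT_CHANNELS, frames // DOWNSAMPLE, n_mels // DOWNSAMPLE)


def vae_decode(latent_shape):
    """Hypothetical decoder: inverts vae_encode, so the result is a mel spectrogram."""
    _, t, f = latent_shape
    return (t * DOWNSAMPLE, f * DOWNSAMPLE)


def vocoder(mel_shape):
    """Hypothetical vocoder: (frames, n_mels) -> (samples,)."""
    frames, _ = mel_shape
    return (frames * HOP_LENGTH,)


mel = (1024, N_MELS)          # mel spectrogram computed from the input audio
latent = vae_encode(mel)      # what the encoder extracts
decoded = vae_decode(latent)  # decoder output: still a mel spectrogram
wav = vocoder(decoded)        # only this step yields audio samples

print("latent :", latent)
print("decoded:", decoded)
print("wav    :", wav)
```

So if the decoder output is saved or played as if it were a waveform, it will not sound like the original audio; it has to go through the vocoder first.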