Hi! Thank you for making this amazing project public!
I would like some guidance on a problem I ran into while tuning this code. Why can't the VAE decoder recover the original speech waveform when I directly feed it the latent code extracted by the provided encoder?