p0p4k / pflowtts_pytorch

Unofficial implementation of NVIDIA P-Flow TTS paper
https://neurips.cc/virtual/2023/poster/69899
MIT License

Noise in e2e branch #19

Closed: Tera2Space closed this issue 6 months ago

Tera2Space commented 6 months ago

I'm trying to train the e2e branch, but the resulting audio is only noise. Am I doing something wrong, or is this version not ready yet? The HiFi-GAN output is just a tensor of -1 values, so maybe I made a mistake somewhere.
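A quick way to confirm this symptom is to measure how much of the generated waveform sits at the clipping limits. The sketch below is a minimal example, assuming the vocoder output has already been converted to a flat sequence of floats; `saturation_report` is a hypothetical helper and not part of the repository.

```python
def saturation_report(samples, limit=1.0, tol=1e-4):
    """Summarize how much of a waveform sits at the clipping limits.

    Hypothetical helper for illustration; `samples` is any iterable of floats.
    """
    samples = list(samples)
    if not samples:
        raise ValueError("empty waveform")
    n = len(samples)
    # Count samples that are numerically equal to -limit or +limit.
    at_floor = sum(1 for s in samples if abs(s + limit) <= tol)
    at_ceil = sum(1 for s in samples if abs(s - limit) <= tol)
    return {
        "min": min(samples),
        "max": max(samples),
        "frac_at_floor": at_floor / n,
        "frac_at_ceil": at_ceil / n,
        # True when every sample has (almost) the same value.
        "is_constant": max(samples) - min(samples) <= tol,
    }
```

For a PyTorch tensor `wav`, the input could be obtained with `wav.flatten().tolist()`. The reference HiFi-GAN generator ends in a tanh, so a report with `is_constant` true and `frac_at_floor` equal to 1.0 would be consistent with a saturated output layer rather than a playback problem.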

p0p4k commented 6 months ago

It's not completely ready yet. I suspect training will not work in a single stage; it has to be done in two stages, like NaturalSpeech 2: train a model to get latents, then use those latents as targets for P-Flow. That's why I made an EnCodec-based branch, which uses pretrained latents.
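The two-stage split described above can be sketched with toy stand-ins. This is only an illustration under loud assumptions: `frozen_encoder` replaces a pretrained latent model such as EnCodec with simple frame averaging, a per-token lookup table replaces the P-Flow model, and plain MSE regression replaces the flow-matching objective. All function names are hypothetical and none of this is the repository's code.

```python
def frozen_encoder(waveform, frame=4):
    """Stand-in for a pretrained stage-1 encoder: the mean of each frame."""
    return [sum(waveform[i:i + frame]) / frame
            for i in range(0, len(waveform) - frame + 1, frame)]


def build_targets(dataset):
    """Stage 1 result: run the frozen encoder once and cache the latents."""
    return [(text, frozen_encoder(wav)) for text, wav in dataset]


def train_stage2(pairs, steps=200, lr=0.1):
    """Stage 2: fit a toy per-token table to the cached latents with MSE.

    Assumes one token per latent frame, which real TTS data does not satisfy.
    """
    table = {}
    for _ in range(steps):
        for text, latents in pairs:
            for tok, target in zip(text, latents):
                pred = table.get(tok, 0.0)
                # Gradient of (pred - target)^2 w.r.t. pred is 2 * (pred - target).
                table[tok] = pred - lr * 2.0 * (pred - target)
    return table


# Toy data: two tokens, each aligned with one frame of four samples.
dataset = [("ab", [1.0] * 4 + [3.0] * 4)]
pairs = build_targets(dataset)   # latents are fixed from here on
model = train_stage2(pairs)      # only the stage-2 model is updated
```

The point of the split is that the stage-2 targets stay fixed during training, instead of moving together with a vocoder that is being trained at the same time.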

Tera2Space commented 6 months ago

Thanks for the response, got it :)