p0p4k / pflowtts_pytorch

Unofficial implementation of NVIDIA P-Flow TTS paper
https://neurips.cc/virtual/2023/poster/69899
MIT License
198 stars 28 forks

Compare to Vits2 #43

Open HuuHuy227 opened 2 months ago

HuuHuy227 commented 2 months ago

Are there any experiments comparing the results of P-Flow with VITS2? Which is better?

JohnHerry commented 1 week ago

Not sure. I tried both on a Mandarin dataset. VITS2 is not good enough at speech pauses and prosody, but the P-Flow trained result is even worse. Is that caused by the "duration predictor and expansion" step?
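
For readers unfamiliar with the term, here is a minimal sketch of what "duration predictor and expansion" usually refers to in TTS models: each text-token encoding is repeated for as many frames as its predicted duration. This is not the code from this repo; `expand_by_duration` and the toy values are hypothetical, and the durations are assumed to be already-rounded integers.

```python
from typing import List, Sequence


def expand_by_duration(
    encodings: Sequence[Sequence[float]],
    durations: Sequence[int],
) -> List[List[float]]:
    """Repeat each token encoding `durations[i]` times (hypothetical helper)."""
    if len(encodings) != len(durations):
        raise ValueError("need exactly one duration per token encoding")
    frames: List[List[float]] = []
    for vec, dur in zip(encodings, durations):
        if dur < 0:
            raise ValueError("durations must be non-negative")
        # Hard repetition: every frame of a token gets the same vector.
        frames.extend(list(vec) for _ in range(dur))
    return frames


# Toy input: three tokens, the middle one standing in for a pause token.
token_encodings = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
predicted_durations = [2, 0, 3]  # assumed predictor output after rounding

frames = expand_by_duration(token_encodings, predicted_durations)
print(len(frames))
```

In this toy case the middle token gets a duration of 0, so it contributes no frames at all. If a pause token is under-predicted in this way, the pause simply disappears from the expanded sequence, which is one way this step could affect pauses and prosody. Whether that is what happens here would need to be checked against the actual predicted durations.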