NTT123 / light-speed

A modified VITS that utilizes phoneme duration's ground truth for better robustness
MIT License
112 stars 34 forks source link