fakerybakery closed this 20 hours ago
The compared samples are from the original E2 TTS demo page (https://aka.ms/e2tts/). We have reproduced a multilingual E2 TTS on Emilia_ZH_EN (a public in-the-wild dataset), and that is the checkpoint we released.
By setting
ode_method = 'midpoint'
sway_sampling_coef = 0.
you will obtain a vanilla E2 TTS. That said, these methods (sway_sampling for better performance, combined with euler for speed-up) are also beneficial for E2 TTS, or more generally for (all) CFM-based models.
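
For readers unfamiliar with the two settings, here is a rough, self-contained sketch of what they control. It is not the repository's code: `sway_timesteps` and `integrate` are hypothetical names, and the sway formula t = u + s * (cos(pi/2 * u) - 1 + u) is taken from the F5-TTS paper as I understand it. With `sway_sampling_coef = 0.` the schedule reduces to plain uniform steps, which is why that setting gives the vanilla behaviour.

```python
import math

def sway_timesteps(n_steps, sway_sampling_coef=0.0):
    """Hypothetical helper: flow-matching time grid on [0, 1].

    Assumes the sway formula t = u + s * (cos(pi/2 * u) - 1 + u).
    With s = 0 this is just the uniform grid u.
    """
    s = sway_sampling_coef
    u = [i / n_steps for i in range(n_steps + 1)]
    return [x + s * (math.cos(math.pi / 2 * x) - 1 + x) for x in u]

def integrate(velocity, x0, timesteps, ode_method="midpoint"):
    """Hypothetical helper: integrate dx/dt = velocity(t, x) over the grid."""
    x = x0
    for t0, t1 in zip(timesteps[:-1], timesteps[1:]):
        dt = t1 - t0
        if ode_method == "euler":
            # One velocity evaluation per step (cheaper).
            x = x + dt * velocity(t0, x)
        elif ode_method == "midpoint":
            # Two velocity evaluations per step (second-order accurate).
            x_mid = x + 0.5 * dt * velocity(t0, x)
            x = x + dt * velocity(t0 + 0.5 * dt, x_mid)
        else:
            raise ValueError(f"unknown ode_method: {ode_method}")
    return x

# Toy velocity field standing in for the model: dx/dt = 2t, so x(1) = 1.
toy_velocity = lambda t, x: 2.0 * t

vanilla_grid = sway_timesteps(4, sway_sampling_coef=0.0)   # uniform steps
sway_grid = sway_timesteps(4, sway_sampling_coef=-1.0)     # denser near t = 0
x_midpoint = integrate(toy_velocity, 0.0, vanilla_grid, "midpoint")
x_euler = integrate(toy_velocity, 0.0, vanilla_grid, "euler")
```

In this toy setup a negative coefficient pushes more steps toward small t, and euler needs half the function evaluations of midpoint per step; whether that trade-off helps on a real model is the empirical claim made above, not something this sketch demonstrates.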
Thanks for clarifying!
Hi, thanks for releasing F5-TTS! I noticed that the demo page lists some comparisons with E2-TTS, and that your inference code has an option to load an E2-TTS checkpoint. Would you mind sharing which E2-TTS checkpoint you’re using? Thanks!