niknah / ComfyUI-F5-TTS

ComfyUI node for F5-Text To Speech
MIT License

Request: Add option to use the E2 model #9

Open valandyr opened 2 weeks ago

valandyr commented 2 weeks ago

After some testing in Gradio, I found that some voices (anger/yell) work best with the E2 model. I tried changing the repo_name and exp_name variables to E2. It downloads the model, but then execution is borked.

valandyr commented 2 weeks ago

Also, using midpoint instead of euler results in considerably fewer artifacts, although it runs slower. Having those parameters exposed would help in some cases.
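
For context on the speed/quality trade-off, here is a minimal sketch of the two ODE step rules on a toy problem. It is not F5-TTS code: the function names, the test equation (dy/dt = -y), and the step count are all assumptions chosen for illustration. Midpoint evaluates the derivative twice per step, which is roughly why it is slower, and it typically lands closer to the true solution for the same number of steps.

```python
import math

def euler_step(f, t, y, h):
    # One derivative evaluation per step (first-order accurate).
    return y + h * f(t, y)

def midpoint_step(f, t, y, h):
    # Two derivative evaluations per step (second-order accurate).
    k1 = f(t, y)
    return y + h * f(t + h / 2, y + (h / 2) * k1)

def integrate(step, f, y0, t0, t1, n_steps):
    # Fixed-step integration from t0 to t1 with the given step rule.
    h = (t1 - t0) / n_steps
    t, y = t0, y0
    for _ in range(n_steps):
        y = step(f, t, y, h)
        t += h
    return y

def decay(t, y):
    # Toy ODE: dy/dt = -y, exact solution y(t) = exp(-t) for y(0) = 1.
    return -y

exact = math.exp(-1.0)
euler_err = abs(integrate(euler_step, decay, 1.0, 0.0, 1.0, 32) - exact)
midpoint_err = abs(integrate(midpoint_step, decay, 1.0, 0.0, 1.0, 32) - exact)
print(f"euler error:    {euler_err:.2e}")
print(f"midpoint error: {midpoint_err:.2e}")
```

On this toy problem the midpoint error comes out much smaller than the euler error at the same step count, at the cost of twice as many derivative evaluations.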

niknah commented 2 weeks ago

If you want to try other options (ODE method, relative_tolerance, absolute_tolerance), use this branch: https://github.com/niknah/ComfyUI-F5-TTS#2024_11_test_params

But I did not hear much difference with those settings.

I found out that F5-TTS is not using a seed, so you get a new random speech every time you run it. Changing the seed made a lot of difference. I have added a seed value there. Attached is an example workflow: workflow_f5tts_seed_test.json
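
To illustrate why a seed matters, here is a small sketch using only the Python standard library. It assumes generation starts from randomly sampled noise; `sample_initial_noise` is a hypothetical stand-in, not a function from F5-TTS or this node. The same seed reproduces the same starting noise, and a different seed gives a different one.

```python
import random

def sample_initial_noise(seed, n=4):
    # Hypothetical stand-in for the random noise a generator starts from.
    # A dedicated Random instance keeps the result independent of global state.
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

same_a = sample_initial_noise(42)
same_b = sample_initial_noise(42)
other = sample_initial_noise(43)

print(same_a == same_b)  # same seed, same noise
print(same_a == other)   # different seed, different noise
```

Without a fixed seed, each run draws fresh noise, so comparing other settings between runs is hard because the output changes anyway.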