Open Siraj-HM opened 3 weeks ago
You can record a surprised / sad speech and do them individually. It only needs a short input speech to work. Which TTS does those curly brackets?
https://huggingface.co/spaces/mrfakename/E2-F5-TTS from the HF demo page.
On that page you have to manually upload the audio and click "Insert" with "Surprised", "Sad", etc.
I'm thinking, it could be done here if we record files like voicename.emotion1.mp3, voicename.emotion2.mp3
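A minimal sketch of how that naming idea could be parsed, assuming files are named voicename.emotion.ext. The helper `group_voice_samples` and the fallback label "main" for a file without an emotion part are illustrative assumptions, not part of the node.

```python
from collections import defaultdict
from pathlib import PurePath


def group_voice_samples(filenames):
    """Group sample files by voice name and emotion.

    Assumes the hypothetical naming scheme voicename.emotion.ext; a file
    with no emotion part (voicename.ext) is filed under "main".
    """
    voices = defaultdict(dict)
    for name in filenames:
        parts = PurePath(name).name.split(".")
        if len(parts) == 2:
            voice, emotion = parts[0], "main"
        elif len(parts) == 3:
            voice, emotion = parts[0], parts[1]
        else:
            continue  # skip names that don't fit the scheme
        voices[voice][emotion] = name
    return dict(voices)
```

For example, `group_voice_samples(["anna.sad.mp3", "anna.mp3"])` groups both files under the voice "anna", one per emotion.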
I've made an update with instructions for multiple voices. Run "git pull" in the custom_node folder or reinstall.
Thank you, I will test it out. Did you check on the emotions in the voice?
The emotions don't come automatically, if that's what you're expecting. In the Gradio app on the Hugging Face demo you have to upload a sample voice for each emotion. The multiple voices work like emotions.
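As a rough illustration of how tagged text might map to separate sample voices, here is a sketch that splits a prompt at {tag} markers. The curly-bracket syntax, the helper `split_by_voice_tag`, and the default voice name "main" are assumptions made for this example; the node's actual parsing may differ.

```python
import re

# Assumed tag syntax: a word in curly brackets, e.g. {sad}
TAG_PATTERN = re.compile(r"\{(\w+)\}")


def split_by_voice_tag(text, default_voice="main"):
    """Split text into (voice, chunk) pairs at {tag} markers.

    Text before the first tag is assigned to default_voice.
    """
    segments = []
    voice = default_voice
    pos = 0
    for match in TAG_PATTERN.finditer(text):
        chunk = text[pos:match.start()].strip()
        if chunk:
            segments.append((voice, chunk))
        voice = match.group(1)  # following text belongs to this tag
        pos = match.end()
    tail = text[pos:].strip()
    if tail:
        segments.append((voice, tail))
    return segments
```

Each resulting pair would then be generated with the sample voice recorded for that tag, which is why every emotion needs its own reference recording.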
Thanks, interesting implementation. Is there any way I can download the models and keep them instead of downloading them every time from HF?
What models are you downloading? You can keep the samples as .wav and .txt files in the input folder.
vocab : /workspace/ComfyUI/custom_nodes/ComfyUI-F5-TTS/F5-TTS/data/Emilia_ZH_EN_pinyin/vocab.txt
tokenizer : custom
model : /root/.cache/huggingface/hub/models--SWivid--F5-TTS/snapshots/995ff41929c08ff968786b448a384330438b5cb6/F5TTS_Base/model_1200000.safetensors
text:Amid the gentle hum of the city, a single raindrop fell, marking the start of a quiet storm. Somewhere in the distance, a clock ticked, echoing the passage of time.
No voice tag found, using main.
Voice: main
text:Amid the gentle hum of the city, a single raindrop fell, marking the start of a quiet storm. Somewhere in the distance, a clock ticked, echoing the passage of time.
gen_text 0 Amid the gentle hum of the city, a single raindrop fell, marking the start of a quiet storm. Somewhere in the distance, a clock ticked, echoing the passage of time.
Generating audio in 1 batches...
100%|██████████| 1/1 [00:03<00:00, 3.34s/it]
Prompt executed in 9.02 seconds
The cache which is saved from HF
The model is in .cache? Doesn't look like it's downloading it? Would have been downloaded during the first run or installation.
I am using runpod, so it downloads during the first run, since runpod wipes the system storage every time.
Sorry, I haven't used runpod. Does it keep checkpoints in between every run?
F5-TTS uses cached_path. https://pypi.org/project/cached-path/
You can tell it where to put its files by setting the environment variable before running python main.py in ComfyUI:
export CACHED_PATH_CACHE_ROOT="<folder where you want to put the model>"
Yes. Can I download the model to ComfyUI/models/F5/ and use it like other checkpoints?
Try starting ComfyUI like...
export CACHED_PATH_CACHE_ROOT="ComfyUIPath/models/F5"
python3 main.py
Change "ComfyUIPath" to where you have installed it.
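The same startup can be sketched as a small Python launcher, in case setting the variable by hand on every pod start is inconvenient. The helper names `build_env` and `launch_comfyui` are hypothetical; the only thing taken from the thread is the CACHED_PATH_CACHE_ROOT variable and the python main.py start command. It assumes the chosen folder lives on storage that survives restarts.

```python
import os
import subprocess
import sys
from pathlib import Path


def build_env(cache_dir, base_env=None):
    """Return a copy of the environment with CACHED_PATH_CACHE_ROOT set.

    The path is made absolute so it stays valid after changing directory.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CACHED_PATH_CACHE_ROOT"] = str(Path(cache_dir).resolve())
    return env


def launch_comfyui(comfyui_dir, cache_dir):
    """Create the cache folder, then run main.py with the variable set."""
    Path(cache_dir).mkdir(parents=True, exist_ok=True)
    return subprocess.run(
        [sys.executable, "main.py"],
        cwd=comfyui_dir,
        env=build_env(cache_dir),
        check=False,
    )
```

Calling `launch_comfyui("ComfyUIPath", "ComfyUIPath/models/F5")` would mirror the two shell lines above, with "ComfyUIPath" again replaced by the real install location.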
Hey. Thanks for the node. How do I add emotion in TTS? When I tried {sad} followed by some text, it read the word {sad} aloud too.