Closed EvarDion closed 9 months ago
I managed to get the config to load properly by adding the following code at line 105 of TTS/server/server.py
# Check the model path for a config file if none is supplied.
if config_path is None:
    print("looking for config in: ", model_path)
    model_config_path = os.path.join(model_path, "config.json")
    print("model_config_path:", model_config_path)
    if os.path.exists(model_config_path):
        config_path = model_config_path
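The same fallback can be written as a small standalone function, which makes it easier to try outside server.py. This is only a sketch: `resolve_config_path` is a hypothetical name, not part of TTS, and it assumes `model_path` is a directory that may contain a `config.json`.

```python
import os
from typing import Optional


def resolve_config_path(model_path: Optional[str], config_path: Optional[str] = None) -> Optional[str]:
    """Return config_path, falling back to <model_path>/config.json if that file exists."""
    # Nothing to do if a config was supplied, or if there is no model path to search.
    if config_path is not None or model_path is None:
        return config_path
    candidate = os.path.join(model_path, "config.json")
    return candidate if os.path.exists(candidate) else None
```

If the function returns None, the caller still has no config and would hit the same error, so it is worth checking the result before constructing the synthesizer.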
UPDATE: The Web UI does not load the speaker IDs for xtts_v2, bark and tortoise-v2 so I guess this is a feature that is still a work in progress.
TEMPORARY FIX: (How to call xtts_v2 with an HTTP GET request)
Example GET request:
http://[::1]:5002/api/tts?text=Hello%20how%20are%20you%20today.%20I%20am%20a%20robot.%20How%20may%20I%20help%20you%3F&speaker_id=Daisy%20Studious&style_wav=&language_id=en
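Rather than encoding the URL by hand, it can be built with the standard library. A minimal sketch, assuming the server listens on the host and port from the example and accepts the same four query parameters; `build_tts_url` is a hypothetical helper name.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://[::1]:5002"  # host and port taken from the example request


def build_tts_url(text: str, speaker_id: str, language_id: str = "en", style_wav: str = "") -> str:
    """Build an /api/tts GET URL with the same parameters as the example request."""
    params = {
        "text": text,
        "speaker_id": speaker_id,
        "style_wav": style_wav,
        "language_id": language_id,
    }
    # quote_via=quote encodes spaces as %20 instead of "+", matching the example.
    return f"{BASE_URL}/api/tts?{urlencode(params, quote_via=quote)}"


url = build_tts_url("Hello how are you today. I am a robot. How may I help you?", "Daisy Studious")
print(url)
```

The resulting URL can then be opened in a browser or fetched with `urllib.request.urlopen(url)`; I am assuming here that the response body is the generated audio.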
Command for listing speaker IDs:
tts --list_speaker_idxs --model_name tts_models/multilingual/multi-dataset/xtts_v2
Same problem.
python3 TTS/server/server.py --model_name tts_models/multilingual/multi-dataset/xtts_v2
 > tts_models/multilingual/multi-dataset/xtts_v2 is already downloaded.
Traceback (most recent call last):
  File "/root/TTS/server/server.py", line 104, in <module>
    synthesizer = Synthesizer(
  File "/root/TTS/utils/synthesizer.py", line 93, in __init__
    self._load_tts(tts_checkpoint, tts_config_path, use_cuda)
  File "/root/TTS/utils/synthesizer.py", line 183, in _load_tts
    self.tts_config = load_config(tts_config_path)
  File "/root/TTS/config/__init__.py", line 82, in load_config
    ext = os.path.splitext(config_path)[1]
  File "/usr/local/lib/python3.10/posixpath.py", line 118, in splitext
    p = os.fspath(p)
TypeError: expected str, bytes or os.PathLike object, not NoneType
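The traceback ends in `os.path.splitext`, which receives `None` because no config path was resolved. The failure can be reproduced without TTS installed. The snippet below only mirrors the expression shown in the traceback and is not the library's code; `config_extension` is a hypothetical name.

```python
import os


def config_extension(config_path):
    # Same expression as the load_config line in the traceback.
    return os.path.splitext(config_path)[1]


print(config_extension("xtts_v2/config.json"))  # .json

try:
    config_extension(None)
except TypeError as err:
    # Raised by os.fspath(None) inside splitext, as in the traceback.
    print("TypeError:", err)
```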
Same problem.
If you want to run the server, edit server.py and manually apply the code fix from my comment above. The UI does not list the speaker_ids, though, so you will need to construct an HTTP GET request yourself if you want to hear all the voices (see the examples above).
Temporary fix works for me.
Any ideas if voice cloning (--speaker_wav /path/to/sample.wav) also works via tts-server? If yes and we can successfully use this parameter in the GET-request, where should the sample.wav be stored?
Sorry have not tried it yet.
Same issue on MacBook M1 Pro:
python3 TTS/server/server.py --model_name tts_models/multilingual/multi-dataset/xtts_v2
 > tts_models/multilingual/multi-dataset/xtts_v2 is already downloaded.
Traceback (most recent call last):
  File "/root/TTS/server/server.py", line 104, in <module>
    synthesizer = Synthesizer(
  File "/root/TTS/utils/synthesizer.py", line 93, in __init__
    self._load_tts(tts_checkpoint, tts_config_path, use_cuda)
  File "/root/TTS/utils/synthesizer.py", line 183, in _load_tts
    self.tts_config = load_config(tts_config_path)
  File "/root/TTS/config/__init__.py", line 82, in load_config
    ext = os.path.splitext(config_path)[1]
  File "/usr/local/lib/python3.10/posixpath.py", line 118, in splitext
    p = os.fspath(p)
TypeError: expected str, bytes or os.PathLike object, not NoneType
Same issue
hey,
After downloading the model with the first command below, rerun the server with --model_path and --config_path, as in the second command.
python3 TTS/server/server.py --model_name tts_models/multilingual/multi-dataset/xtts_v2
python3 TTS/server/server.py \
    --model_path ~/.local/share/tts/tts_models/multilingual/multi-dataset/xtts_v2 \
    --config_path ~/.local/share/tts/tts_models/multilingual/multi-dataset/xtts_v2/config.json
In the latest Docker image (ghcr.io/coqui-ai/tts-cpu from 2024-09-01) the paths changed, so now run:
tts-server \
    --model_path ~/.local/share/tts/tts_models--multilingual--multi-dataset--xtts_v2 \
    --config_path ~/.local/share/tts/tts_models--multilingual--multi-dataset--xtts_v2/config.json
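The two commands differ only in how the model name maps to a directory: the older layout nests the name under the data directory, while the newer one replaces each `/` with `--`. A small sketch that tries both and picks whichever contains a config.json. `candidate_model_dirs` and `find_model_dir` are hypothetical helpers, and the base directory is taken from the commands in this thread.

```python
from pathlib import Path
from typing import List, Optional

MODEL_NAME = "tts_models/multilingual/multi-dataset/xtts_v2"


def candidate_model_dirs(model_name: str, base: Path) -> List[Path]:
    """Return the two directory layouts seen in this thread, newest first."""
    return [
        base / model_name.replace("/", "--"),  # newer layout: tts_models--...--xtts_v2
        base / model_name,                     # older layout: tts_models/.../xtts_v2
    ]


def find_model_dir(model_name: str, base: Path) -> Optional[Path]:
    """Return the first candidate directory that contains config.json, if any."""
    for candidate in candidate_model_dirs(model_name, base):
        if (candidate / "config.json").exists():
            return candidate
    return None


base = Path.home() / ".local" / "share" / "tts"
model_dir = find_model_dir(MODEL_NAME, base)
if model_dir is None:
    print("model not found under", base)
else:
    print(f"tts-server --model_path {model_dir} --config_path {model_dir / 'config.json'}")
```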
Yeah, it now needs the speakers file as well (--speakers_file_path):
# docker
docker run --name coqui --rm -it -p 5002:5002 --gpus all -v ./tts:/root/.local/share --entrypoint /bin/bash ghcr.io/coqui-ai/tts
tts-server \
    --model_path ~/.local/share/tts/tts_models--multilingual--multi-dataset--xtts_v2 \
    --config_path ~/.local/share/tts/tts_models--multilingual--multi-dataset--xtts_v2/config.json \
    --speakers_file_path ~/.local/share/tts/tts_models--multilingual--multi-dataset--xtts_v2/speakers_xtts.pth \
    --use_cuda true
Or with docker compose:
services:
  coqui:
    container_name: coqui
    image: ghcr.io/coqui-ai/tts
    build:
      context: ./TTS
    ports:
      - 5002:5002
    environment:
      - COQUI_TOS_AGREED=1
    #entrypoint: ["python3", "TTS/server/server.py", "--model_name", "tts_models/multilingual/multi-dataset/xtts_v2", "--use_cuda", "true"]
    entrypoint:
      - "/bin/bash"
      - "-c"
      - "tts-server --model_path ~/.local/share/tts/tts_models--multilingual--multi-dataset--xtts_v2 --config_path ~/.local/share/tts/tts_models--multilingual--multi-dataset--xtts_v2/config.json --speakers_file_path ~/.local/share/tts/tts_models--multilingual--multi-dataset--xtts_v2/speakers_xtts.pth --use_cuda true"
    volumes:
      - ./tts:/root/.local/share
    deploy:
      resources:
        reservations:
          devices:
            - count: all # reserves all GPUs; use a number such as 1 to limit
              capabilities: [gpu]
Then you can request speech with a GET request such as:
http://[::1]:5002/api/tts?text=Hello%20how%20are%20you%20today.%20I%20am%20a%20robot.%20How%20may%20I%20help%20you%3F&speaker_id=Daisy%20Studious&style_wav=&language_id=en
This URL uses "Daisy Studious", but the full list of speakers is:
[
'Claribel Dervla', 'Daisy Studious', 'Gracie Wise',
'Tammie Ema', 'Alison Dietlinde', 'Ana Florence',
'Annmarie Nele', 'Asya Anara', 'Brenda Stern',
'Gitta Nikolina', 'Henriette Usha', 'Sofia Hellen',
'Tammy Grit', 'Tanja Adelina', 'Vjollca Johnnie',
'Andrew Chipper', 'Badr Odhiambo', 'Dionisio Schuyler',
'Royston Min', 'Viktor Eka', 'Abrahan Mack',
'Adde Michal', 'Baldur Sanjin', 'Craig Gutsy',
'Damien Black', 'Gilberto Mathias', 'Ilkin Urbano',
'Kazuhiko Atallah', 'Ludvig Milivoj', 'Suad Qasim',
'Torcull Diarmuid', 'Viktor Menelaos', 'Zacharie Aimilios',
'Nova Hogarth', 'Maja Ruoho', 'Uta Obando',
'Lidiya Szekeres', 'Chandra MacFarland', 'Szofi Granger',
'Camilla Holmström', 'Lilya Stainthorpe', 'Zofija Kendrick',
'Narelle Moon', 'Barbora MacLean', 'Alexandra Hisakawa',
'Alma María', 'Rosemary Okafor', 'Ige Behringer',
'Filip Traverse', 'Damjan Chapman', 'Wulf Carlevaro',
'Aaron Dreschner', 'Kumar Dahl', 'Eugenio Mataracı',
'Ferran Simen', 'Xavier Hayasaka', 'Luis Moray',
'Marcos Rudaski'
]
Thanks @EvarDion @kopp for parts of this
Note: if you get AttributeError: 'NoneType' object has no attribute 'name_to_id', it's because the server doesn't like quotes around the speaker_id. Pass the name unquoted and URL-encoded, as in the URL shown above.
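If quoting is indeed the cause, as the note suggests, one way to guard against it on the client side is to strip surrounding quotes from the speaker name before URL-encoding it. A minimal sketch; `clean_speaker_id` is a hypothetical helper, not part of TTS.

```python
from urllib.parse import quote


def clean_speaker_id(raw: str) -> str:
    """Strip whitespace and any surrounding single or double quotes."""
    return raw.strip().strip("'\"").strip()


for raw in ['"Daisy Studious"', "'Daisy Studious'", "Daisy Studious"]:
    # Each variant ends up as the same URL-encoded value.
    print(quote(clean_speaker_id(raw)))  # Daisy%20Studious
```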
Describe the bug
VITS is working fine but a number of other multilingual models are failing to run because of a configuration issue.
A partial list of the models that don't work:
tts_models/multilingual/multi-dataset/xtts_v2
tts_models/multilingual/multi-dataset/bark
tts_models/en/multi-dataset/tortoise-v2
To Reproduce
Download and run the Docker image on Windows 10 following the tutorial instructions here:
The setting I used was GPU = true.
Expected behavior
Models should run.
Logs
Environment
Additional context
I did a git clone of the latest repo into the Docker container and reinstalled all of the dependencies, and the error still occurs, so I guess it's still an unresolved issue.
No response