Open luobendewugong opened 2 months ago
The error seems to indicate that the tokenizer is missing, so perhaps you missed a file?
Secondly, why are you manually downloading the models? They can auto-download as needed. Maybe you have a good reason; I know downloads from huggingface can be blocked in some areas. Without a good reason, though, you're just making things harder.
For xtts, the folder path should look like this:
openedai-speech/voices/tts$ ls tts_models--multilingual--multi-dataset--xtts/
config.json vocab.json hash.md5 model.pth speakers_xtts.pth
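A quick way to confirm that nothing is missing from that folder is a small check like the one below. It is only a sketch: the file names are copied from the listing above, and the relative path assumes the script is run from the openedai-speech checkout.

```python
from pathlib import Path

# File names copied from the directory listing above.
REQUIRED_XTTS_FILES = (
    "config.json",
    "vocab.json",
    "hash.md5",
    "model.pth",
    "speakers_xtts.pth",
)


def missing_xtts_files(model_dir):
    """Return the required file names that are not present in model_dir."""
    model_dir = Path(model_dir)
    return [name for name in REQUIRED_XTTS_FILES if not (model_dir / name).is_file()]


if __name__ == "__main__":
    # Assumed location, relative to the openedai-speech checkout.
    folder = Path("voices/tts/tts_models--multilingual--multi-dataset--xtts")
    missing = missing_xtts_files(folder)
    print("missing:", ", ".join(missing) if missing else "none")
```

If the list is not empty, the files it names still need to be downloaded into the folder.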
Thank you very much for your reply. I am unable to access https://huggingface.co/, so I tried adding these lines at the beginning of speech.py:
import os
os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"
But it didn't work, even though my other programs work this way. In the end, I had no choice but to download the model manually. I also tried using docker, but still ran into the same problem of downloading the model.
I did indeed not download all the files. I only downloaded config.json and model.pth. Thank you very much for your detailed explanation. I'll try again.
I also have two additional questions:
- No matter how I set it, even if I set the model to xtts_v2.0.2, after one use, when performing TTS, the model reverts back to xtts. Is it a problem with my version setting for xtts? Should it be set to xtts_v2.0.2 or xtts_v2? Where do I need to make these settings?
- What should be the folder path for xtts_v2? Which files should it include?
Thank you very much for your reply!
On https://hf-mirror.com/coqui/XTTS-v1/tree/main, it seems that there are no hash.md5 and speakers_xtts.pth files. These two files should not be necessary, right? When the problem arose, I had downloaded the other three files and placed them in the directory.
You want coqui XTTS-v2, not XTTS-v1.
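For anyone downloading by hand from the mirror, the sketch below builds the per-file URLs for the five files listed earlier in this thread. It rests on two assumptions that are not confirmed here: that the mirror serves files under the usual Hugging Face `resolve/main` path, and that all five files are present in the coqui/XTTS-v2 repository. `mirror_url` and `download_all` are illustrative helper names, not part of openedai-speech.

```python
import urllib.request
from pathlib import Path

MIRROR = "https://hf-mirror.com"  # mirror mentioned in this thread
REPO = "coqui/XTTS-v2"            # repository recommended above
# File names from the folder listing earlier in the thread.
FILES = ("config.json", "vocab.json", "hash.md5", "model.pth", "speakers_xtts.pth")


def mirror_url(filename, repo=REPO, mirror=MIRROR, revision="main"):
    """Build a direct file URL (assumes the Hugging Face 'resolve' layout)."""
    return f"{mirror}/{repo}/resolve/{revision}/{filename}"


def download_all(target_dir):
    """Download every file in FILES into target_dir (needs network access)."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    for name in FILES:
        urllib.request.urlretrieve(mirror_url(name), str(target / name))
```

Calling `download_all("voices/tts/tts_models--multilingual--multi-dataset--xtts")` from the openedai-speech checkout would then fill the folder in one step, provided the assumptions above hold.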
- No matter how I set it, even if I set the model to xtts_v2.0.2, after one use, when performing TTS, the model will revert back to xtts. Is it a problem with my version setting for xtts? Should it be set to xtts_v2.0.2 or xtts_v2? Where do I need to make these settings?
Without setting a version, 'xtts' will use the latest version, which is xtts_v2.0.2.
- What should be the folder path for xtts_v2? Which files should it include?
The folder I mentioned at the beginning. Sorry, I'm on mobile; I can be more detailed if needed.
Your explanation has helped me a lot, thank you very much! After I put all the files into tts_models--multilingual--multi-dataset--xtts, the previous issues were resolved, but the following problems have arisen:
INFO: 127.0.0.1:46278 - "POST /v1/audio/speech HTTP/1.1" 200 OK
2024/09/21 13:12:36.681500 cmd_run.go:1138: WARNING: cannot start document portal: dial unix /run/user/1000/bus: connect: no such file or directory
Additionally, for some reason, "Loading model xtts to cuda" is very slow, taking about 5 minutes.
Thank you very much for your reply!
I reinstalled ffmpeg, and it runs smoothly! Thank you very much! But it seems that it cannot read a mix of Chinese and English.
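Since reinstalling ffmpeg resolved the problem here, a quick check that an ffmpeg binary is actually reachable may save others some time. This is a minimal sketch using only the standard library; it checks whether ffmpeg is on PATH and, if so, reads the first line of its version output. It does not verify that the build has the codecs this project needs.

```python
import shutil
import subprocess


def ffmpeg_available():
    """Return True if an 'ffmpeg' executable can be found on PATH."""
    return shutil.which("ffmpeg") is not None


def ffmpeg_version_line():
    """Return the first line of 'ffmpeg -version', or None if unavailable."""
    path = shutil.which("ffmpeg")
    if path is None:
        return None
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=False
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else None
```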
Try the dev branch, which supports multilingual at the request level. Is it a desirable feature to support multilingual at the sentence level?
Re: the 5-minute wait, that is odd. Which GPU? Models are loaded on demand by default.
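"Loaded on demand" means the loading cost is paid on the first request that needs a model, not at server startup. The sketch below illustrates that general pattern only; it is not the project's actual code, and `_load_model` is a hypothetical stand-in for the real (slow) load.

```python
from functools import lru_cache

# Records each real load so the caching behaviour is visible.
LOAD_LOG = []


def _load_model(name):
    """Hypothetical stand-in for the slow step of reading weights onto the GPU."""
    LOAD_LOG.append(name)
    return {"name": name}


@lru_cache(maxsize=None)
def get_model(name):
    """Load a model the first time it is requested, then reuse the cached object."""
    return _load_model(name)
```

With this pattern, only the first call to `get_model("xtts")` triggers a load; later calls return the cached object, so a slow first request followed by fast ones would be expected behaviour.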
Hello, I really appreciate your work. I have deployed it on Ubuntu and tried to have it read Chinese.
I downloaded zh_CN-huayan-medium.onnx and zh_CN-huayan-x_low.onnx from https://hf-mirror.com/rhasspy/piper-voices/tree/main/zh/zh_CN/huayan/medium and https://hf-mirror.com/rhasspy/piper-voices/tree/main/zh/zh_CN/huayan/x_low, and placed them in the voices folder.
I downloaded config.json and model.pth from https://hf-mirror.com/coqui/XTTS-v2/tree/main and placed them in the .local\share\tts\tts_models--multilingual--multi-dataset--xtts folder.
After running python speech.py, the following error occurred, and I suspect it is because no text to be read was provided. Could you kindly help me? Thank you!
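Hey, fellow countryman, could I trouble you to walk me through exactly how you set this up? I am also using open webui with this project for speech conversion, but I have not been able to get it working by following the official deployment instructions. Could you take a look at my issue? Thanks for your trouble. https://github.com/matatonic/openedai-speech/issues/66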