KoljaB / RealtimeTTS

Converts text to speech in realtime
1.39k stars 119 forks

Changing model. #71

Closed: Yokopops closed this issue 2 months ago

Yokopops commented 2 months ago

I tried changing the model for Coqui in my engine.py, but it errors out and does not seem to download the full model. If I download the model manually in CMD with the command tts --text "this is a test." --model_name tts_models/en/jenny/jenny and move it into the folder, it still does not work. Am I missing something, or just being dumb?

Initializing Coqui Engine...
Downloading config.json to P:\AI-Audio\RealtimeTTS\Models\Jenny\config.json...
100%|██████████| 21.0/21.0 [00:00<00:00, 21.0kiB/s]
Downloading model.pth to P:\AI-Audio\RealtimeTTS\Models\Jenny\model.pth...
100%|██████████| 21.0/21.0 [00:00<00:00, 18.4kiB/s]
Downloading vocab.json to P:\AI-Audio\RealtimeTTS\Models\Jenny\vocab.json...
100%|██████████| 21.0/21.0 [00:00<00:00, 8.47kiB/s]
Downloading speakers_xtts.pth to P:\AI-Audio\RealtimeTTS\Models\Jenny\speakers_xtts.pth...
100%|██████████| 21.0/21.0 [00:00<00:00, 21.1kiB/s]
Error loading model for checkpoint P:\AI-Audio\RealtimeTTS\Models\Jenny: Expecting value: line 1 column 1 (char 0)
CoquiEngine: Error initializing main coqui engine model: Expecting value: line 1 column 1 (char 0)
Traceback (most recent call last):
  File "p:\AI-Audio\RealtimeTTS\RealtimeTTS\engines\coqui_engine.py", line 502, in _synthesize_worker
    tts = load_model(checkpoint, tts)
  File "p:\AI-Audio\RealtimeTTS\RealtimeTTS\engines\coqui_engine.py", line 474, in load_model
    config = load_config((os.path.join(checkpoint, "config.json")))
  File "C:\Users\Yoko\AppData\Local\Programs\Python\Python310\lib\site-packages\TTS\config\__init__.py", line 92, in load_config
    data = read_json_with_comments(config_path)
  File "C:\Users\Yoko\AppData\Local\Programs\Python\Python310\lib\site-packages\TTS\config\__init__.py", line 21, in read_json_with_comments
    return json.loads(input_str)
  File "C:\Users\Yoko\AppData\Local\Programs\Python\Python310\lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "C:\Users\Yoko\AppData\Local\Programs\Python\Python310\lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Users\Yoko\AppData\Local\Programs\Python\Python310\lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Process Process-1:
Traceback (most recent call last):
  File "C:\Users\Yoko\AppData\Local\Programs\Python\Python310\lib\multiprocessing\process.py", line 314, in _bootstrap
    self.run()
  File "C:\Users\Yoko\AppData\Local\Programs\Python\Python310\lib\multiprocessing\process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "p:\AI-Audio\RealtimeTTS\RealtimeTTS\engines\coqui_engine.py", line 502, in _synthesize_worker
    tts = load_model(checkpoint, tts)
  File "p:\AI-Audio\RealtimeTTS\RealtimeTTS\engines\coqui_engine.py", line 474, in load_model
    config = load_config((os.path.join(checkpoint, "config.json")))
  File "C:\Users\Yoko\AppData\Local\Programs\Python\Python310\lib\site-packages\TTS\config\__init__.py", line 92, in load_config
    data = read_json_with_comments(config_path)
  File "C:\Users\Yoko\AppData\Local\Programs\Python\Python310\lib\site-packages\TTS\config\__init__.py", line 21, in read_json_with_comments
    return json.loads(input_str)
  File "C:\Users\Yoko\AppData\Local\Programs\Python\Python310\lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "C:\Users\Yoko\AppData\Local\Programs\Python\Python310\lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Users\Yoko\AppData\Local\Programs\Python\Python310\lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
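
The JSONDecodeError in the traceback means that config.json did not contain parseable JSON. Every download in the log is also reported as 21.0/21.0, which may hint that tiny placeholder or error bodies were saved instead of real model files, though the log alone does not prove that. The following is a minimal sketch using only the Python standard library for checking a config.json before handing it to the engine. The helper name inspect_config is hypothetical and not part of RealtimeTTS or Coqui TTS.

```python
import json
import os
import tempfile


def inspect_config(config_path):
    """Return (size_in_bytes, parsed_config_or_None) for a config.json file."""
    size = os.path.getsize(config_path)
    with open(config_path, "r", encoding="utf-8") as f:
        text = f.read()
    try:
        return size, json.loads(text)
    except json.JSONDecodeError:
        # Same exception class that appears in the traceback above.
        return size, None


# Demonstration with a throwaway file that holds plain text instead of JSON.
with tempfile.TemporaryDirectory() as tmp:
    bad_path = os.path.join(tmp, "config.json")
    with open(bad_path, "w", encoding="utf-8") as f:
        f.write("not a real config file")
    bad_size, bad_config = inspect_config(bad_path)
    print(bad_size, bad_config)
```

A very small size together with a None result suggests the file is not a real model configuration.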

KoljaB commented 2 months ago

OK, I am not sure what your directory structure looks like or what you passed as specific_model and local_models_path, so let me explain my setup.

I have a folder named "D:\Projekte\TestLingu\Linguflex\models\xtts". This is what I pass as local_models_path. Within this folder there are subfolders containing different XTTS models. There is a folder with the XTTS 2.0.2 base model, "D:\Projekte\TestLingu\Linguflex\models\xtts\v2.0.2", but I also have local model folders such as "D:\Projekte\TestLingu\Linguflex\models\xtts\ElonMusk", containing config.json, model.pth, speakers_xtts.pth, vocab.json and also the mel_stats.pth file copied from XTTS. (The mel_stats.pth file is not speaker-specific and can be copied from the XTTS base model; it just has to be present.)

If I want to use base XTTS, I pass "v2.0.2" as the specific_model parameter (or don't pass it at all, since this is the default). If I want to use ElonMusk, I pass "ElonMusk" as the specific_model parameter.

Is anything still unclear?
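
The layout described in this comment can be checked before starting the engine. The sketch below assumes that the five file names listed above are the complete set a local model folder needs, which the thread implies but does not state outright. The helper missing_model_files and the temporary folder are hypothetical and are not part of RealtimeTTS.

```python
import os
import tempfile

# File names taken from the comment above; mel_stats.pth has to be present too.
REQUIRED_FILES = [
    "config.json",
    "model.pth",
    "speakers_xtts.pth",
    "vocab.json",
    "mel_stats.pth",
]


def missing_model_files(local_models_path, specific_model):
    """List required files absent from <local_models_path>/<specific_model>."""
    model_dir = os.path.join(local_models_path, specific_model)
    return [
        name
        for name in REQUIRED_FILES
        if not os.path.isfile(os.path.join(model_dir, name))
    ]


# Demonstration: build a model folder that lacks mel_stats.pth.
with tempfile.TemporaryDirectory() as models_root:
    model_dir = os.path.join(models_root, "ElonMusk")
    os.makedirs(model_dir)
    for name in REQUIRED_FILES[:-1]:
        open(os.path.join(model_dir, name), "w").close()
    missing = missing_model_files(models_root, "ElonMusk")
    print(missing)
```

An empty list means every expected file is in place; it says nothing about whether the files themselves are valid.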

Yokopops commented 2 months ago

My folder structure is set up as you explained. This is what I edited in engine.py:

model_name="tts_models/en/jenny/jenny",
specific_model="Jenny",
local_models_path="P:\AI-Audio\RealtimeTTS\Models",
voices_path="P:\AI-Audio\RealtimeTTS\Voices",
voice: Union[str, List[str]] = "",
language="en",

The model I am attempting to load is one of the default ones returned by the tts --list_models command. I'm not sure whether the problem is that it's not an XTTS model, or something else?

Appreciate the help.

KoljaB commented 2 months ago

Yes, that is the issue: RealtimeTTS can only process XTTS models. Sorry, I didn't catch that earlier.
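
Since only XTTS models are supported, it can help to check which architecture a model folder declares before pointing the engine at it. The sketch below assumes that a Coqui config.json carries a top-level "model" key whose value is "xtts" for XTTS checkpoints. That is an assumption about the config layout, not something confirmed in this thread. The function name looks_like_xtts is hypothetical, and "vits" is used only as a stand-in value for a non-XTTS model.

```python
import json


def looks_like_xtts(config_text):
    """Return True if the config text parses and declares model == "xtts"."""
    try:
        config = json.loads(config_text)
    except json.JSONDecodeError:
        # Unparseable content (as in the traceback above) is not usable.
        return False
    # Assumption: XTTS configs carry a top-level "model": "xtts" entry.
    return isinstance(config, dict) and config.get("model") == "xtts"


xtts_result = looks_like_xtts('{"model": "xtts"}')
other_result = looks_like_xtts('{"model": "vits"}')
broken_result = looks_like_xtts("")
print(xtts_result, other_result, broken_result)
```

In practice the text would come from reading config.json inside the folder named by specific_model.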

Yokopops commented 2 months ago

Thanks, was wondering what was happening.