aedocw / epub2tts

Turn an epub or text file into an audiobook
Apache License 2.0

loading finetuned xtts v2 model #89

Closed danielw97 closed 7 months ago

danielw97 commented 7 months ago

Hi again. This may very well be something I'm doing wrong, but I'm trying to load a finetuned xtts v2 model. I've placed the config, model and vocab files in a directory, which for this example we'll refer to as modelname, in the root where all of the other models are stored. However, when I call epub2tts --xtts --model=modelname sample.txt I receive the following error:
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "C:\Users\daniel\Documents\epub2tts.venvgpu\Scripts\epub2tts.exe\__main__.py", line 7, in <module>
File "C:\Users\daniel\Documents\epub2tts.venvgpu\Lib\site-packages\epub2tts.py", line 385, in main
mybook.read_book(voice_samples=args.xtts, engine=args.engine, openai=args.openai, model_name=args.model, speaker=args.speaker, bitrate=args.bitrate)
File "C:\Users\daniel\Documents\epub2tts.venvgpu\Lib\site-packages\epub2tts.py", line 259, in read_book
self.tts = TTS(model_name).to(self.device)
^^^^^^^^^^^^^^^
File "C:\Users\daniel\Documents\epub2tts.venvgpu\Lib\site-packages\TTS\api.py", line 85, in __init__
self.load_model_by_name(model_name, gpu)
File "C:\Users\daniel\Documents\epub2tts.venvgpu\Lib\site-packages\TTS\api.py", line 166, in load_model_by_name
self.load_tts_model_by_name(model_name, gpu)
File "C:\Users\daniel\Documents\epub2tts.venvgpu\Lib\site-packages\TTS\api.py", line 195, in load_tts_model_by_name
model_path, config_path, vocoder_path, vocoder_config_path, model_dir = self.download_model_by_name(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\daniel\Documents\epub2tts.venvgpu\Lib\site-packages\TTS\api.py", line 149, in download_model_by_name
model_path, config_path, model_item = self.manager.download_model(model_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\daniel\Documents\epub2tts.venvgpu\Lib\site-packages\TTS\utils\manage.py", line 407, in download_model
model_item, model_full_name, model, md5sum = self._set_model_item(model_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\daniel\Documents\epub2tts.venvgpu\Lib\site-packages\TTS\utils\manage.py", line 322, in _set_model_item
model_type, lang, dataset, model = model_name.split("/")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: not enough values to unpack (expected 4, got 1)
Is there something I'm doing wrong here? Thanks in advance.
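
For readers following the traceback, here is a minimal sketch of why the last frame fails. It only mirrors the single line shown in manage.py (model_name.split("/") unpacked into four names); the helper name split_model_name is hypothetical and is not part of TTS. The four-part name used below is the stock XTTS v2 identifier, shown for contrast with a bare directory name.

```python
def split_model_name(model_name: str) -> tuple[str, str, str, str]:
    """Hypothetical helper mirroring the unpacking seen in the traceback."""
    # Same statement as the failing line: exactly four "/"-separated parts are required.
    model_type, lang, dataset, model = model_name.split("/")
    return model_type, lang, dataset, model


if __name__ == "__main__":
    # A registry-style name has four parts, so unpacking succeeds.
    print(split_model_name("tts_models/multilingual/multi-dataset/xtts_v2"))

    # A bare directory name has one part, so unpacking raises ValueError.
    try:
        split_model_name("modelname")
    except ValueError as err:
        print(f"ValueError: {err}")
```

So the error suggests the bare name was handed to the lookup for published models rather than being treated as a local fine-tuned model directory.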

aedocw commented 7 months ago

I don't think you're doing anything wrong; what you describe is exactly how I am using fine-tuned models. I suspect this is an issue with Windows path names. The error is from TTS, but it is possibly a result of how I'm specifying the path to the model.

I have never tried running this in Windows directly (I run it in WSL). I'll try to install and run it directly in Windows, see if I can recreate the issue, and go from there.

danielw97 commented 7 months ago

Okay, thanks. I can also run it in WSL if that's easier for the moment.

danielw97 commented 7 months ago

Edit: I may have figured this out. I wasn't passing a wav file, as I had (perhaps falsely) assumed it would use the data it was trained on. I've passed a wav file now, and I assume the finetuned model is being used, as there is a significant improvement.

aedocw commented 7 months ago

OH! I had NOT looked closely at your example of how you are calling epub2tts and I noticed a few things that could be impacting it. Calling it the same way you are, I am able to recreate this and I think it's because of argument parsing. I need to improve the error handling so it gives better feedback.

First, when calling "--xtts", the code expects at least one wav/mp3 file as a voice sample. Second, the epub or txt file should be the first argument supplied. Could you try "epub2tts sample.txt --xtts sample.wav --model modelname" and let me know if that works?

danielw97 commented 7 months ago

That's great, we're responding at nearly the same time, although I've just edited my previous comment. I've figured this out now as it's expecting a wav file which I'm now supplying.

aedocw commented 7 months ago

OK got this sorted out. Your initial call would now exit with this error: "EpubToAudiobook: error: argument --xtts: expected one argument"

It is OK to specify the source epub or text file at the end of the line as you initially tried. I'm merging the minor fixes now. Thank you again and as always for reporting the problems you find, glad you are using it!
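
For anyone landing here later, the behaviour described in this thread can be reproduced with a small argparse sketch. This is not the actual epub2tts parser: the argument definitions below are assumptions for illustration (only the prog name EpubToAudiobook and the flag names come from the thread), and --xtts is simplified to take a single value.

```python
import argparse
import contextlib
import io


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the epub2tts parser; definitions are assumed.
    parser = argparse.ArgumentParser(prog="EpubToAudiobook")
    parser.add_argument("sourcefile", help="epub or txt file to read")
    parser.add_argument("--xtts", help="voice sample (wav/mp3) for XTTS")
    parser.add_argument("--model", help="name of the fine-tuned model directory")
    return parser


def try_parse(argv: list[str]):
    """Return (namespace, None) on success or (None, stderr_text) on failure."""
    stderr = io.StringIO()
    try:
        # argparse prints usage and the error to stderr, then raises SystemExit.
        with contextlib.redirect_stderr(stderr):
            return build_parser().parse_args(argv), None
    except SystemExit:
        return None, stderr.getvalue()


if __name__ == "__main__":
    # Original call: --xtts is followed by another option, so it has no value.
    _, error = try_parse(["--xtts", "--model=modelname", "sample.txt"])
    print(error)

    # Source file first or last both parse once --xtts has its sample.
    print(try_parse(["sample.txt", "--xtts", "sample.wav", "--model", "modelname"])[0])
    print(try_parse(["--xtts", "sample.wav", "--model=modelname", "sample.txt"])[0])
```

In this sketch the first call fails with "argument --xtts: expected one argument" because argparse treats --model=modelname as the next option rather than as a value for --xtts, while the position of the source file does not matter.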