using newly merged xtts v2 speakers

danielw97 commented 9 months ago

Opening a new issue to keep things tidy. Is there currently a way to use the new speakers that were recently released in Coqui TTS v0.22.0? I've tried passing the --speaker flag, but it doesn't seem to accept names containing spaces, even when quoted properly; it gives me a syntax error. Completely fine if not, though I assume the initial framework is already in place for this. Thanks in advance.
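For reference, a minimal sketch of how a quoted speaker name with a space is normally expected to reach a Python command-line tool as a single argument. The parser here is a hypothetical stand-in, not the real epub2tts argument parser, and the speaker name is only an illustrative value. `shlex.split` follows POSIX shell quoting rules, so behaviour in a Windows shell may differ.

```python
import argparse
import shlex

# Hypothetical parser: the --speaker flag name mirrors the one mentioned above,
# but this is not the actual epub2tts argument parser.
parser = argparse.ArgumentParser()
parser.add_argument("--speaker", default=None)

# shlex.split applies POSIX shell quoting rules, so the quoted name
# stays together as one token instead of being split at the space.
argv = shlex.split('--speaker "Claribel Dervla"')
args = parser.parse_args(argv)

print(argv)
print(repr(args.speaker))
```

If the name arrives split into two tokens instead, the quoting is being lost before the program sees it, which points at the shell rather than at the parser.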

aedocw commented 9 months ago

I haven't been able to try the new speakers with the latest TTS release. I get the following just trying to get their list of studio speakers.

ᐅ tts --model_name tts_models/multilingual/multi-dataset/xtts_v2 --list_speaker_idx
 > tts_models/multilingual/multi-dataset/xtts_v2 is already downloaded.
 > Using model: xtts
 > Available speaker ids: (Set --speaker_idx flag to one of these values to use the multi-speaker model.
Traceback (most recent call last):
  File "/home/doc/repos/TTS/.venv/bin/tts", line 8, in <module>
    sys.exit(main())
  File "/home/doc/repos/TTS/.venv/lib/python3.10/site-packages/TTS/bin/synthesize.py", line 443, in main
    print(synthesizer.tts_model.speaker_manager.name_to_id)
AttributeError: 'NoneType' object has no attribute 'name_to_id'

As soon as I'm able to resolve this and test using these speakers, I'll add it to epub2tts.
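The traceback shows `speaker_manager` being `None` at the point where `name_to_id` is read. A minimal sketch of a defensive check for that situation follows; the `SpeakerManager` class and `list_speaker_names` helper are hypothetical stand-ins written for illustration, not the TTS library's own code.

```python
from typing import Optional


class SpeakerManager:
    """Stand-in for an object exposing a name_to_id mapping (hypothetical)."""

    def __init__(self, name_to_id: dict):
        self.name_to_id = name_to_id


def list_speaker_names(speaker_manager: Optional[SpeakerManager]) -> list:
    """Return the sorted speaker names, or an empty list if none are loaded."""
    # When no speaker data was loaded, the manager is None and there is
    # nothing to list; returning [] avoids the AttributeError seen above.
    if speaker_manager is None:
        return []
    return sorted(speaker_manager.name_to_id)
```

A caller can then report "no speakers available" when the list is empty rather than crashing.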

danielw97 commented 9 months ago

I got the same error, and found that you have to be using the latest version of the model as opposed to the 2.0.2 version. It might just be my testing, but the model seems to perform better, and the commits on Hugging Face suggest they may have retrained it since the initial release. Edit: there's also a new speakers.pth present as of several days ago.

aedocw commented 9 months ago

Ah thanks, that's what I thought, but ... I had it download the latest model (I thought?). I've never downloaded the model on my own, I've always let TTS fetch it for me.

danielw97 commented 9 months ago

Interesting. You've probably done this already, but does it fix things if you make sure you're running the latest TTS version with pip install --upgrade tts, rename the original model directory to something else, and then let it download the model again? At least for me, I had to get rid of my original model directory in ~/.local/share/tts before it would register that it had to redownload.
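The rename step can be sketched as a small helper that moves a cached model directory aside so the next run has to download it again. This is only a sketch: the function name is hypothetical, and the exact directory name that TTS uses for the xtts_v2 model inside ~/.local/share/tts is an assumption you should check on your own machine.

```python
from pathlib import Path
from typing import Optional


def set_aside_cached_model(cache_root: Path, model_dir_name: str) -> Optional[Path]:
    """Rename cache_root/model_dir_name to '<name>.bak' and return the new path.

    Returns None if the model directory does not exist.
    """
    model_dir = cache_root / model_dir_name
    if not model_dir.is_dir():
        return None
    backup = model_dir.with_name(model_dir.name + ".bak")
    if backup.exists():
        # Refuse to overwrite an earlier backup.
        raise FileExistsError(f"backup already exists: {backup}")
    model_dir.rename(backup)
    return backup
```

On Linux it could be called as `set_aside_cached_model(Path.home() / ".local" / "share" / "tts", name)`, where `name` is whatever the model's directory is actually called there. Renaming instead of deleting keeps the old model available if the new download misbehaves.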

aedocw commented 9 months ago

Yeah, I just removed the model from .local/share/tts and then it reloaded it. I expected it would have pulled down the latest model on its own. I'll fetch from https://huggingface.co/coqui/XTTS-v2 later today and play with it. I'll also see if I can figure out why it didn't automatically fetch the newest model version, as we'll have to write instructions for folks if they have to take some manual steps.

danielw97 commented 9 months ago

That's interesting. At least for me, it downloaded the correct model after I cleared that directory, although I'll do some checking.

aedocw commented 9 months ago

I have been able to use the newly released studio speakers (58!!) and made samples of them. They sound pretty amazing, will be great to add them as simplified options.

aedocw commented 9 months ago

Finally had a chance to add this in. Check README for details.

danielw97 commented 9 months ago

Thanks for merging this, looking forward to checking it out. Unfortunately, I'm receiving the following error when using a model I've finetuned after pulling the latest changes:

Error: Model is not multi-speaker but speaker is provided. ... Retrying (0 retries left)
Followed by this traceback:

Traceback (most recent call last):
  File "C:\Users\daniel\Documents\epub2tts\epub2tts.py", line 639, in <module>
    main()
  File "C:\Users\daniel\Documents\epub2tts\epub2tts.py", line 628, in main
    mybook.read_book(
  File "C:\Users\daniel\Documents\epub2tts\epub2tts.py", line 419, in read_book
    str(ratio)
        ^^^^^
UnboundLocalError: cannot access local variable 'ratio' where it is not associated with a value
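An UnboundLocalError like this typically appears when a variable is only assigned inside a branch or try block that never completes, and is then read afterwards. The epub2tts source is not shown in this thread, so the following is only a hypothetical sketch of a retry loop that avoids the pattern; the function and parameter names are invented for illustration.

```python
def synthesize_with_retries(synthesize, retries: int = 2) -> str:
    """Call synthesize() up to `retries` times and report the resulting ratio."""
    # Bind the name up front so it exists even if every attempt fails.
    ratio = None
    for attempt in range(retries):
        try:
            ratio = synthesize()
            break
        except Exception as exc:
            left = retries - attempt - 1
            print(f"Error: {exc} ... Retrying ({left} retries left)")
    if ratio is None:
        # Fail with an explicit message instead of an UnboundLocalError.
        raise RuntimeError("synthesis failed on every attempt")
    return "ratio: " + str(ratio)
```

Without the initial `ratio = None`, a run in which every attempt raises would reach `str(ratio)` with the name never assigned, which matches the shape of the error above.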

aedocw commented 9 months ago

Hah, of course I did not properly test :( I get the same thing. I know what I did wrong; I'll fix it, test more carefully, and merge shortly.

aedocw / epub2tts

using newly merged xtts v2 speakers #115