Closed (thoraxe closed this issue 11 months ago)
This may be related to https://github.com/DigitalPhonetics/IMS-Toucan/issues/85 except that it is happening for me when attempting to fine-tune.
In TextFrontend.py, the offending code seems to be here:
    if language == "en":
        self.g2p_lang = "en"
        self.expand_abbreviations = english_text_expansion
        if not silent:
            print("Created an English Text-Frontend")
Note that the original code uses en-us, but even when changed to en it still reports "not supported by backend" (although en actually isn't supported by the back-end).
phonemize appears to be working on the system in question:
echo "hello world" | phonemize -l en-us -b espeak
həloʊ wɜːld
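The CLI check above only covers the shell. A minimal sketch of the same check from inside Python, assuming the phonemizer package is importable in the training environment, is shown here. The helper name espeak_supports is hypothetical, and the broad exception handler is an assumption made so the sketch degrades gracefully when phonemizer or the espeak library is missing.

```python
def espeak_supports(language):
    """Return True/False if espeak lists `language`, or None if it cannot be queried."""
    try:
        # Imported lazily so the sketch still runs where phonemizer is absent.
        from phonemizer.backend import EspeakBackend

        # supported_languages() returns a mapping of language code -> voice name.
        return language in EspeakBackend.supported_languages()
    except Exception as err:  # phonemizer not installed, or espeak library not found
        print(f"could not query espeak: {err}")
        return None


# Compare the two codes discussed in this issue.
for code in ("en", "en-us"):
    print(code, espeak_supports(code))
```

If this prints None while the phonemize command works in the shell, the Python environment used for training is probably not seeing the same espeak installation.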
FWIW I did not have this problem with v2.5, but the fine-tuning output from 2.5 was unusable with this dataset.
OK, I did some further digging. I went to my other system, which is Ubuntu and WSL, and I was able to run things just fine with commit e41e266ccacf282a9854d562f9e3d604f1cf245b on that system. So I have been able to get training to start on two different datasets on that system.
I went to the new system and checked out that commit, reverted changes for Portaspeech, and tried to run the fine-tuning example. At first it didn't work, but then I blew away both the corpora and models folders, re-downloaded the models, and tried again, and it seems to be working at the moment with that commit.
I'm going to let things run for a while and I'll see how it goes and report back.
OK, I'm not really sure what was going on, but this problem seems to have resolved itself. However, I did upgrade to torch/torchaudio 2.x.
I'm going to leave this open for now and try again with a completely fresh environment sometime soon.
I finally came back to this and was able to do some more testing. It appears that the torch 1.x line does not work in environments with newer CUDA, or at least that's my hypothesis. Installing the latest torch packages seemed to do the trick:
pip install torch torchvision torchaudio
This resulted in:
alias-free-torch 0.0.6
torch 2.0.1
torch-complex 0.4.3
torchaudio 2.0.2
torchvision 0.15.2
Training then runs as expected.
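For anyone comparing environments, a rough sketch for printing the same version information from Python follows. The helper names installed_versions and cuda_report are made up for illustration. It only reports what is installed and whether torch can see CUDA; it does not confirm the torch 1.x hypothesis.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if not installed."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def cuda_report():
    """Return torch's view of CUDA, or None if torch cannot be imported."""
    try:
        import torch
    except ImportError:
        return None
    return {
        "torch": torch.__version__,
        "built_for_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


print(installed_versions(["torch", "torchaudio", "torchvision"]))
print(cuda_report())
```

A CUDA-enabled build that reports cuda_available as False would point to a mismatch between the torch build and the local driver or CUDA setup.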
Cool, thanks for your update!
ToucanTTS v2.5 was creating very poor audio after fine-tuning on a new dataset I created. I had luck with v2.4, so I decided to try to revert. Now I am getting the following error: