Closed (rinechran closed this issue 5 months ago)
Hi @rinechran
When did you last update? There was a code update 5 days ago that should resolve any Japanese training issues. Could you confirm you are up to date?
Code update https://github.com/erew123/alltalk_tts/commit/86c64fc9ba0f223df5844efed6e03d803bfc55d7
How to update https://github.com/erew123/alltalk_tts?tab=readme-ov-file#-updating
Thanks
I've seen the issue, and I've also confirmed that the version I'm using is one that includes the fix. I'm currently using the AllTalk v1.9 release version.
@rinechran I've literally just updated the Finetuning again about 5 minutes ago. Would you like to update to that version with a git pull in the alltalk_tts directory and give it a try?
I ran the finetuning today after your earlier message and selected Japanese and didn't have any issue with Step 2 of Finetuning. So I am puzzled what could be occurring on your system.
If you still have a failure after the update, would you be able to send me a diagnostic log file please? https://github.com/erew123/alltalk_tts?tab=readme-ov-file#-how-to-make-a-diagnostics-report-file
@rinechran Just to confirm, I have kicked off a "ja" Japanese Finetuning again and it's processing through as it should. So I would definitely need a diagnostics file to try to figure out what's going on (after you have updated and tested).
Thanks
It occurred in the pre-processing process, so it doesn't go to step 2. If you need it, I can give you the wav file that I preprocessed. I'll try to update it again.
@rinechran Please could you send me the actual diagnostics log file as I need to look at your python environment
https://github.com/erew123/alltalk_tts?tab=readme-ov-file#-how-to-make-a-diagnostics-report-file
You can start atsetup.bat; the option to create the log file is in the menus there.
Thanks
Hi @rinechran
Please download tokenizer.py from here https://drive.google.com/file/d/13pg9IEJPWuW-_cHKK_T-AyKq2--ERp7q/view?usp=sharing and save it over the top of your existing tokenizer file.
You will need to restart finetuning.py, which should resolve that base issue.
Obviously you may have other issues to resolve and the instructions are on the menus or in the general documentation.
Thanks
Step 1's problem has been resolved, but I found an error in step 2.
language is: ja
Traceback (most recent call last):
  File "/mnt/ssd/project/alltalk/finetune.py", line 1348, in train_model
    config_path, original_xtts_checkpoint, vocab_file, exp_path, speaker_wav = train_gpt(language, num_epochs, batch_size, grad_acumm, train_csv, eval_csv, output_path=str(output_path), max_audio_length=max_audio_length)
  File "/mnt/ssd/project/alltalk/finetune.py", line 562, in train_gpt
    model = GPTTrainer.init_from_config(config)
  File "/mnt/ssd/project/alltalk/venv-tts/lib/python3.10/site-packages/TTS/tts/layers/xtts/trainer/gpt_trainer.py", line 504, in init_from_config
    return GPTTrainer(config)
  File "/mnt/ssd/project/alltalk/venv-tts/lib/python3.10/site-packages/TTS/tts/layers/xtts/trainer/gpt_trainer.py", line 79, in __init__
    self.xtts.tokenizer = VoiceBpeTokenizer(self.args.tokenizer_file)
  File "/mnt/ssd/project/alltalk/venv-tts/lib/python3.10/site-packages/TTS/tts/layers/xtts/tokenizer.py", line 622, in __init__
    self.tokenizer = Tokenizer.from_file(vocab_file)
Exception: No such file or directory (os error 2)
metadata_train.csv
Maybe there's a problem with training tokenization?
Hi @rinechran
I've just tested this here and it went through fine.
Did you resolve the issues on the Pre-flight checklist? Missing files will cause an "Exception: No such file or directory (os error 2)".
In your earlier image, neither Cublas nor the Base model was being detected. So please check you have resolved the issues on the Pre-flight check. The instructions are on the tabs as shown, or you can find instructions on the main page https://github.com/erew123/alltalk_tts?tab=readme-ov-file#-important-requirements-cuda-118
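For anyone hitting the same "os error 2": a minimal sketch of a check you can run before starting training, to see which required files are actually missing. The function name and the example file names are hypothetical and only for illustration; this is not AllTalk's own Pre-flight check, which may look at different things.

```python
from pathlib import Path


def find_missing_files(paths):
    """Return the entries of `paths` that do not exist as regular files."""
    return [str(p) for p in map(Path, paths) if not p.is_file()]


# Hypothetical usage: substitute the tokenizer/vocab and checkpoint paths
# that your own finetune run is configured to load.
# missing = find_missing_files(["path/to/vocab.json", "path/to/model.pth"])
# if missing:
#     print("Missing files:", missing)
```

If the list comes back non-empty, those are the files to sort out before Step 2 will run, since the tokenizer load in the traceback fails as soon as its vocab file path does not exist.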
Thanks
Hi @rinechran
If you have further problems, please let me know and I will look at them if possible.
Thanks
Please generate a diagnostics report and upload the "diagnostics.log".
https://github.com/erew123/alltalk_tts/tree/main?#-how-to-make-a-diagnostics-report-file
Describe the bug
Ja language fine tuning is not possible
To Reproduce
Ja language fine tuning is not possible
Screenshots
Text/logs
  File "/mnt/ssd/project/alltalk_tts/finetune.py", line 1006, in preprocess_dataset
    train_meta, eval_meta, audio_total_size = format_audio_list(target_language=language, whisper_model=whisper_model, out_path=out_path, eval_split_number=eval_split_number, gradio_progress=progress)
  File "/mnt/ssd/project/alltalk_tts/finetune.py", line 229, in format_audio_list
    sentence = multilingual_cleaners(sentence, target_language)
  File "/mnt/ssd/project/alltalk_tts/venv-tts/lib/python3.10/site-packages/TTS/tts/layers/xtts/tokenizer.py", line 558, in multilingual_cleaners
    text = expand_numbers_multilingual(text, lang)
  File "/mnt/ssd/project/alltalk_tts/venv-tts/lib/python3.10/site-packages/TTS/tts/layers/xtts/tokenizer.py", line 538, in expand_numbers_multilingual
    text = re.sub(_ordinal_re[lang], lambda m: _expand_ordinal(m, lang), text)
KeyError: 'ja'
format_audio_list^C[FINETUNE] Received interrupt signal. Cleaning up and exiting...
Desktop (please complete the following information):
AllTalk was updated: tex
Text-generation-webUI was updated: yes
Additional context Add any other context about the problem here.
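The KeyError: 'ja' in the log above comes from indexing a per-language regex table with a language code that has no entry. A minimal sketch of how that kind of lookup can be guarded so unsupported languages pass through unchanged is below. The table ORDINAL_RE, the function expand_ordinals and its expand callback are hypothetical stand-ins, not the real contents of TTS/tts/layers/xtts/tokenizer.py, and this is not necessarily how the replacement tokenizer.py fixes it.

```python
import re

# Hypothetical stand-in for a per-language pattern table; it deliberately
# has no "ja" entry, which mirrors the situation in the traceback.
ORDINAL_RE = {
    "en": re.compile(r"([0-9]+)(st|nd|rd|th)"),
}


def expand_ordinals(text, lang, expand):
    """Expand ordinals for `lang`, or return `text` unchanged if unsupported.

    `expand` is a caller-supplied callback taking (match, lang) and
    returning the replacement string.
    """
    pattern = ORDINAL_RE.get(lang)  # .get() avoids the KeyError
    if pattern is None:
        return text
    return pattern.sub(lambda m: expand(m, lang), text)
```

With a guard like this, a call with lang="ja" returns the sentence untouched instead of raising, while languages that do have a pattern are still expanded.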