daswer123 / xtts-finetune-webui

Slightly improved version of the official XTTS fine-tuning web UI

Not able to train with 2x 3090 GPUs #27

Open · BrianStark opened this issue 3 months ago

BrianStark commented 3 months ago

Hello there.

Yesterday I updated the program to the new version, but now whenever I start training I get the error message below (in particular the last part), and I don't quite understand why it has problems running with two graphics cards.

Does anyone know how I can fix this so that I can use the program again? That would be very nice, thanks in advance.


Traceback (most recent call last):
  File "D:\AI\TTS-AI\xtts-finetune-webui\xtts_demo.py", line 370, in train_model
    speaker_xtts_path, config_path, original_xtts_checkpoint, vocab_file, exp_path, speaker_wav = train_gpt(custom_model, version, language, num_epochs, batch_size, grad_acumm, train_csv, eval_csv, output_path=output_path, max_audio_length=max_audio_length)
  File "D:\AI\TTS-AI\xtts-finetune-webui\utils\gpt_train.py", line 184, in train_gpt
    trainer = Trainer(
  File "D:\AI\TTS-AI\xtts-finetune-webui\venv\lib\site-packages\trainer\trainer.py", line 435, in __init__
    self.use_cuda, self.num_gpus = self.setup_training_environment(args=args, config=config, gpu=gpu)
  File "D:\AI\TTS-AI\xtts-finetune-webui\venv\lib\site-packages\trainer\trainer.py", line 764, in setup_training_environment
    use_cuda, num_gpus = setup_torch_training_env(
  File "D:\AI\TTS-AI\xtts-finetune-webui\venv\lib\site-packages\trainer\trainer_utils.py", line 100, in setup_torch_training_env
    raise RuntimeError(
RuntimeError: [!] 2 active GPUs. Define the target GPU by CUDA_VISIBLE_DEVICES. For multi-gpu training use TTS/bin/distribute.py.

Traceback (most recent call last):
  File "D:\AI\TTS-AI\xtts-finetune-webui\venv\lib\site-packages\gradio\queueing.py", line 527, in process_events
    response = await route_utils.call_process_api(
  File "D:\AI\TTS-AI\xtts-finetune-webui\venv\lib\site-packages\gradio\route_utils.py", line 270, in call_process_api
    output = await app.get_blocks().process_api(
  File "D:\AI\TTS-AI\xtts-finetune-webui\venv\lib\site-packages\gradio\blocks.py", line 1856, in process_api
    data = await self.postprocess_data(fn_index, result["prediction"], state)
  File "D:\AI\TTS-AI\xtts-finetune-webui\venv\lib\site-packages\gradio\blocks.py", line 1634, in postprocess_data
    self.validate_outputs(fn_index, predictions)  # type: ignore
  File "D:\AI\TTS-AI\xtts-finetune-webui\venv\lib\site-packages\gradio\blocks.py", line 1610, in validate_outputs
    raise ValueError(
ValueError: An event handler (train_model) didn't receive enough output values (needed: 6, received: 5).
Wanted outputs:
    [<gradio.components.label.Label object at 0x000001CC018DBC70>, <gradio.components.textbox.Textbox object at 0x000001CC018D9000>, <gradio.components.textbox.Textbox object at 0x000001CC018D8880>, <gradio.components.textbox.Textbox object at 0x000001CC018DBCD0>, <gradio.components.textbox.Textbox object at 0x000001CC018D8370>, <gradio.components.textbox.Textbox object at 0x000001CC018D9E40>]
Received outputs:
    ["The training was interrupted due an error !! Please check the console to check the full error message!
    Error summary: (same Trainer traceback as above, ending in) RuntimeError: [!] 2 active GPUs. Define the target GPU by CUDA_VISIBLE_DEVICES. For multi-gpu training use TTS/bin/distribute.py.", "", "", "", ""]
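
For context, the failing check is simply the number of CUDA devices visible to the process: the trainer refuses to start when it sees more than one. A quick diagnostic sketch (not part of the original report; assumes it is run inside the same venv):

```python
# If this prints 2, the trainer will keep raising the "[!] 2 active GPUs"
# RuntimeError above until CUDA_VISIBLE_DEVICES limits the process to one card.
import torch

print(torch.cuda.device_count())
for i in range(torch.cuda.device_count()):
    print(i, torch.cuda.get_device_name(i))
```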

TheMaxik commented 3 months ago

AFAIK there is no multi-GPU support, but you can set CUDA_VISIBLE_DEVICES=1 as an environment variable to tell CUDA which GPU to use (0 selects the first GPU, 1 the second). Since you cannot retrain a model and your training theoretically did start, you need to delete the "run" and "ready" folders in your "finetune_models" folder. Keep the dataset. (One way to do both is sketched below.)

EDIT: you can retrain a model. But in your case the training failed, so you need to delete the run and/or ready folders.
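
A minimal sketch of both suggestions, assuming the folder layout described above (finetune_models/run, finetune_models/ready and finetune_models/dataset next to xtts_demo.py); the helper name clear_failed_run and the launcher approach are my own, not part of the repo:

```python
# Hypothetical launch helper: clears the leftovers of a failed run, then starts
# the web UI with only one GPU visible to CUDA.
import os
import shutil
import subprocess
import sys
from pathlib import Path

FINETUNE_DIR = Path("finetune_models")  # assumed to sit next to xtts_demo.py


def clear_failed_run(base: Path = FINETUNE_DIR) -> None:
    """Remove the 'run' and 'ready' folders left by an interrupted training,
    keeping the 'dataset' folder so the prepared audio is not lost."""
    for name in ("run", "ready"):
        target = base / name
        if target.is_dir():
            shutil.rmtree(target)
            print(f"removed {target}")


if __name__ == "__main__":
    clear_failed_run()
    # CUDA_VISIBLE_DEVICES has to be set before torch initialises CUDA, so pass
    # it in the environment of the process that runs the web UI.
    # "0" exposes only the first GPU, "1" only the second.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES="0")
    subprocess.run([sys.executable, "xtts_demo.py"], env=env, check=True)
```

Alternatively, set the variable in the shell before launching, e.g. on Windows cmd: `set CUDA_VISIBLE_DEVICES=0` followed by `python xtts_demo.py`.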