anhnh2002 / XTTSv2-Finetuning-for-New-Languages


No .pth file generated from train_gpt_xtts.py #25

Open EniddeallA opened 10 hours ago

EniddeallA commented 10 hours ago

Hello, I have followed all the steps and got training to run, but train_gpt_xtts.py does not generate any model file inside the checkpoints folder. This is what I run:

CUDA_VISIBLE_DEVICES=0 python train_gpt_xtts.py \
--output_path checkpoints/ \
--metadatas datasets/metadata_train.csv,datasets/metadata_eval.csv,ar \
--num_epochs 5 \
--batch_size 4 \
--grad_acumm 4 \
--max_text_length 400 \
--max_audio_length 330750 \
--weight_decay 1e-2 \
--lr 5e-6 \
--save_step 50000

This is the checkpoints directory that was generated: [screenshot]. These are the last lines of trainer_0_log.txt:


  --> EVAL PERFORMANCE
     | > avg_loader_time: 0.29085401745585643 (-0.05632297714035239)
     | > avg_loss_text_ce: 0.05398880361349552 (-0.0015507079177088517)
     | > avg_loss_mel_ce: 3.4326020618537805 (-0.02727066696464231)
     | > avg_loss: 3.4865908715632057 (-0.028821384751951395)

Checkpoint saved in dir: checkpoints/GPT_XTTS_FT-October-30-2024_07+47PM-f1662ee

Any help would be appreciated, and thank you.

EniddeallA commented 9 hours ago

I don't know if this could help but I also get these warnings when I launch train_gpt_xtts.py:

/mnt/d/XTTSv2-Finetuning-for-New-Languages/TTS/utils/io.py:54: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  return torch.load(f, map_location=map_location, **kwargs)
 > Loading checkpoint with 1779 additional tokens.
/mnt/d/XTTSv2-Finetuning-for-New-Languages/TTS/tts/layers/tortoise/arch_utils.py:336: FutureWarning: (same `weights_only=False` warning as above)
  self.mel_norms = torch.load(f)
/mnt/d/XTTSv2-Finetuning-for-New-Languages/TTS/tts/layers/xtts/trainer/gpt_trainer.py:186: FutureWarning: (same `weights_only=False` warning as above)
  dvae_checkpoint = torch.load(self.args.dvae_checkpoint, map_location=torch.device("cpu"))
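These FutureWarnings come from PyTorch's planned change to the default of `torch.load` and do not stop training; they are unrelated to the missing .pth file. A minimal sketch of the pattern PyTorch recommends in the warning (the file name here is just for illustration):

```python
import torch

# weights_only=True restricts unpickling to tensors and other safe types,
# which silences the FutureWarning for files you save yourself.
tensor = torch.arange(3)
torch.save(tensor, "example.pt")
loaded = torch.load("example.pt", weights_only=True)
print(loaded.tolist())  # [0, 1, 2]
```

The warnings in the repository code would need the same flag added at each `torch.load` call site.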
anhnh2002 commented 8 hours ago

It may be that training finishes before ever reaching a save step: with only 5 epochs and a small dataset, the total number of training steps can stay well below save_step=50000, so no intermediate .pth checkpoint is written. Try decreasing the save_step parameter, e.g., 500 or 5000.
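A quick back-of-the-envelope check for this: a rough sketch, assuming save_step counts training batches and using a hypothetical dataset size (the real value is the number of rows in metadata_train.csv):

```python
import math

# Hypothetical numbers for illustration; substitute your own dataset size.
num_train_samples = 1000   # rows in metadata_train.csv (assumption)
batch_size = 4
num_epochs = 5
save_step = 50000

steps_per_epoch = math.ceil(num_train_samples / batch_size)
total_steps = steps_per_epoch * num_epochs
print(total_steps)                    # 1250
print(total_steps < save_step)        # True -> no mid-training .pth is saved
```

If total_steps is below save_step, only the end-of-run artifacts (if any) land in the output directory, which would match the behavior reported above.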