daswer123 / xtts-finetune-tests

In this repository I will be running various experiments on fine-tuning different parts of XTTS
MIT License

RuntimeError: The size of tensor a (1687) must match the size of tensor b (1688) at non-singleton dimension 2 #2

Open KevinWang676 opened 2 months ago

KevinWang676 commented 2 months ago

Hi, thanks for sharing the fine-tuning code! When I run the training command !python train_dvae.py --dataset_path processed_dataset --language zh, I get the following error. Could you help me resolve the issue? Thanks!

dvae.pth already exists. Skipping download.
mel_stats.pth already exists. Skipping download.
Computing mel-spectrograms: 100% 155/155 [00:02<00:00, 56.19it/s]
Computing mel-spectrograms: 100% 39/39 [00:00<00:00, 53.21it/s]
 > Filtering invalid eval samples!!
 > Total eval samples after filtering: 39
 > Sampling by language: dict_keys(['zh'])
Epoch 1/20:   0% 0/16 [00:00<?, ?it/s]/usr/lib/python3.10/multiprocessing/popen_fork.py:66: RuntimeWarning: os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock.
  self.pid = os.fork()
Epoch 1/20:   6% 1/16 [00:01<00:21,  1.46s/it, commit_loss=0.238, global_step=1, loss=0.35, recon_loss=0.112, step=1/16]/usr/local/lib/python3.10/dist-packages/TTS/tts/layers/xtts/dvae.py:381: UserWarning: Using a target size (torch.Size([10, 80, 1688])) that is different to the input size (torch.Size([10, 80, 1687])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
  recon_loss = self.loss_fn(img, out, reduction="none")
Epoch 1/20:   6% 1/16 [00:01<00:22,  1.51s/it, commit_loss=0.238, global_step=1, loss=0.35, recon_loss=0.112, step=1/16]
Traceback (most recent call last):
  File "/content/xtts-finetune-tests/dvae-finetune/train_dvae.py", line 272, in <module>
    train_dvae(args)
  File "/content/xtts-finetune-tests/dvae-finetune/train_dvae.py", line 160, in train_dvae
    recon_loss, commitment_loss, out = dvae(batch['mel'].cuda())
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/TTS/tts/layers/xtts/dvae.py", line 381, in forward
    recon_loss = self.loss_fn(img, out, reduction="none")
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py", line 3365, in mse_loss
    expanded_input, expanded_target = torch.broadcast_tensors(input, target)
  File "/usr/local/lib/python3.10/dist-packages/torch/functional.py", line 76, in broadcast_tensors
    return _VF.broadcast_tensors(tensors)  # type: ignore[attr-defined]
RuntimeError: The size of tensor a (1687) must match the size of tensor b (1688) at non-singleton dimension 2

KevinWang676 commented 2 months ago

@daswer123 Could you take a look at this issue? I keep getting the same RuntimeError. Thanks!
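
A possible reading of the log, not confirmed by the maintainer: the warning reports an input of 1687 frames and a reconstruction of 1688 frames, which is what you would see if the DVAE downsamples the time axis and then upsamples it back, rounding any length that is not a multiple of the overall factor up to the next multiple. The sketch below only illustrates that arithmetic. The factor of 4 is an assumption chosen because it fits these numbers, and the helper names `roundtrip_length` and `trimmed_length` are hypothetical, not part of train_dvae.py or the TTS library.

```python
import math

# Assumption: the model shrinks the time axis by this overall factor and then
# expands it again. 4 is a guess that fits the log, not a confirmed value.
DOWNSAMPLE_FACTOR = 4


def roundtrip_length(num_frames: int, factor: int = DOWNSAMPLE_FACTOR) -> int:
    """Length after rounding num_frames up to the next multiple of factor."""
    return math.ceil(num_frames / factor) * factor


def trimmed_length(num_frames: int, factor: int = DOWNSAMPLE_FACTOR) -> int:
    """Largest multiple of factor that does not exceed num_frames."""
    return num_frames - (num_frames % factor)


# With a mel tensor of shape (batch, n_mels, frames), the crop could be
# applied just before the forward pass, for example:
#   mel = batch["mel"]
#   mel = mel[..., : trimmed_length(mel.shape[-1])]
#   recon_loss, commitment_loss, out = dvae(mel.cuda())
```

Under that assumption, `roundtrip_length(1687)` gives 1688, matching the mismatch in the traceback, and `trimmed_length(1687)` gives 1684, a length that comes back unchanged. Cropping drops at most three frames per batch; padding the mel up to the next multiple would be the alternative if losing frames is a concern.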