Hi, thanks for sharing the fine-tuning code! When I run the training command !python train_dvae.py --dataset_path processed_dataset --language zh, I get the following error. Could you help me resolve the issue? Thanks!
dvae.pth already exists. Skipping download.
mel_stats.pth already exists. Skipping download.
Computing mel-spectrograms: 100% 155/155 [00:02<00:00, 56.19it/s]
Computing mel-spectrograms: 100% 39/39 [00:00<00:00, 53.21it/s]
> Filtering invalid eval samples!!
> Total eval samples after filtering: 39
> Sampling by language: dict_keys(['zh'])
Epoch 1/20: 0% 0/16 [00:00<?, ?it/s]/usr/lib/python3.10/multiprocessing/popen_fork.py:66: RuntimeWarning: os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock.
self.pid = os.fork()
Epoch 1/20: 6% 1/16 [00:01<00:21, 1.46s/it, commit_loss=0.238, global_step=1, loss=0.35, recon_loss=0.112, step=1/16]/usr/local/lib/python3.10/dist-packages/TTS/tts/layers/xtts/dvae.py:381: UserWarning: Using a target size (torch.Size([10, 80, 1688])) that is different to the input size (torch.Size([10, 80, 1687])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
recon_loss = self.loss_fn(img, out, reduction="none")
Epoch 1/20: 6% 1/16 [00:01<00:22, 1.51s/it, commit_loss=0.238, global_step=1, loss=0.35, recon_loss=0.112, step=1/16]
Traceback (most recent call last):
File "/content/xtts-finetune-tests/dvae-finetune/train_dvae.py", line 272, in <module>
train_dvae(args)
File "/content/xtts-finetune-tests/dvae-finetune/train_dvae.py", line 160, in train_dvae
recon_loss, commitment_loss, out = dvae(batch['mel'].cuda())
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/TTS/tts/layers/xtts/dvae.py", line 381, in forward
recon_loss = self.loss_fn(img, out, reduction="none")
File "/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py", line 3365, in mse_loss
expanded_input, expanded_target = torch.broadcast_tensors(input, target)
File "/usr/local/lib/python3.10/dist-packages/torch/functional.py", line 76, in broadcast_tensors
return _VF.broadcast_tensors(tensors) # type: ignore[attr-defined]
RuntimeError: The size of tensor a (1687) must match the size of tensor b (1688) at non-singleton dimension 2
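A possible explanation for the 1687 vs. 1688 mismatch, as a rough sketch only: if the DVAE encoder halves the time axis twice with convolutions of kernel 3, stride 2, padding 1, and the decoder then doubles it twice, any mel length that is not a multiple of 4 comes back slightly longer than the input. Those layer settings are an assumption, not checked against the TTS source, and the helper names below are hypothetical. The arithmetic does reproduce the sizes in the traceback:

```python
# Sketch under assumptions: two stride-2 convs (kernel 3, padding 1) in the
# encoder, two x2 upsampling stages in the decoder. Not verified against TTS.

def conv_out_len(length, kernel=3, stride=2, padding=1):
    """Output length of a 1-D convolution (standard formula, dilation 1)."""
    return (length + 2 * padding - kernel) // stride + 1


def roundtrip_len(length, num_layers=2):
    """Length after the assumed encoder downsampling and decoder upsampling."""
    n = length
    for _ in range(num_layers):
        n = conv_out_len(n)
    return n * 2 ** num_layers


def crop_to_multiple(length, factor=4):
    """Largest length <= `length` that is divisible by `factor`."""
    return length - length % factor


print(roundtrip_len(1687))                    # 1688, matches the traceback
print(crop_to_multiple(1687))                 # 1684
print(roundtrip_len(crop_to_multiple(1687)))  # 1684, input and output agree
```

If that assumption holds, cropping (or padding) each mel batch along its last dimension to a multiple of 4 before calling dvae(...) might avoid the size error, but this is untested here.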