keonlee9420 / Comprehensive-E2E-TTS

A non-autoregressive end-to-end text-to-speech (text-to-wav) model supporting a family of SOTA unsupervised duration modeling methods. This project grows with the research community, aiming to achieve the ultimate E2E-TTS.
144 stars, 19 forks

Variance Loss RuntimeError #2

Closed: roedoejet closed this issue 2 years ago

roedoejet commented 2 years ago

Hi there,

I'm trying to train a model with LJ data, but at step 50331 I get:

  File "train.py", line 339, in <module>
    train(0, args, configs, batch_size, num_gpus)
  File "train.py", line 196, in train
    ) = Loss.variance_loss(batch, output, step=step)
  File "/gpfs/fs2c/nrc/ict/portage/u/tts/code/Comprehensive-E2E-TTS-new/model/loss.py", line 206, in variance_loss
    ctc_loss = self.sum_loss(attn_logprob=attn_logprob, in_lens=src_lens, out_lens=mel_lens)
  File "/space/partner/nrc/work/ict/portage/u/tts/opt/miniconda3/envs/jets/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/gpfs/fs2c/nrc/ict/portage/u/tts/code/Comprehensive-E2E-TTS-new/model/loss.py", line 249, in forward
    target_lengths=key_lens[bid : bid + 1],
  File "/space/partner/nrc/work/ict/portage/u/tts/opt/miniconda3/envs/jets/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/space/partner/nrc/work/ict/portage/u/tts/opt/miniconda3/envs/jets/lib/python3.7/site-packages/torch/nn/modules/loss.py", line 1502, in forward
    self.zero_infinity)
  File "/space/partner/nrc/work/ict/portage/u/tts/opt/miniconda3/envs/jets/lib/python3.7/site-packages/torch/nn/functional.py", line 2201, in ctc_loss
    zero_infinity)
RuntimeError: Expected input_lengths to have value at most 693, but got value 694 (while checking arguments for ctc_loss_gpu)

I'm using the default configuration (https://github.com/keonlee9420/Comprehensive-E2E-TTS/tree/main/config/LJSpeech), except that I reduced the batch size to 10 to fit my GPU. Did this error only show up now because of var_start_steps? Any advice would be appreciated.

keonlee9420 commented 2 years ago

Hi @roedoejet, thanks for the report. Yes, it was caused by a length mismatch in the variances. I've just fixed and updated the code, so please check out the latest checkpoint.
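
For anyone hitting the same RuntimeError: the message says that one entry passed as input_lengths (694) is larger than the number of frames actually present in the log-probability input (693). Below is a minimal sketch of how such an overrun could be detected and capped before the loss call. It is not the fix that was applied in the repository, and the helper names find_length_overruns and clamp_lengths, as well as the sample batch values, are hypothetical. With tensors, the equivalent cap would be something like torch.clamp(mel_lens, max=num_frames).

```python
from typing import List, Sequence


def find_length_overruns(num_frames: int, lengths: Sequence[int]) -> List[int]:
    """Return batch indices whose claimed length exceeds the frames available."""
    return [i for i, n in enumerate(lengths) if n > num_frames]


def clamp_lengths(num_frames: int, lengths: Sequence[int]) -> List[int]:
    """Cap every length at the number of frames actually present."""
    return [min(n, num_frames) for n in lengths]


# Hypothetical batch: the attention log-probabilities hold 693 frames,
# but one recorded mel length claims 694, as in the error above.
num_frames = 693
mel_lens = [512, 694, 693]

overruns = find_length_overruns(num_frames, mel_lens)
safe_lens = clamp_lengths(num_frames, mel_lens)
print(overruns)   # [1]
print(safe_lens)  # [512, 693, 693]
```

Capping the lengths only silences the check. If the overrun comes from features that were extracted with inconsistent frame counts, aligning them during preprocessing is the more robust route.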

keonlee9420 commented 2 years ago

Closing due to inactivity.