Variance Loss RuntimeError

roedoejet commented 2 years ago

Hi there,

I'm trying to train a model on LJSpeech data, but at step 50331 I get the following error:

  File "train.py", line 339, in <module>
    train(0, args, configs, batch_size, num_gpus)
  File "train.py", line 196, in train
    ) = Loss.variance_loss(batch, output, step=step)
  File "/gpfs/fs2c/nrc/ict/portage/u/tts/code/Comprehensive-E2E-TTS-new/model/loss.py", line 206, in variance_loss
    ctc_loss = self.sum_loss(attn_logprob=attn_logprob, in_lens=src_lens, out_lens=mel_lens)
  File "/space/partner/nrc/work/ict/portage/u/tts/opt/miniconda3/envs/jets/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/gpfs/fs2c/nrc/ict/portage/u/tts/code/Comprehensive-E2E-TTS-new/model/loss.py", line 249, in forward
    target_lengths=key_lens[bid : bid + 1],
  File "/space/partner/nrc/work/ict/portage/u/tts/opt/miniconda3/envs/jets/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/space/partner/nrc/work/ict/portage/u/tts/opt/miniconda3/envs/jets/lib/python3.7/site-packages/torch/nn/modules/loss.py", line 1502, in forward
    self.zero_infinity)
  File "/space/partner/nrc/work/ict/portage/u/tts/opt/miniconda3/envs/jets/lib/python3.7/site-packages/torch/nn/functional.py", line 2201, in ctc_loss
    zero_infinity)
RuntimeError: Expected input_lengths to have value at most 693, but got value 694 (while checking arguments for ctc_loss_gpu)

I'm using the default configuration (https://github.com/keonlee9420/Comprehensive-E2E-TTS/tree/main/config/LJSpeech), except that I reduced the batch size to 10 to fit my GPU. Did this only show up now because of var_start_steps? Any advice would be appreciated.

keonlee9420 commented 2 years ago

Hi @roedoejet, thanks for the report. Yes, it was caused by a length mismatch in the variances. I just fixed it and updated the code, so please check out the latest checkpoint.
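
For anyone hitting the same error: the message says that one entry of input_lengths (694) is larger than the time dimension of the log-probabilities passed to the CTC loss (693). Below is a minimal sketch of a pre-check that could be run before the loss call to find the offending items. It is not the fix that was applied in the repository; the helper names find_length_overruns and clamp_lengths are hypothetical, and the values are only taken from the error message above.

```python
def find_length_overruns(max_frames, input_lengths):
    """Return (batch_index, length) pairs whose length exceeds max_frames.

    max_frames stands for the time dimension of the log-prob tensor
    given to the CTC loss (assumption: 693 in the report above).
    """
    return [(i, n) for i, n in enumerate(input_lengths) if n > max_frames]


def clamp_lengths(max_frames, input_lengths):
    """Workaround only: cap every length at max_frames.

    This silences the argument check but hides the underlying
    mismatch, so it is no substitute for fixing the data pipeline.
    """
    return [min(n, max_frames) for n in input_lengths]


# Hypothetical batch: the second item is one frame too long.
max_frames = 693
mel_lens = [512, 694, 693]

overruns = find_length_overruns(max_frames, mel_lens)
print(overruns)  # [(1, 694)]
print(clamp_lengths(max_frames, mel_lens))  # [512, 693, 693]
```

Logging the overruns together with the utterance IDs of the batch would show which samples have inconsistent lengths, which is usually more useful than clamping.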

keonlee9420 commented 2 years ago

Closing due to inactivity.
