coqui-ai / TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
http://coqui.ai
Mozilla Public License 2.0

[Bug] RuntimeError: shape '[64, 31, -1]' is invalid for input of size 8064 #3798

Open lyjgo opened 5 days ago

lyjgo commented 5 days ago

Describe the bug

I get an error when I run train_tacotron_ddc.py in TTS/recipes/ljspeech/tacotron2-DDC with the default config. The error and the config are as follows:

ERROR 6208dece6097660c8fa1dc0b47c2daa

CONFIG

    audio_config = BaseAudioConfig(
        sample_rate=22050,
        do_trim_silence=True,
        trim_db=60.0,
        signal_norm=False,
        mel_fmin=0.0,
        mel_fmax=8000,
        spec_gain=1.0,
        log_func="np.log",
        ref_level_db=20,
        preemphasis=0.0,
    )

    config = Tacotron2Config(  # This is the config that is saved for the future use
        audio=audio_config,
        batch_size=64,
        eval_batch_size=16,
        num_loader_workers=4,
        num_eval_loader_workers=4,
        run_eval=True,
        test_delay_epochs=-1,
        r=6,
        gradual_training=[[0, 6, 64], [10000, 4, 32], [50000, 3, 32], [100000, 2, 32]],
        double_decoder_consistency=True,
        epochs=1000,
        text_cleaner="phoneme_cleaners",
        use_phonemes=True,
        phoneme_language="en-us",
        phoneme_cache_path=os.path.join(output_path, "phoneme_cache"),
        precompute_num_workers=8,
        print_step=25,
        print_eval=True,
        mixed_precision=False,
        output_path=output_path,
        datasets=[dataset_config],
    )

Is there anything I can do to solve this problem? Thanks
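
For reference, the arithmetic behind the error message in the title can be checked without running the model. This is only a minimal sketch: the helper name `infer_free_dim` is hypothetical (it is not part of PyTorch or TTS), and it shows only why 8064 elements cannot be viewed as [64, 31, -1]. It does not identify which tensor in the model is involved or why its size is what it is.

```python
from math import prod


def infer_free_dim(numel, shape):
    """Return the size the single -1 entry in `shape` would take,
    or None if `numel` elements cannot fill that shape.

    Hypothetical helper for illustration; not a PyTorch or TTS function.
    """
    if shape.count(-1) != 1:
        raise ValueError("expected exactly one -1 in shape")
    # Product of all the explicitly given dimensions.
    known = prod(d for d in shape if d != -1)
    if known == 0 or numel % known != 0:
        return None
    return numel // known


# Numbers taken from the error message in the issue title.
print(infer_free_dim(8064, [64, 31, -1]))  # None: 64 * 31 = 1984 does not divide 8064
print(infer_free_dim(8064, [64, -1]))      # 126 values per batch item
```

So each of the 64 batch items carries 126 values, and 126 is not a multiple of 31, which is why the reshape is rejected.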

To Reproduce

  1. Set up a Python environment locally
  2. Clone the repo and install it locally via pip install -e .
  3. Download the LJSpeech dataset
  4. Run TTS/recipes/ljspeech/tacotron2-DDC/train_tacotron_ddc.py

Expected behavior

No response

Logs

No response

Environment

"Packages": {
        "PyTorch_debug": false,
        "PyTorch_version": "2.2.0+cu118",
        "TTS": "0.22.0",
        "numpy": "1.22.0"
    }
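
The package versions above can be reproduced on another machine with a short standard-library snippet. This is a sketch under stated assumptions: the function name `collect_versions` is hypothetical, the distribution names `torch`, `TTS`, and `numpy` are assumed to match how the packages were installed, and it is independent of any environment script the project itself may provide.

```python
from importlib.metadata import PackageNotFoundError, version


def collect_versions(packages=("torch", "TTS", "numpy")):
    """Map each distribution name to its installed version, or None if missing.

    Hypothetical helper for illustration; the default names are assumptions.
    """
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Package is not installed in the current environment.
            found[name] = None
    return found


print(collect_versions())
```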

Additional context

No response

lyjgo commented 5 days ago

Full error information: fa6ca25a66ed35bd073cbea311a4046