AI4Bharat / Indic-TTS

Text-to-Speech for languages of India
MIT License

Error when loading en+hi model #7

Open tanmayh-fg opened 1 year ago

tanmayh-fg commented 1 year ago

I am getting this error when trying to load the newly uploaded en+hi model:

Traceback (most recent call last):
  File "", line 35, in
    models[lang] = Synthesizer(
  File "Indic-TTS/env/lib/python3.8/site-packages/TTS/utils/", line 91, in __init__
    self._load_tts(tts_checkpoint, tts_config_path, use_cuda)
  File "Indic-TTS/env/lib/python3.8/site-packages/TTS/utils/", line 190, in _load_tts
    self.tts_model.load_checkpoint(self.tts_config, tts_checkpoint, eval=True)
  File "Indic-TTS/env/lib/python3.8/site-packages/TTS/tts/models/", line 828, in load_checkpoint
    self.load_state_dict(state["model"])
  File "Indic-TTS/env/lib/python3.8/site-packages/torch/nn/modules/", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for ForwardTTS:
    size mismatch for emb_g.weight: copying a param with shape torch.Size([4, 512]) from checkpoint, the shape in current model is torch.Size([2, 512]).

Any help would be appreciated. @GokulNC @ashwin-014

GokulNC commented 1 year ago

Sorry for the late response, but can you please try again after redownloading the model? It is still working for me.

adimyth commented 6 months ago

Hey @GokulNC, I am facing the same issue. Do you know the probable cause? Command and traceback:

python3 -m TTS.bin.synthesize --text "Namaste! How are you? Kal milte hai" --model_path "models/v1/hin/fastpitch/best_model.pth" --config_path "models/v1/hin/fastpitch/config.json" --vocoder_path "models/v1/hin/hifigan/best_model.pth" --vocoder_config_path "models/v1/hin/hifigan/config.json" --speaker_idx male --out_path "temp.wav" --use_cuda true
 > Using model: fast_pitch
 > Setting up Audio Processor...
 | > sample_rate:22050
 | > resample:False
 | > num_mels:80
 | > log_func:np.log
 | > min_level_db:-100
 | > frame_shift_ms:None
 | > frame_length_ms:None
 | > ref_level_db:20
 | > fft_size:1024
 | > power:1.5
 | > preemphasis:0.0
 | > griffin_lim_iters:60
 | > signal_norm:False
 | > symmetric_norm:True
 | > mel_fmin:0
 | > mel_fmax:8000.0
 | > pitch_fmin:0.0
 | > pitch_fmax:640.0
 | > spec_gain:1.0
 | > stft_pad_mode:reflect
 | > max_norm:4.0
 | > clip_norm:True
 | > do_trim_silence:True
 | > trim_db:60
 | > do_sound_norm:False
 | > do_amp_to_db_linear:True
 | > do_amp_to_db_mel:True
 | > do_rms_norm:False
 | > db_level:None
 | > stats_path:None
 | > base:2.718281828459045
 | > hop_length:256
 | > win_length:1024
 > Init speaker_embedding layer.
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/local/lib/python3.10/", line 86, in _run_code
    exec(code, run_globals)
  File "/usr/ai-inference/TTS/TTS/bin/", line 425, in <module>
  File "/usr/ai-inference/TTS/TTS/bin/", line 322, in main
    synthesizer = Synthesizer(
  File "/usr/ai-inference/TTS/TTS/utils/", line 78, in __init__
    self._load_tts(tts_checkpoint, tts_config_path, use_cuda)
  File "/usr/ai-inference/TTS/TTS/utils/", line 120, in _load_tts
    self.tts_model.load_checkpoint(self.tts_config, tts_checkpoint, eval=True)
  File "/usr/ai-inference/TTS/TTS/tts/models/", line 839, in load_checkpoint
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/", line 2153, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for ForwardTTS:
    size mismatch for emb_g.weight: copying a param with shape torch.Size([4, 512]) from checkpoint, the shape in current model is torch.Size([2, 512]).

adimyth commented 6 months ago

@tanmayh-fg were you able to solve this?

monali-2210 commented 1 month ago

I am facing the same issue.
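
For anyone debugging this: both tracebacks report that `emb_g.weight` has 4 rows in the checkpoint but only 2 rows in the model built from the config. Since the log prints "Init speaker_embedding layer." just before the failure, this is probably the speaker embedding table, which would mean the checkpoint was trained with 4 speaker entries while the loaded config or speaker file describes 2. That reading is an assumption, not something confirmed in this thread. The sketch below is a small, library-free way to list such mismatches. The helper name `find_shape_mismatches` is hypothetical, and the shapes are copied from the error message rather than read from a real checkpoint.

```python
from typing import Dict, Mapping, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    checkpoint_shapes: Mapping[str, Shape],
    model_shapes: Mapping[str, Shape],
) -> Dict[str, Tuple[Shape, Shape]]:
    """Return {param_name: (checkpoint_shape, model_shape)} where shapes differ.

    Hypothetical helper for illustration; parameters missing from the model
    are ignored, only parameters present in both mappings are compared.
    """
    mismatches: Dict[str, Tuple[Shape, Shape]] = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and tuple(ckpt_shape) != tuple(model_shape):
            mismatches[name] = (tuple(ckpt_shape), tuple(model_shape))
    return mismatches


# Shapes taken from the RuntimeError in this thread (not from a real file).
checkpoint_shapes = {"emb_g.weight": (4, 512)}
model_shapes = {"emb_g.weight": (2, 512)}

mismatches = find_shape_mismatches(checkpoint_shapes, model_shapes)
for name, (ckpt_shape, model_shape) in mismatches.items():
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```

With PyTorch installed, the two mappings could be built from the real objects, for example `{k: tuple(v.shape) for k, v in state["model"].items()}` after `state = torch.load(path, map_location="cpu")`, and the same comprehension over `model.state_dict()`. The `state["model"]` key is the one shown in the traceback above. If the first dimension differs as it does here, comparing the number of speakers listed in the downloaded config and speaker file against the checkpoint would be a reasonable next check.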