DigitalPhonetics / IMS-Toucan

Multilingual and Controllable Text-to-Speech Toolkit of the Speech and Language Technologies Group at the University of Stuttgart.
Apache License 2.0

RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same #151

Closed (thoraxe closed 2 weeks ago)

thoraxe commented 1 year ago

IMS-Toucan v2.4 (because of https://github.com/DigitalPhonetics/IMS-Toucan/issues/134), Python 3.9.16

git clone https://github.com/DigitalPhonetics/IMS-Toucan
cd IMS-Toucan
pip install -r requirements.txt
python run_model_downloader.py
# change run_text_to_file_reader.py to use Meta downloaded model
python run_text_to_file_reader.py

Result:

running on cuda
Now synthesizing: Maître Corbeau, sur un arbre perché, tenait en son bec un fromage.
Traceback (most recent call last):
  File "/opt/app-root/src/IMS-Toucan/run_text_to_file_reader.py", line 132, in <module>
    le_corbeau_et_le_renard(version="NEB_baseline", model_id="Meta", exec_device=exec_device)
  File "/opt/app-root/src/IMS-Toucan/run_text_to_file_reader.py", line 55, in le_corbeau_et_le_renard
    read_texts(model_id=model_id,
  File "/opt/app-root/src/IMS-Toucan/run_text_to_file_reader.py", line 15, in read_texts
    tts.read_to_file(text_list=sentence, file_location=filename)
  File "/opt/app-root/src/IMS-Toucan/InferenceInterfaces/PortaSpeechInterface.py", line 295, in read_to_file
    wav = self(text,
  File "/opt/app-root/lib64/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/app-root/src/IMS-Toucan/InferenceInterfaces/PortaSpeechInterface.py", line 172, in forward
    mel, durations, pitch, energy = self.phone2mel(phones,
  File "/opt/app-root/lib64/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/app-root/lib64/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/opt/app-root/src/IMS-Toucan/InferenceInterfaces/InferenceArchitectures/InferencePortaSpeech.py", line 357, in forward
    energy_predictions = self._forward(text.unsqueeze(0),
  File "/opt/app-root/src/IMS-Toucan/InferenceInterfaces/InferenceArchitectures/InferencePortaSpeech.py", line 295, in _forward
    predicted_spectrogram_after_postnet = self.run_post_glow(mel_out=before_enriched,
  File "/opt/app-root/src/IMS-Toucan/InferenceInterfaces/InferenceArchitectures/InferencePortaSpeech.py", line 396, in run_post_glow
    x_recon, _ = self.post_flow(z_post, nonpadding, g, reverse=True)
  File "/opt/app-root/lib64/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/app-root/src/IMS-Toucan/TrainingInterfaces/Text_to_Spectrogram/PortaSpeech/Glow.py", line 349, in forward
    x, logdet = f(x, x_mask, g=g, reverse=reverse)
  File "/opt/app-root/lib64/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/app-root/src/IMS-Toucan/TrainingInterfaces/Text_to_Spectrogram/PortaSpeech/Glow.py", line 123, in forward
    z = F.conv2d(x, weight)
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

Ca-ressemble-a-du-fake commented 1 year ago

Did you try https://github.com/DigitalPhonetics/IMS-Toucan/issues/88 ?

thoraxe commented 1 year ago

@Ca-ressemble-a-du-fake Yes, it looks like the change in #88 fixed it. But I'm curious why operating off a specific tag, v2.4, would suddenly break. What changed?

Flux9665 commented 11 months ago

I may have updated the tag: there was an oversight when I made the release, and I broke something else when I fixed it. The broken code was not around for long, I think, and I hope everything works by now.
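
For readers who land here with the same RuntimeError: the thread does not spell out what the change in #88 does, so the following is only a generic sketch of how this kind of device mismatch usually arises and how it can be avoided, not the IMS-Toucan code. The error at `F.conv2d(x, weight)` means the input is on the GPU while the weight is still on the CPU, which typically happens when a tensor is stored as a plain attribute rather than as a parameter or buffer, so `.to(device)` never moves it. The class `InvertibleConvSketch` and its buffer `weight_inv` are hypothetical names chosen for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class InvertibleConvSketch(nn.Module):
    """Hypothetical stand-in for a flow layer; NOT the IMS-Toucan implementation."""

    def __init__(self, channels: int = 4):
        super().__init__()
        # An orthogonal matrix is always invertible, which keeps the demo stable.
        w = torch.linalg.qr(torch.randn(channels, channels))[0]
        # Parameters follow the module when .to(device) is called.
        self.weight = nn.Parameter(w)
        # A cached inverse kept as a plain attribute would stay on the CPU.
        # Registering it as a buffer makes it move together with the module.
        # (Assumption: inference only; the cache goes stale if weight is trained.)
        self.register_buffer("weight_inv", torch.inverse(w), persistent=False)

    def forward(self, x: torch.Tensor, reverse: bool = False) -> torch.Tensor:
        weight = self.weight_inv if reverse else self.weight
        # Defensive alignment: put the weight on the input's device and dtype.
        weight = weight.to(device=x.device, dtype=x.dtype)
        # Apply the matrix as a 1x1 convolution over the channel dimension.
        return F.conv2d(x, weight.view(*weight.shape, 1, 1))


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
layer = InvertibleConvSketch().to(device)
x = torch.randn(1, 4, 8, 8, device=device)

with torch.no_grad():
    z = layer(x)                      # forward direction
    x_recon = layer(z, reverse=True)  # reverse direction uses the cached inverse
```

Either measure alone (the buffer registration or the explicit `.to(device=x.device)`) is normally enough to avoid the mismatch; the sketch shows both so that each can be seen in context.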