microsoft / SpeechT5

Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
MIT License

The size of tensor a (674) must match the size of tensor b (600) at non-singleton dimension 1 #64

Open poojitharamachandra opened 8 months ago

poojitharamachandra commented 8 months ago

Hi,

I am following the tutorial at https://huggingface.co/blog/speecht5 to convert text to speech, but I run into the error below when the text is long: "The size of tensor a (674) must match the size of tensor b (600) at non-singleton dimension 1"

Is it because the text is longer than 600 characters? How do I handle this?

The config file in microsoft t5_tts contains "max_length": 1876 and "max_speech_positions": 1876, and tokenizer_config.json contains "model_max_length": 600.

Can I change these values?

cparello commented 3 months ago

You will most likely have to synthesize the text sentence by sentence and then combine the results.
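A minimal sketch of that sentence-by-sentence approach, in case it helps. Everything named here (`split_into_chunks`, `synthesize_long`, the `synthesize_chunk` callback, and the 500-character default) is hypothetical and not part of SpeechT5 or transformers. It also assumes that keeping each chunk under a character limit keeps it under the 600-position limit from the error message, which is worth checking against the actual tokenizer output.

```python
import re


def split_into_chunks(text, max_chars=500):
    """Group sentences into chunks of at most max_chars characters.

    max_chars is an assumed safety margin below the 600 limit in the
    error message; it is not a value taken from the library.
    """
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at the last space
        # that fits (or mid-word if it contains no space at all).
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            cut = sentence.rfind(" ", 0, max_chars + 1)
            if cut <= 0:
                cut = max_chars
            chunks.append(sentence[:cut].rstrip())
            sentence = sentence[cut:].lstrip()
        if not sentence:
            continue
        # Append the sentence to the current chunk if it still fits.
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def synthesize_long(text, synthesize_chunk, max_chars=500):
    """Run a caller-supplied TTS function on each chunk and join the samples.

    synthesize_chunk is a hypothetical callable: it takes one text chunk
    and returns a sequence of audio samples for that chunk.
    """
    samples = []
    for chunk in split_into_chunks(text, max_chars):
        samples.extend(synthesize_chunk(chunk))
    return samples
```

With the tutorial's setup, `synthesize_chunk` could wrap the processor and `generate_speech` calls shown in the blog post for one chunk at a time. If those calls return tensors, concatenating them directly (for example with `torch.cat`) is probably a better fit than extending a Python list.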