A-Jacobson / tacotron2

pytorch tacotron2 https://arxiv.org/pdf/1712.05884.pdf

RuntimeError: expand(torch.cuda.FloatTensor{[12, 1]}, size=[12]): the number of sizes provided (1) must be greater or equal to the number of dimensions in the tensor (2) #3

Closed: rajanieprabha closed this issue 5 years ago

rajanieprabha commented 5 years ago

Hi, I am encountering this error in the decoding helpers file. When I googled it, the suggestion was to install PyTorch from source, which I did, but the error is still there. Can you help me?

stop_tokens[t] = stop_token
RuntimeError: expand(torch.cuda.FloatTensor{[12, 1]}, size=[12]): the number of sizes provided (1) must be greater or equal to the number of dimensions in the tensor (2)
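For reference, here is a minimal sketch of the shape mismatch; the buffer and tensor names are illustrative, not the exact ones in the decoding helpers file:

```python
import torch

batch_size = 12
max_steps = 50

# buffer collecting one stop-token prediction per decoder step
stop_tokens = torch.zeros(max_steps, batch_size)

# the prediction arrives as [12, 1] (one value per batch element, with a
# trailing dimension), while the buffer row expects a flat [12]
stop_token = torch.rand(batch_size, 1)

# stop_tokens[0] = stop_token             # RuntimeError: can't expand [12, 1] to [12]
stop_tokens[0] = stop_token.squeeze(-1)   # dropping the trailing dimension works
```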

Thanks!

rajanieprabha commented 5 years ago

Also, it doesn't throw an error when I run with batch_size 1, but my attention is not learning even after 6k steps. :/

rajanieprabha commented 5 years ago

Possible bug: this doesn't work with PyTorch 0.4.0 but works with 0.3.1.
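A quick way to confirm which PyTorch version is active in the environment:

```python
import torch
print(torch.__version__)  # e.g. '0.4.0' vs. the working '0.3.1'
```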

A-Jacobson commented 5 years ago

This hasn't been converted to PyTorch 0.4, so you are likely correct about the source of your pain.
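For anyone hitting this before a port lands, these are the usual adjustments when moving 0.3-era code to 0.4; it is an illustrative sketch, not the repo's actual diff:

```python
import torch

# 1. Variable and Tensor were merged in 0.4, so torch.autograd.Variable
#    wrappers become plain tensors with requires_grad set where needed.
x = torch.rand(12, 1, requires_grad=True)

# 2. Reductions now return 0-dim tensors, so the old loss.data[0] idiom
#    becomes loss.item().
loss = x.sum()
print(loss.item())

# 3. Assignments that relied on implicit shape matching need an explicit
#    squeeze/view, e.g. the stop-token buffer from this issue:
stop_tokens = torch.zeros(50, 12)
stop_tokens[0] = x.detach().squeeze(-1)  # [12, 1] -> [12]
```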

rajanieprabha commented 5 years ago

Also, I am running with batch size = 8 on a German dataset. It's been over 10k steps, but the alignments are still all over the place. All other hyperparameters are the same as yours. I don't understand what I am missing.

A-Jacobson commented 5 years ago

Hyperparameters aren't transferable between datasets, so you may want to try different learning rates. Since your data is German, you may also want to adjust the text preprocessing vocabulary; mine is set manually in hyperparams.py. You should also double-check your audio preprocessing steps against the sampling rate of your audio. This project was only tested on LJ Speech and was meant to be very simple; r9y9's implementation may have more dynamic preprocessing built in.
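As a rough sketch of the vocabulary and sampling-rate points, something like the following could be adapted; the names here (symbols, the character sets, the placeholder file path) are assumptions, not the actual contents of hyperparams.py:

```python
import librosa

# Hypothetical extension of a manually defined character vocabulary for
# German text; the real names and values in hyperparams.py may differ.
_pad = '_'
_eos = '~'
_letters = 'abcdefghijklmnopqrstuvwxyz'
_german_extra = 'äöüß'              # umlauts and eszett
_punctuation = " !',.?-"

symbols = [_pad, _eos] + list(_letters + _german_extra + _punctuation)
symbol_to_id = {s: i for i, s in enumerate(symbols)}

# Quick check that the audio matches the sample rate the preprocessing assumes
# ('some_german_clip.wav' is a placeholder path).
wav, sr = librosa.load('some_german_clip.wav', sr=None)  # sr=None keeps the native rate
print(sr)  # compare with the sample rate set in hyperparams.py
```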

rajanieprabha commented 5 years ago

Okay. I will check it out. Thank you :)