This repo is a VITS fine-tuning pipeline for fast speaker-adaptation TTS and many-to-many voice conversion.
Apache License 2.0
RuntimeError: expand(torch.FloatTensor{[2, 513, 203]}, size=[513, 513]): the number of sizes provided (2) must be greater or equal to the number of dimensions in the tensor (3) #467
I don't want to use Whisper auto-annotation. I decided to label the data myself to improve accuracy, but after finishing most of the work I keep getting this error. I don't know what causes it, and I hope someone can help.
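One possible reading of the shapes in the error, offered as an assumption and not confirmed anywhere in this thread: the tensor `[2, 513, 203]` looks like a spectrogram with 513 frequency bins and 203 frames plus an extra leading axis of size 2, which could be a stereo channel axis coming from a two-channel audio file. If that guess applies, a quick check of the channel count of each clip may narrow it down. The sketch below uses only Python's standard `wave` module, so it only reads uncompressed PCM WAV files; `find_non_mono` is a hypothetical helper name and is not part of this repo.

```python
import wave


def find_non_mono(paths):
    """Return (path, channel_count) for every WAV file that is not mono.

    Hypothetical helper for diagnosis only; it is not part of this repo.
    """
    flagged = []
    for path in paths:
        # wave.open only handles uncompressed PCM WAV files
        with wave.open(path, "rb") as wav:
            channels = wav.getnchannels()
        if channels != 1:
            flagged.append((path, channels))
    return flagged
```

Calling `find_non_mono` on the list of WAV paths from your own annotation file would show whether any clip has more than one channel. If some do, converting those clips to mono before preprocessing might be worth trying.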