danablend opened this issue 9 months ago
The VCTK checkpoint is provided at the following link: https://drive.google.com/drive/folders/1L8k18QdtN6ew_i-6FjoJyfQCCdvIbSUv?usp=sharing.
If you encounter any problems during use, please feel free to contact us.
Thank you very much!
Hey, just opening this up again.
Would you be able to provide the MFA results from your VCTK run if you have those?
RuntimeError: Error(s) in loading state_dict for GaussianDiffusion: size mismatch for fs.encoder.embed_tokens.weight: copying a param with shape torch.Size([76, 192]) from checkpoint, the shape in current model is torch.Size([80, 192]).
I'm guessing that the number of phonemes differs between LibriTTS and VCTK, so the shape of the encoder's embedding layer is mismatched between the checkpoint and the instantiated model when attempting to use the LibriTTS MFA results you provided in one of your previous responses :-)
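As a workaround while the phoneme dictionaries disagree, one common pattern is to drop the shape-mismatched entries from the checkpoint and load the rest non-strictly. This is a minimal sketch, not the repository's own code: `filter_matching_params` is a hypothetical helper, and NumPy arrays stand in for tensors (anything with a `.shape` works the same way).

```python
import numpy as np

def filter_matching_params(ckpt_state, model_state):
    """Keep only checkpoint params whose name and shape match the model.

    Hypothetical helper: both arguments are state_dict-like mappings
    from parameter name to an array/tensor exposing .shape.
    """
    kept, skipped = {}, []
    for name, param in ckpt_state.items():
        if name in model_state and model_state[name].shape == param.shape:
            kept[name] = param
        else:
            skipped.append(name)
    return kept, skipped

# Simulated shapes from the error above: 76 phoneme embeddings in the
# checkpoint vs. 80 expected by the instantiated model.
ckpt = {"fs.encoder.embed_tokens.weight": np.zeros((76, 192)),
        "fs.encoder.layer0.weight": np.zeros((192, 192))}
model = {"fs.encoder.embed_tokens.weight": np.zeros((80, 192)),
         "fs.encoder.layer0.weight": np.zeros((192, 192))}

kept, skipped = filter_matching_params(ckpt, model)
print(skipped)  # ['fs.encoder.embed_tokens.weight']
```

With real PyTorch objects you would then call `model.load_state_dict(kept, strict=False)`, which leaves the mismatched embedding randomly initialized; that embedding would still need fine-tuning (or a regenerated, shared phoneme dictionary) to be usable.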
If you don't have these anymore, no problem, I can always download VCTK and generate them locally with MFA.
Thanks! :-)
@danablend Have you solved this problem? It also arises when I continue training with LibriTTS from my trained VCTK checkpoint: the "num_embeddings" value does not match, so the shape of "fs.encoder.embed_tokens.weight" changes, and I do not know how to solve it. Thank you!
@Zain-Jiang
Hi, have you encountered this problem before?
RuntimeError: Error(s) in loading state_dict for GaussianDiffusion: size mismatch for fs.encoder.embed_tokens.weight: copying a param with shape torch.Size([76, 192]) from checkpoint, the shape in current model is torch.Size([80, 192]).
I wonder if you can help us, thank you!
I suddenly realized that we may have to process all the data together (LibriTTS + VCTK) through MFA, so that both datasets share one phoneme dictionary, instead of processing one separately and then training on the next. @Zain-Jiang
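The idea above can be illustrated with a toy example. The phoneme lists here are hypothetical, not real MFA output; the point is only that separate runs produce dictionaries of different sizes (hence different `embed_tokens` shapes, e.g. 76 vs. 80 rows), while a joint run yields one fixed size.

```python
# Hypothetical phoneme inventories produced by two separate MFA runs.
libritts_phones = {"AA0", "AA1", "AE0", "AE1", "B", "D"}
vctk_phones = {"AA0", "AA1", "B", "D", "OW1", "T"}

# Processed separately, each dataset gets its own dictionary size, so
# the embedding tables (num_phones x 192) built from them can disagree.
print(len(libritts_phones), len(vctk_phones))

# Processed together, both datasets index into one shared dictionary,
# so the embedding table has a single, stable number of rows.
shared = sorted(libritts_phones | vctk_phones)
print(len(shared))
```

This is why a checkpoint trained on one dataset's dictionary cannot be loaded directly against a model built from the other's.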
And I can share my VCTK MFA results with you; do you need them now? @danablend
That is a great insight, @Linghuxc. Would you be able to share your MFA files?
@danablend
I'm very sorry for replying so late. This week I tried to upload my data/processed and data/binary folders to the cloud, but they were too big for my storage space, and the uploads failed on several platforms I tried. I'm really sorry to keep you waiting so long; I was still trying to upload them today.
Thanks for the great work on this repository, really useful!
Wondering if there is a VCTK checkpoint that could be accessed, for use with speakers with UK accents?
Again thanks for this repository!