nguyenhoanganh2002 / XTTSv2-Finetuning-for-New-Languages


Got error when creating and using a new vocab.json file #10

Open phvaha1 opened 5 days ago

phvaha1 commented 5 days ago

@nguyenhoanganh2002
Hello, thanks for your work. I formatted the training data in the format you provided and created a new vocab.json file based on that data.

But when fine-tuning the XTTS model from the pretrained checkpoint (the default checkpoint referenced in the documentation) with the new vocab.json file, I got the error below:

size mismatch for gpt.text_embedding.weight: copying a param with shape torch.Size([6681, 1024]) from checkpoint, the shape in current model is torch.Size([7767, 1024])

I think it is because the new vocab size is 7767 while the checkpoint's is 6681. Do you know how to fix this?
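For reference, a quick way to confirm the mismatch is to compare the embedding shape stored in the checkpoint against the size of the new vocab.json. A minimal diagnostic sketch (the paths are placeholders; it assumes the checkpoint stores its weights under a "model" key and that vocab.json follows the HuggingFace tokenizers layout, which may differ for your files):

```python
import json
import torch

CHECKPOINT = "checkpoints/XTTS_v2.0_original_model_files/model.pth"  # placeholder path
VOCAB = "checkpoints/XTTS_v2.0_original_model_files/vocab.json"      # placeholder path

# XTTS checkpoints typically keep the weights under a "model" key; adjust if yours differs.
state = torch.load(CHECKPOINT, map_location="cpu")["model"]
emb_rows = state["gpt.text_embedding.weight"].shape[0]

# The XTTS vocab.json is a HuggingFace tokenizers file: the vocab lives under model.vocab.
with open(VOCAB, encoding="utf-8") as f:
    vocab_size = len(json.load(f)["model"]["vocab"])

print(f"checkpoint text embedding rows: {emb_rows}")    # e.g. 6681
print(f"tokenizer vocab size:           {vocab_size}")  # e.g. 7767
```

These two numbers have to agree (after accounting for any special tokens) before the checkpoint can be loaded into the resized model.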

nguyenhoanganh2002 commented 2 days ago


Could you provide the complete error traceback? As mentioned in the repository, it's crucial to adjust the configuration file. Specifically, please check the line referenced at: https://github.com/nguyenhoanganh2002/XTTSv2-Finetuning-for-New-Languages/blob/main/extend_vocab_config.py#L82
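For anyone hitting the same thing: after extending the tokenizer, the text-token count in the model config has to match the new vocab size, otherwise the GPT text embedding is still built with the old 6681 rows. A rough illustration of the kind of adjustment meant, assuming the config follows coqui's XTTS layout with a gpt_number_text_tokens field under model_args (the exact field touched at the referenced line may differ):

```python
import json

CONFIG = "checkpoints/XTTS_v2.0_original_model_files/config.json"  # placeholder path
NEW_VOCAB_SIZE = 7767  # size of the extended vocab.json

with open(CONFIG, encoding="utf-8") as f:
    cfg = json.load(f)

# Assumed field name (following coqui's XTTS GPTArgs): the GPT text embedding
# is sized from this value, so it must reflect the extended tokenizer.
cfg["model_args"]["gpt_number_text_tokens"] = NEW_VOCAB_SIZE

with open(CONFIG, "w", encoding="utf-8") as f:
    json.dump(cfg, f, indent=4, ensure_ascii=False)
```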

TugdualKerjan commented 1 day ago

I'm currently getting the same error. I believe this could be because the model being loaded isn't the correct one? Unsure, I'll come back with any other clues I find! 🔍

TugdualKerjan commented 1 day ago

It seems that because the tokenizer is expanded but the model is not, we end up with this error. @nguyenhoanganh2002, are you sure your implementation isn't missing code that extends not just the tokenizer but also the model that consumes the tokens?

What's your tactic for extending the model? Just initializing the weights of the new embedding rows from a Gaussian normal?
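For what it's worth, the common way to extend a pretrained embedding to a larger vocabulary is to keep the existing rows and initialize only the new ones, e.g. from a normal distribution fitted to the old rows. A minimal PyTorch sketch of that idea, not the repository's code; the gpt.text_embedding.weight key comes from the error above, while the gpt.text_head keys are an assumption about the text output layer:

```python
import torch

def extend_text_vocab(state_dict, new_vocab_size):
    """Grow the text embedding (and, if present, the text output head) to
    new_vocab_size, keeping the pretrained rows and sampling the extra rows
    from a normal distribution matched to the existing weights."""
    for key in ("gpt.text_embedding.weight", "gpt.text_head.weight"):
        if key not in state_dict:
            continue  # the text_head key is an assumption; skip if absent
        old = state_dict[key]
        extra = new_vocab_size - old.shape[0]
        if extra <= 0:
            continue
        new_rows = torch.empty(extra, old.shape[1], dtype=old.dtype)
        new_rows.normal_(mean=old.mean().item(), std=old.std().item())
        state_dict[key] = torch.cat([old, new_rows], dim=0)
    # If the output head has a bias, it needs the same number of entries.
    if "gpt.text_head.bias" in state_dict:
        bias = state_dict["gpt.text_head.bias"]
        extra = new_vocab_size - bias.shape[0]
        if extra > 0:
            state_dict["gpt.text_head.bias"] = torch.cat(
                [bias, torch.zeros(extra, dtype=bias.dtype)]
            )
    return state_dict
```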


TugdualKerjan commented 1 day ago

@phvaha1 The issue is fixed for me. It wasn't a bug; I was simply using https://github.com/idiap/coqui-ai-TTS instead of the TTS copy in this git repo. If you look at the commits, you'll notice the model parameter sizes are modified in this one.

2fb571637b20718647f9080b189c4a3f646e2d1a

Simply download the whole repo / reference the customized TTS library provided here.
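If you want to verify which TTS package Python is actually importing (the pip-installed coqui-ai-TTS vs. the copy bundled in this repo), a quick check:

```python
import TTS
print(TTS.__file__)  # should point inside this repository, not site-packages
```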

nguyenhoanganh2002 commented 1 day ago

Thank you for your feedback and investigation, @TugdualKerjan. I'll look into it and aim to fix it soon. I'll update this thread once I have more information or when the fix is implemented.