Open andretocalivros opened 6 months ago
Did you modify data_utils.py to add your language labels? Also, 768 and 1024 are BERT embedding dimensions (hidden sizes), so which one you get depends on the BERT model you use.
Hi Jeremy! Yes, I have modified data_utils.py; the only problem was this error when training starts. Thanks for the explanation about the BERT model. I will try to find one that can be used with MeloTTS training.
@andretocalivros I'm looking to train a new language as well. Do you have any code you could share showing how you did it?
@TugdualKerjan Maybe you can refer to https://github.com/myshell-ai/MeloTTS/issues/160 and https://github.com/myshell-ai/MeloTTS/issues/120
I am trying to train a model for a new language (Portuguese), but I am encountering the error "The expanded size of the tensor (768) must match the existing size (1024) at non-singleton dimension 0. Target sizes: [768, 19]. Tensor sizes: [1024, 19]" during the training phase.
Initially, I created a new language (based on Spanish) in the "text" directory and ran the preprocessing, which generated the BERT and config files successfully. However, when I attempt to train the model, the above error occurs.
For the tokenizer, I used the 'neuralmind/bert-large-portuguese-cased' model, and I am unsure if this might be the problem. The audio files are all in wav format and up to 10 seconds long. Could you please guide me on how to fix this error? I intend to contribute to the training code for Portuguese once I achieve success.
Thank you for your assistance.
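For anyone hitting the same error, here is a minimal sketch of how to read the two shapes in the message. It assumes the training code copies each BERT feature of shape (hidden_size, num_tokens) into a preallocated buffer slice, which is what the target size [768, 19] suggests; the `diagnose` helper and the `KNOWN_HIDDEN_SIZES` table are hypothetical names for illustration, not part of MeloTTS.

```python
# Hidden sizes of the two standard BERT variants.
KNOWN_HIDDEN_SIZES = {768: "BERT-base", 1024: "BERT-large"}


def diagnose(target_shape, feature_shape):
    """Compare the buffer slice shape with the BERT feature shape.

    Both arguments are (hidden_size, num_tokens) pairs, read from the
    "Target sizes" and "Tensor sizes" parts of the error message.
    This is an illustrative helper, not a MeloTTS function.
    """
    target_hidden, target_len = target_shape
    feature_hidden, feature_len = feature_shape
    if (target_hidden, target_len) == (feature_hidden, feature_len):
        return "shapes match"
    if target_len != feature_len:
        return f"sequence length mismatch: buffer {target_len}, feature {feature_len}"
    have = KNOWN_HIDDEN_SIZES.get(feature_hidden, "unknown model")
    want = KNOWN_HIDDEN_SIZES.get(target_hidden, "unknown model")
    return (f"feature has hidden size {feature_hidden} ({have}), "
            f"but the buffer expects {target_hidden} ({want})")


# Shapes taken from the error reported in this issue.
print(diagnose((768, 19), (1024, 19)))
# prints: feature has hidden size 1024 (BERT-large), but the buffer expects 768 (BERT-base)
```

Read this way, the features produced by 'neuralmind/bert-large-portuguese-cased' (hidden size 1024) are being written into a slot sized for a 768-dimensional model. One option worth trying, if that 768-wide slot is the one your language uses, is a base-size Portuguese checkpoint such as 'neuralmind/bert-base-portuguese-cased', followed by regenerating the BERT files.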