I plan to fine-tune the AISHELL3 model on my own dataset, but my dataset has only 6 speakers, while AISHELL3 has 218. Loading the model fails with a size-mismatch error. The Baker dataset also has only one speaker, which doesn't match AISHELL3 either. How did the author deal with this?
This repo is based on the Baker dataset, so it is limited to a single speaker. If you want to train with more speakers, you can move to https://github.com/ming024/FastSpeech2
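For the size-mismatch error itself, one common workaround (not necessarily what the author did) is to load only the checkpoint weights whose names and shapes match the new model, and let the mismatched ones, typically the speaker embedding table, keep their fresh initialization. A minimal sketch, assuming a PyTorch-style state dict where every value has a `.shape`; the helper name `filter_compatible` is made up for illustration:

```python
from typing import Any, Dict, List, Mapping, Tuple


def filter_compatible(
    checkpoint_state: Mapping[str, Any],
    model_state: Mapping[str, Any],
) -> Tuple[Dict[str, Any], List[str]]:
    """Split checkpoint entries into loadable and skipped ones.

    An entry is kept only if the target model has a parameter with the
    same name and the same shape. Everything else (for example a
    218-row speaker embedding going into a 6-speaker model) is skipped.
    """
    kept: Dict[str, Any] = {}
    skipped: List[str] = []
    for name, tensor in checkpoint_state.items():
        target = model_state.get(name)
        if target is not None and tuple(tensor.shape) == tuple(target.shape):
            kept[name] = tensor
        else:
            skipped.append(name)
    return kept, skipped


# Hypothetical usage with PyTorch. The file name and the "model" key are
# assumptions about how the checkpoint was saved; adjust to your setup.
#
#   import torch
#   ckpt = torch.load("aishell3_checkpoint.pth.tar", map_location="cpu")
#   kept, skipped = filter_compatible(ckpt["model"], model.state_dict())
#   model.load_state_dict(kept, strict=False)  # skipped params stay as initialized
#   print("not loaded:", skipped)
```

With this approach the 6-speaker embedding is trained from scratch during fine-tuning, while the rest of the network starts from the AISHELL3 weights. Checking the `skipped` list is worthwhile, since anything unexpected in it means more than the speaker embedding failed to match.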