NVIDIA / mellotron

Mellotron: a multispeaker voice synthesis model based on Tacotron 2 GST that can make a voice emote and sing without emotive or singing training data
BSD 3-Clause "New" or "Revised" License
853 stars, 187 forks

Try to train some new words #108

Open zfishbone01 opened 2 years ago

zfishbone01 commented 2 years ago

Hello, I am new to Mellotron. I want to use this model to sing songs with sol-fa syllables, but I found that the output sounds bad when the lyrics use sol-fa syllables. So I tried to re-train the model starting from the pretrained model, using some singing data without background music: about 7 minutes in total, from 6 children's songs. The results were still bad. Do you have any advice on how to improve this? Is my data insufficient? Thank you!