NVIDIA / mellotron

Mellotron: a multispeaker voice synthesis model based on Tacotron 2 GST that can make a voice emote and sing without emotive or singing training data
BSD 3-Clause "New" or "Revised" License

Funky warble on sustained sung notes from a MusicXML file #56

Open josharmenta opened 4 years ago

josharmenta commented 4 years ago

Getting a strange (ca. 40 Hz) vibrato on any note longer than half a second. Any suggestions?
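
For anyone trying to reproduce this, the warble rate can be pinned down rather than estimated by ear. The snippet below is a minimal sketch, not part of Mellotron: `dominant_modulation_hz` is a hypothetical helper, and it assumes you have already extracted a per-frame contour (F0 or energy) for one sustained note and know its frame rate. The contour here is synthetic, for illustration only. The frame rate must be more than twice the modulation rate for the peak to be resolvable.

```python
import numpy as np

def dominant_modulation_hz(contour, frame_rate_hz):
    """Return the strongest modulation frequency (Hz) in a per-frame contour.

    Hypothetical helper for diagnosis; not a Mellotron function.
    """
    x = np.asarray(contour, dtype=float)
    x = x - x.mean()              # remove the held pitch (DC component)
    x = x * np.hanning(len(x))    # taper the ends to reduce spectral leakage
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / frame_rate_hz)
    spectrum[0] = 0.0             # ignore any residual DC after windowing
    return float(freqs[np.argmax(spectrum)])

# Synthetic stand-in for a real contour: a 440 Hz note held for 1 s
# with a 40 Hz wobble of +/- 5 Hz, sampled at an assumed 200 frames/s.
frame_rate_hz = 200.0
n_frames = 200
t = np.arange(n_frames) / frame_rate_hz
f0 = 440.0 + 5.0 * np.sin(2 * np.pi * 40.0 * t)

estimated = dominant_modulation_hz(f0, frame_rate_hz)
print(f"estimated modulation: {estimated:.1f} Hz")
```

Running the same function on contours from several speakers would give a number to compare across them.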

josharmenta commented 4 years ago

@rafaelvalle Any thoughts? Longer training time?

rafaelvalle commented 4 years ago

Is it speaker specific?