NVIDIA / mellotron

Mellotron: a multispeaker voice synthesis model based on Tacotron 2 GST that can make a voice emote and sing without emotive or singing training data
BSD 3-Clause "New" or "Revised" License

Training time #110

Open Rongjiehuang opened 2 years ago

Rongjiehuang commented 2 years ago

I wonder how long it takes to train an acceptable multi-speaker Mellotron model on a single NVIDIA 1080 Ti?