NVIDIA / mellotron

Mellotron: a multispeaker voice synthesis model based on Tacotron 2 GST that can make a voice emote and sing without emotive or singing training data
BSD 3-Clause "New" or "Revised" License

Question about the gender of the Libri dataset speakers used by Mellotron, as mentioned in the published article #70

Open TitiAffandi opened 4 years ago

TitiAffandi commented 4 years ago

Cool work!

Anyway, I would like to know whether you used only female speakers or both male and female speakers in your subset of over 100 speakers from LibriTTS train-clean-100. LJS and Sally are clearly female speakers. However, the gender makeup of the LibriTTS subset you used is still unclear (train-clean-100 consists of 123 female and 124 male speakers).

Many thanks.

rafaelvalle commented 4 years ago

We used both male and female speakers. You can check the exact list here