what are the training datasets?

Plachtaa / VALL-E-X

An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/

MIT License

7.68k stars 762 forks

what are the training datasets? #96

Open jasonppy opened 1 year ago

jasonppy commented 1 year ago

Thanks for making this available!

Which datasets did you use for model training (for the released checkpoints)?

chazo1994 commented 1 year ago

I would also like to know which datasets were used to train this model.

alinrdinn commented 1 year ago

According to https://plachtaa.github.io/, the training datasets are: LibriTTS + data gathered by the author (704 hours) for English; AISHELL-1, AISHELL-3, Aidatatang + data gathered by the author (598 hours) for Chinese; and Japanese Common Voice + data gathered by the author (437 hours) for Japanese.