Source of Text Prompts

thorstenMueller / Thorsten-Voice

Thorsten-Voice: A free-to-use, offline, high-quality German TTS voice should be available for every project without any licensing struggles.

http://www.thorsten-voice.de

Creative Commons Zero v1.0 Universal

545 stars 51 forks

Source of Text Prompts #42

Closed: liaeh closed 1 year ago

liaeh commented 1 year ago

Hi, great work and great project! 👍

I am interested in how you curated the text prompts for the recordings in your dataset. For example, from where did you source the prompts? Did you collect prompts from multiple domains? Did you select all the prompts you found in a source, a random subset, or did you filter some out in a pre-processing step?

Thanks for any hints! Much appreciated.

thorstenMueller commented 1 year ago

Hi, thanks for your nice feedback 😊. I've taken most phrases from Mozilla Common Voice (Sentence Collector). I hope this helps you.