MozillaItalia / DeepSpeech-Italian-Model

Tooling for producing Italian model (public release available) for DeepSpeech and text corpus
GNU General Public License v3.0

voxforge importer through the official one #121

Closed nefastosaturo closed 3 years ago

Mte90 commented 3 years ago

Since there are a lot of files to ignore, maybe it would be better to keep them in an external file that the script loops over, so the list is easier to update?
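A minimal sketch of what such an external ignore list could look like, not the importer's actual code. It assumes a hypothetical plain-text file with one archive name per line and `#` comments; the function names and the archive names in the usage note are made up for illustration.

```python
from pathlib import Path
from typing import Iterable, List, Set


def parse_blacklist(lines: Iterable[str]) -> Set[str]:
    """Collect archive names, skipping blank lines and '#' comments."""
    names: Set[str] = set()
    for line in lines:
        # Drop anything after '#', then trim surrounding whitespace.
        entry = line.split("#", 1)[0].strip()
        if entry:
            names.add(entry)
    return names


def load_blacklist(path: str) -> Set[str]:
    """Read the (hypothetical) external blacklist file from disk."""
    with Path(path).open(encoding="utf-8") as handle:
        return parse_blacklist(handle)


def filter_archives(archives: Iterable[str], blacklist: Set[str]) -> List[str]:
    """Keep only the archives that are not blacklisted, preserving order."""
    return [name for name in archives if name not in blacklist]
```

With a file like this, adding a speaker to the ignore list would only mean appending a line, for example `filter_archives(all_archives, load_blacklist("blacklist.txt"))`, where `blacklist.txt` is an assumed file name.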

nefastosaturo commented 3 years ago

Yes, the idea is to port the importer, one day, into the corpora utilities from MITADS-speech. Right now the blacklisted archives total only around 3.5M, so not much (just 4 speakers).

IMO, I will keep using the existing voxforge script, and when the complexity increases (more blacklisted speakers #111, more "hijacking" of the original code) we will move everything into the speech collector.