Closed doublex closed 3 years ago
Many thanks, I saw it some time ago. We are working on it, albeit with low priority.
Theoretically we will even be able to make a VAD out of it as well.
Done in https://github.com/snakers4/silero-vad/commit/395885b06b408b9ca0b84dcf05a42d8e8be59153. More data was used. We will probably exclude some artificial or unspoken languages and train a bigger model.
Wow, this is great news! I have been working on a language classifier using the Common Voice dataset, but I found it pretty hard to get a satisfying validation accuracy even on four languages. What is your validation accuracy?
I have been using 5 s samples and STFT features, and classified them with the small ATTRNN used in here. Do you have a tip on how to solve this task?
> hard to get a satisfying validation accuracy even on four languages. What is your validation accuracy?
We had 99%+, provided the languages were quite different (en, ru, de, es). Though we just did a random split, without regard to speakers. The datasets are large enough not to care.
For 100+ languages there are still some unresolved issues, i.e. English having low accuracy and mutually intelligible languages having orders-of-magnitude differences in available data.
> Do you have a tip on how to solve this task?
Just use our models. If you need higher quality for some particular cases, please DM for commercial inquiries.
Thanks for the quick reply! When I did not care about speakers, my accuracy was about 95%, but the model failed hard in a real-life scenario. After I fixed the speaker issue, accuracy dropped drastically (to 85%), but real-life performance is almost OK now. I used 30k samples per language. I noticed there are flawed samples from cutting with Acoustic Audio Detection (auditok), so I looked for a VAD and ended up here. Great work! I'll try to use your VAD for cleaner cuts on the samples.
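For anyone hitting the same speaker-leakage problem, here is a minimal sketch of a speaker-disjoint split. It is not the code used by anyone in this thread. It assumes each sample is a dict carrying a speaker identifier under the key `client_id` (the column name Common Voice uses in its TSV files); the helper names `speaker_bucket` and `split_by_speaker` and the 10% validation fraction are made up for illustration.

```python
import hashlib


def speaker_bucket(speaker_id: str, val_fraction: float = 0.1) -> str:
    """Deterministically assign a speaker to 'train' or 'val'.

    Hashing the speaker ID means every clip from the same speaker
    always lands in the same split, with no state to keep around.
    """
    digest = hashlib.sha256(speaker_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a float in [0, 1).
    value = int.from_bytes(digest[:8], "big") / 2**64
    return "val" if value < val_fraction else "train"


def split_by_speaker(samples, val_fraction: float = 0.1):
    """Split samples so that no speaker appears in both splits.

    `samples` is an iterable of dicts with a 'client_id' key
    (an assumption; rename to match your metadata).
    """
    splits = {"train": [], "val": []}
    for sample in samples:
        bucket = speaker_bucket(sample["client_id"], val_fraction)
        splits[bucket].append(sample)
    return splits
```

The split fraction is only approximate, because it is applied to speakers rather than clips, and speakers contribute different numbers of clips. If scikit-learn is available, `GroupShuffleSplit` with the speaker ID as the group is a ready-made alternative.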
> I used 30k samples per language. After I fixed the speakers issue, acc drastically decreased (85%) but real life performance is almost OK now.
Also, in-domain / out-of-domain may be an issue if your dataset is not diverse enough (not enough augs). As for the language classifier, most likely we will update it soon; there are some obvious improvements.
This project has audio samples for 107 languages: http://bark.phon.ioc.ee/voxlingua107/ Would be great to improve the Language Classifier.