Closed by egorsmkv 4 years ago
Also, we have our own repository where we're collecting links to datasets: https://github.com/egorsmkv/speech-recognition-uk
Hi,
This is exactly the effort I would expect from the community for low-resource languages. For now, I will just fit a model on your data as is and share the model via silero-models. Then, when V3 compact models arrive for all languages, I will consider tuning a Russian model on your corpus.
Just a few ideas on how to make your repo better:
Thanks for your suggestions!
I updated the models and purged the CDN cache.
It is unlikely that much will change soon, so if everything works let's close the ticket.
Feature
We would like to have a Ukrainian model for the task of Speech-to-Text.
Motivation
Ukraine has a large population, and there are tons of tasks in the country related to Speech-to-Text.
Additional context
Our group, which is based in Telegram (https://t.me/speech_recognition_uk), collected a dataset of Ukrainian public speeches and interviews in audio and text formats, available here: https://mega.nz/folder/T34DQSCL#Q1O8vcrX_8Qnp27Ge56_4A/folder/O3hzlKIJ
We think this dataset will be helpful in the training process.