[RFW0102]: Crowdsourcing data collection through the Monlam webapp
Named Concepts
bhashadaan: A project that collects training data for machine learning models such as MT, STT, TTS, and OCR for Indian languages. See the Bhasha Daan entry under References.
Summary
We need to crowdsource training data collection through a webapp.
Context
Now that our launch has explained to the general public why our models need training data, it is high time to get help from the public in creating that data. To do that, we need a platform in our webapp through which anyone can create the needed training data or review labelled data. We can refer to the bhashadaan project, which was executed for Indian languages, and to Common Voice for STT and TTS.
We can reuse the existing pecha tools UI.
Expected Output
A web interface that loads all unannotated data so that it can be annotated.
A web interface that loads all annotated data so that it can be reviewed.
A skip button for cases where the user is unable to translate, transcribe, or read an item.
For TTS, a prompt asking the user to choose which accent they speak.
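The four outputs above can be read as a small item lifecycle: unannotated, then annotated, then reviewed, with a per-user skip and a mandatory accent for TTS. The following is only a minimal sketch of that reading, not the webapp's actual design. Every name in it (`Status`, `Item`, `next_item`, `skip`, `submit`, `approve`) is hypothetical, and it assumes that a skipped item is hidden only from the user who skipped it.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Status(Enum):
    UNANNOTATED = "unannotated"
    ANNOTATED = "annotated"
    REVIEWED = "reviewed"


@dataclass
class Item:
    """One unit of work shown to a contributor (hypothetical shape)."""
    item_id: int
    task: str                      # "MT", "STT", "TTS" or "OCR"
    source: str                    # text, image reference or audio reference
    status: Status = Status.UNANNOTATED
    contribution: Optional[str] = None
    accent: Optional[str] = None   # only meaningful for TTS
    skipped_by: set = field(default_factory=set)


def next_item(items, user, status):
    """Return the first item in `status` that `user` has not skipped."""
    for item in items:
        if item.status is status and user not in item.skipped_by:
            return item
    return None


def skip(item, user):
    """Skip button: hide this item from this user only (assumption)."""
    item.skipped_by.add(user)


def submit(item, contribution, accent=None):
    """Store a contribution; TTS contributions must state an accent."""
    if item.task == "TTS" and not accent:
        raise ValueError("TTS contributions must state the speaker's accent")
    item.contribution = contribution
    item.accent = accent
    item.status = Status.ANNOTATED


def approve(item):
    """Reviewer action: only annotated items can be marked as reviewed."""
    if item.status is not Status.ANNOTATED:
        raise ValueError("only annotated items can be reviewed")
    item.status = Status.REVIEWED
```

In this sketch the annotation page would call `next_item(items, user, Status.UNANNOTATED)` and the review page would call it with `Status.ANNOTATED`, so both pages share one queue function.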
Input
Data that needs to be translated, in the case of MT.
Data that needs to be transcribed, in the case of OCR and STT.
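As a rough illustration, the input could be validated against a table that maps each task type to the action asked of the contributor. This is a sketch under assumptions: the record fields `task` and `source` and the helper names are hypothetical, and the `TTS` entry is inferred from the "read" case mentioned under Expected Output rather than stated in this section.

```python
# Contributor action per task type. MT, OCR and STT come from the Input
# section; the TTS entry is an inference, not something the section states.
ACTION_BY_TASK = {
    "MT": "translate",
    "OCR": "transcribe",
    "STT": "transcribe",
    "TTS": "read",
}


def action_for(task):
    """Return the contributor action for a task type, or raise if unknown."""
    try:
        return ACTION_BY_TASK[task]
    except KeyError:
        raise ValueError(f"unsupported task type: {task!r}") from None


def prepare(records):
    """Attach the expected action to each input record (hypothetical fields)."""
    return [
        {"task": r["task"], "source": r["source"], "action": action_for(r["task"])}
        for r in records
    ]
```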
Expected Timeline
29th Feb, 2024
References
Bhasha Daan, Common Voice, Google contribute