common-voice / common-voice

Common Voice is part of Mozilla's initiative to help teach machines how real people speak.
https://commonvoice.mozilla.org/
Mozilla Public License 2.0

[FR] Add text-corpus related statistics to the panel #4196

Open HarikalarKutusu opened 1 year ago

HarikalarKutusu commented 1 year ago

Is your feature request related to a problem? Please describe.

Text-corpus generation is the most important and most troublesome part of building the dataset, and many language communities are failing to extend their text corpus. The goals and statistics for the voice corpus, by contrast, are very nicely designed and give near-optimal values one can follow.

Describe the solution you'd like

Extend the panel to include personal and language-based statistics for text-corpus creation and validation, alongside the existing voice-corpus ones.

Additional context

One good addition for the "creation" parts (record and write) would be feedback to users on how many of their contributions were accepted and how many were rejected. A validity percentage would suffice, I think.
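A minimal sketch of how such a validity percentage could be computed, assuming the panel already has per-user counts of accepted and rejected contributions. The function name and parameters are hypothetical and not taken from the Common Voice codebase; contributions still awaiting review are assumed to be excluded from both counts.

```python
from typing import Optional


def validity_percent(accepted: int, rejected: int) -> Optional[float]:
    """Percentage of a user's reviewed contributions that were accepted.

    Hypothetical helper: `accepted` and `rejected` are assumed to be counts
    of contributions that have already been reviewed. Returns None when
    nothing has been reviewed yet, so the panel can show "n/a" rather than 0%.
    """
    if accepted < 0 or rejected < 0:
        raise ValueError("counts must be non-negative")
    reviewed = accepted + rejected
    if reviewed == 0:
        return None
    # Round to one decimal place for display.
    return round(100.0 * accepted / reviewed, 1)


print(validity_percent(45, 5))  # 90.0
```

The same calculation would apply separately to recorded clips and written sentences, so each "creation" activity could show its own figure.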

jessicarose commented 1 year ago

This is an extremely good idea. I'll add this feature request to our internal discussions around how we show and track data for users. To appropriately set expectations, we have a reasonably full backlog and roadmap, but I'm always so excited to introduce great ideas into the conversation. Thank you so much for the time and thought you've put into this suggestion!