yaya-sy / SpeechAya


Add French common voice #4

Closed yaya-sy closed 2 months ago

yaya-sy commented 2 months ago

Extract the French subset of the Common Voice dataset (https://huggingface.co/datasets/mozilla-foundation/common_voice_17_0). Each example in the resulting data must look like this:

{
   'audio': the audio file,
   'text': the transcription of the audio
}

Then you can push the dataset to the Hugging Face Hub.
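
A minimal sketch of these steps with the `datasets` library. It makes several assumptions: the French config is named `fr`, the transcription column is named `sentence`, and the target repo id is a placeholder to replace with your own. The helper names `to_audio_text` and `build_and_push` are hypothetical, chosen for illustration. The dataset may also require accepting its terms on the Hub and logging in first, and depending on your `datasets` version loading may need extra arguments.

```python
def to_audio_text(ds, text_column="sentence"):
    """Return a copy of `ds` with only the 'audio' and 'text' columns.

    Assumes the transcription lives in `text_column` (assumed to be
    'sentence' for Common Voice) and that an 'audio' column already exists.
    """
    ds = ds.rename_column(text_column, "text")
    extra = [c for c in ds.column_names if c not in ("audio", "text")]
    return ds.remove_columns(extra)


def build_and_push(repo_id, source="mozilla-foundation/common_voice_17_0", lang="fr"):
    """Load every split of the French subset, reshape it, and push it to the Hub.

    `repo_id` is a placeholder such as 'your-username/common_voice_fr'.
    """
    # Imported here so that to_audio_text can be used without `datasets` installed.
    from datasets import DatasetDict, load_dataset

    raw = load_dataset(source, lang)  # DatasetDict: split name -> Dataset
    cleaned = DatasetDict(
        {name: to_audio_text(split) for name, split in raw.items()}
    )
    cleaned.push_to_hub(repo_id)
    return cleaned
```

Calling `build_and_push("your-username/common_voice_fr")` would run the whole pipeline. Check the column names on a loaded split before pushing, since the `sentence` name is an assumption here.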