KaniyamFoundation / ProjectIdeas

A Place to write down the project ideas and to plan them

Building Text-to-Speech or Speech-to-Text applications using Common Voice Tamil data #164

Open tshrinivasan opened 2 years ago

tshrinivasan commented 2 years ago

At Mozilla Common Voice, we are creating a dataset of text paired with the corresponding audio recordings.

The dataset can be downloaded here: https://commonvoice.mozilla.org/ta/datasets

It currently has 14 hours of data for Tamil.

Is this enough to train a TTS or STT model for Tamil?
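
To check how much audio a downloaded release actually contains, the clip durations can be summed. This is only a rough sketch: it assumes the release ships a tab-separated durations file (here assumed to be `clip_durations.tsv` with a `duration[ms]` column), and the clip names in the sample are made up for illustration.

```python
import csv
import io


def total_hours(tsv_text, duration_column="duration[ms]"):
    """Sum per-clip durations (in milliseconds) and return the total in hours.

    The column name is an assumption about the release layout; pass a
    different one if your file uses another header.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    total_ms = sum(int(row[duration_column]) for row in reader)
    return total_ms / 3_600_000  # 1 hour = 3,600,000 ms


# Hypothetical two-clip sample standing in for the real file contents.
sample = (
    "clip\tduration[ms]\n"
    "common_voice_ta_0001.mp3\t3600\n"
    "common_voice_ta_0002.mp3\t5400\n"
)
print(f"{total_hours(sample):.4f} hours")
```

For a real release, read the file with `open(path, encoding="utf-8").read()` and pass the text to `total_hours`.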

Try to build models with the current data and explore the results.
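
For the STT side, one common way to explore the results is word error rate (WER): the word-level edit distance between the reference sentence and the model output, divided by the number of reference words. A minimal sketch in plain Python, assuming whitespace tokenization is good enough for a first look (no text normalization is applied, and `word_error_rate` is a name made up here):

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of three reference words.
print(word_error_rate("நான் பள்ளிக்கு செல்கிறேன்", "நான் வீட்டுக்கு செல்கிறேன்"))
```

Averaging this over a held-out split of the validated clips gives a single number to compare models trained on different amounts of data.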

There is an existing speech-to-text demo built with this data here: https://huggingface.co/Rajaram1996/wav2vec2-large-xlsr-53-tamil

Can we build a comparable text-to-speech version?
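
To try the demo model locally, a sketch using the Hugging Face `transformers` pipeline could look like the following. It assumes `transformers` and a backend such as PyTorch are installed, that `ffmpeg` is available for decoding audio files, and that the model repository loads through the generic speech-recognition pipeline, which has not been verified here. `clip.wav` is a placeholder path.

```python
MODEL_ID = "Rajaram1996/wav2vec2-large-xlsr-53-tamil"


def transcribe(audio_path, model_id=MODEL_ID):
    """Return the text the model predicts for one audio file."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)  # dict with a "text" key
    return result["text"]


if __name__ == "__main__":
    # Placeholder file name; replace with a real Tamil recording.
    print(transcribe("clip.wav"))
```

The first call downloads the model weights, so it needs network access and some disk space.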