NeonGeckoCom / neon-tts-plugin-coqui

Coqui AI TTS plugin
https://huggingface.co/spaces/neongeckocom/neon-tts-plugin-coqui

How can I add an audio dataset for Meadow Mari language? #87

Open fu-lab opened 1 year ago

fu-lab commented 1 year ago

I have a set of audio data (audio and transcription in Cyrillic) with a male voice in Meadow Mari: [https://cloud.mail.ru/public/VAKT/WwWiTXYTC]. What should I do so that you also support our language?

NeonClary commented 1 year ago

Hello @fu-lab, I am adding Meadow Mari to the list of languages we plan to add support for.

From my preliminary check, that audio archive may be right for our process. The sound quality and voice are very clear, it's one person, and he is speaking in a normal way. I have just one concern that I will check with our team. I am not certain that having only short speech samples will work. We have used longer speech samples in the past, and all of your samples that I listened to were 10 seconds or less. I didn't see a way to sort your files by size, so perhaps I missed some longer samples. I will ask our team; short samples may be fine, since the total amount recorded is still good.

I read that Meadow Mari has an extra letter, a special "ҥ", and a few other rare linguistic features. Sometimes things like that make it tricky to build a language. We will still plan to do it, but I want to tell you in advance that it may be more difficult. When we start working on it, it will be important to have you or another native speaker available to listen to samples and talk to us about the language. Can you share an email I could use to contact you directly for that? You can send it to me at clary@neon.ai

Our team has discussed what's most efficient for our resources, and we'd like to do several language requests at once. We plan to do it after finishing the project our STT/TTS team is working on right now, and before starting the next one. That shouldn't be very long. I will tell you when we get ready to start working on Meadow Mari.

fu-lab commented 1 year ago

The long sentences are here: https://cloud.mail.ru/public/YCkw/fpBN7nbrr

Best regards, Andrey chemyshev.andrey@gmail.com

NeonClary commented 1 year ago

That's very helpful, thank you! I'll send you an email as soon as we are ready to start.

nfaraji2002 commented 1 year ago

We have trained a PyTorch Coqui TTS model for the Persian language. How could we convert the model to a TFLite one so it can run on a Raspberry Pi 4? Are there any other ways to run it in real time without conversion? Thanks for your support.

JohnClaw commented 1 year ago

> That's very helpful, thank you! I'll send you an email as soon as we are ready to start.

Have you started developing Meadow Mari TTS?

NeonClary commented 1 year ago

I wish I could tell you we have. We're a very small team, and have had to prioritize other projects for now. We hope to get back to adding STT & TTS soon, and to find a larger organization willing to provide us some additional GPU time for it. If you're interested in helping out with that or anything else, please send me an email at clary@neon.ai. I do want to assure you that we haven't forgotten about you, and we did add Meadow Mari to our planned languages.