Open 770navyasharma opened 1 day ago
Thank you for creating this issue! ๐ We'll look into it as soon as possible. In the meantime, please make sure to provide all the necessary details and context. Your contributions are highly appreciated! ๐
@770navyasharma Provide a pipeline of the work which you're gonna do and upload the necessary files. if it's verified, it'll be merged with main branch
@770navyasharma I'm binding with @SaiNivedh26's suggestion. Also, just to add context as reference, OpenAI does ship their own variant of tts (possibly tts-1 can be utilized here), but it's extensively resource constrained (EN
, HI
, BN
, & a few) and lacks a diverse range of indic-languages. So, Azure AI Speech is a better fit, but it takes a bit longer time to process.
While, for the first section of pipeline, you'll have to benchmark which whisper's s2t (model variant) will be better, as there you might have to compromise with inference speed and processing accuracy.
@SaiNivedh26 At a base level I made a basic translator model using google translate and gtts which actually provides support for various indian languages Here is the demo of the result .
If you approve then I can add the python file for the same by opening the pull request ...
@770navyasharma Proceed to PR. Will review the files in that and let you know about merging it with main branch. But make sure to attach a .README file to discuss in detail about your structure and The result which you've obtained
This repository is an absolute gem for those diving into machine learning and working on innovative projects. ๐ To make it even more powerful, I'd love to contribute by adding a real-time translation model covering more than 10 languagesโfocusing especially on Indian regional languages. This feature will include both speech-to-text and text-to-speech integrations, all wrapped in an interactive Streamlit app for seamless user experience.
What will this feature add? ๐ Real-Time Translation across multiple languages. ๐ค Speech-to-Text: Users can speak in their native language, and the model will transcribe the audio in real-time. ๐ฃ๏ธ Text-to-Speech: The translated text can also be played back in the target language. ๐ Focus on Indian regional languages to cater to a diverse audience. ๐๏ธ User-Friendly Interface using Streamlit for smooth interaction and accessibility. How I Plan to Build This: Implement a multilingual model capable of real-time translation using state-of-the-art NLP techniques. Integrate Speech-to-Text using APIs such as Google Cloud Speech or Whisper. Add Text-to-Speech support for multiple languages using libraries like gTTS or Azure Speech. Deploy the solution on Streamlit, making it easily accessible via web for demo purposes. Why This Matters: Machine learning is not just about models but also about making them accessible and practical for everyday users. A real-time translation app opens doors for cross-language communication, helping people connect in ways never imagined before. ๐ Plus, focusing on Indian languages will promote inclusivity and bridge the linguistic gap in technology.
๐ก Iโm excited to work on this! Kindly assign me this task under the following tags:
hacktoberfest ๐ gssoc ๐ฉโ๐ป level ๐ Looking forward to contributing and taking this repository to the next level! ๐ช
cc: @UppuluriKalyani