UppuluriKalyani / ML-Nexus

ML Nexus is an open-source collection of machine learning projects, covering topics like neural networks, computer vision, and NLP. Whether you're a beginner or expert, contribute, collaborate, and grow together in the world of AI. Join us to shape the future of machine learning!
https://discord.gg/fy8MQkCh
MIT License
19 stars, 24 forks

✨ Feature Request: Real-Time Multilingual Translation with Speech-to-Text & Text-to-Speech Integration #80

Open 770navyasharma opened 1 day ago

770navyasharma commented 1 day ago

This repository is an absolute gem for those diving into machine learning and working on innovative projects. 🚀 To make it even more powerful, I'd love to contribute by adding a real-time translation model covering more than 10 languages, focusing especially on Indian regional languages. This feature will include both speech-to-text and text-to-speech integrations, all wrapped in an interactive Streamlit app for a seamless user experience.

What will this feature add?

- 🔄 Real-Time Translation across multiple languages.
- 🎤 Speech-to-Text: Users can speak in their native language, and the model will transcribe the audio in real time.
- 🗣️ Text-to-Speech: The translated text can also be played back in the target language.
- Focus on Indian regional languages to cater to a diverse audience.
- 🎛️ User-Friendly Interface using Streamlit for smooth interaction and accessibility.

How I Plan to Build This:

- Implement a multilingual model capable of real-time translation using state-of-the-art NLP techniques.
- Integrate Speech-to-Text using APIs such as Google Cloud Speech or Whisper.
- Add Text-to-Speech support for multiple languages using libraries like gTTS or Azure Speech.
- Deploy the solution on Streamlit, making it easily accessible via web for demo purposes.

Why This Matters:

Machine learning is not just about models but also about making them accessible and practical for everyday users. A real-time translation app opens doors for cross-language communication, helping people connect in ways never imagined before. Plus, focusing on Indian languages will promote inclusivity and bridge the linguistic gap in technology.
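The three stages named in the plan (speech-to-text, translation, text-to-speech) could be wired together roughly as follows. This is only a minimal sketch: the class name, the stage signatures, and the stub functions are hypothetical and not part of the proposal. A real build would wrap Whisper or Google Cloud Speech, a translation model, and gTTS or Azure Speech behind the same callables.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures; real backends would be wrapped to match them.
Transcriber = Callable[[bytes, str], str]    # (audio, source_lang) -> text
Translator = Callable[[str, str, str], str]  # (text, source_lang, target_lang) -> text
Synthesizer = Callable[[str, str], bytes]    # (text, target_lang) -> audio


@dataclass
class TranslationResult:
    transcript: str
    translation: str
    audio: bytes


@dataclass
class SpeechTranslationPipeline:
    transcribe: Transcriber
    translate: Translator
    synthesize: Synthesizer

    def run(self, audio: bytes, source_lang: str, target_lang: str) -> TranslationResult:
        # Stage 1: speech-to-text in the speaker's language.
        transcript = self.transcribe(audio, source_lang)
        # Stage 2: text translation into the target language.
        translation = self.translate(transcript, source_lang, target_lang)
        # Stage 3: text-to-speech for playback.
        speech = self.synthesize(translation, target_lang)
        return TranslationResult(transcript, translation, speech)


# Stub stages so the sketch runs without any model, API key, or network access.
def fake_transcribe(audio: bytes, source_lang: str) -> str:
    return audio.decode("utf-8")


def fake_translate(text: str, source_lang: str, target_lang: str) -> str:
    return f"[{source_lang}->{target_lang}] {text}"


def fake_synthesize(text: str, target_lang: str) -> bytes:
    return text.encode("utf-8")


pipeline = SpeechTranslationPipeline(fake_transcribe, fake_translate, fake_synthesize)
result = pipeline.run(b"hello", "en", "hi")
print(result.translation)
```

Keeping each stage behind a plain callable means a backend can be swapped (for example, one TTS service for another) without touching the Streamlit front end.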

💡 I'm excited to work on this! Kindly assign me this task under the following tags:

hacktoberfest 🎉 gssoc 👩‍💻 level 📈

Looking forward to contributing and taking this repository to the next level! 💪

cc: @UppuluriKalyani

github-actions[bot] commented 1 day ago

Thank you for creating this issue! 🎉 We'll look into it as soon as possible. In the meantime, please make sure to provide all the necessary details and context. Your contributions are highly appreciated! 😊

SaiNivedh26 commented 21 hours ago

@770navyasharma Please outline the pipeline you plan to build and upload the necessary files. If it's verified, it'll be merged into the main branch.

Neilblaze commented 20 hours ago

@770navyasharma I agree with @SaiNivedh26's suggestion. Also, to add some context for reference: OpenAI ships its own TTS (tts-1 could possibly be used here), but its language coverage is quite limited (EN, HI, BN, and a few others) and it lacks a diverse range of Indic languages. So Azure AI Speech is a better fit, though it takes a bit longer to process.

For the first stage of the pipeline, you'll have to benchmark which Whisper speech-to-text model variant works better, since you may have to trade off inference speed against transcription accuracy.
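One way to frame that benchmark is to measure word error rate (WER) and mean latency per candidate on a small labelled sample set. The sketch below is an assumption about how this could be done, not something prescribed in the thread: the `benchmark` helper, the sample data, and the stub transcribers are all hypothetical, and each real candidate would be a function wrapping one Whisper model variant.

```python
import time
from typing import Callable, Dict, List, Tuple


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def benchmark(
    candidates: Dict[str, Callable[[str], str]],
    samples: List[Tuple[str, str]],
) -> Dict[str, Tuple[float, float]]:
    """Return {candidate name: (mean WER, mean seconds per sample)}."""
    report = {}
    for name, transcribe in candidates.items():
        total_wer, total_seconds = 0.0, 0.0
        for audio_id, reference in samples:
            start = time.perf_counter()
            hypothesis = transcribe(audio_id)
            total_seconds += time.perf_counter() - start  # time only the model call
            total_wer += word_error_rate(reference, hypothesis)
        report[name] = (total_wer / len(samples), total_seconds / len(samples))
    return report


# Hypothetical samples and stub transcribers standing in for real model variants.
samples = [("clip-1", "the quick brown fox"), ("clip-2", "hello world")]
accurate = dict(samples)
noisy = {"clip-1": "the quick fox", "clip-2": "hello word"}
candidates = {"stub-accurate": accurate.__getitem__, "stub-noisy": noisy.__getitem__}

report = benchmark(candidates, samples)
for name, (wer, seconds) in report.items():
    print(f"{name}: WER={wer:.3f}, {seconds * 1000:.3f} ms/sample")
```

The WER here is computed on raw whitespace-separated tokens. For Indic scripts, some text normalization before scoring would likely be needed, and that choice can shift the ranking between variants.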

770navyasharma commented 12 hours ago

@SaiNivedh26 As a starting point, I built a basic translator using Google Translate and gTTS, which supports various Indian languages. Here is a demo of the result.

If you approve, I can add the Python file by opening a pull request.

[Screenshots: 2024-10-04 232059, 2024-10-04 232107]
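The actual file is not shown in the thread, so the following is only a guess at the general shape of a "translate, then speak with gTTS" helper. The function names and the language table are hypothetical. The thread does not say which Google Translate client was used, so translation is passed in as a callable rather than tied to a specific package.

```python
from typing import Callable

# ISO 639-1 codes for a few Indian languages. Whether a given translation or
# TTS backend supports each one is an assumption to verify; for gTTS,
# gtts.lang.tts_langs() lists the codes it accepts.
INDIAN_LANGUAGE_CODES = {
    "Hindi": "hi", "Bengali": "bn", "Tamil": "ta", "Telugu": "te",
    "Marathi": "mr", "Gujarati": "gu", "Kannada": "kn", "Malayalam": "ml",
}


def gtts_save(text: str, lang: str, out_path: str) -> None:
    """Synthesize text with gTTS and write it to an MP3 file."""
    # Imported lazily so this module loads even when gTTS is not installed.
    from gtts import gTTS  # pip install gTTS; saving needs network access
    gTTS(text=text, lang=lang).save(out_path)


def translate_and_speak(
    text: str,
    target_language: str,
    translate: Callable[[str, str], str],  # (text, target_code) -> translated text
    save_speech: Callable[[str, str, str], None] = gtts_save,
    out_path: str = "translation.mp3",
) -> str:
    """Translate text, save the spoken translation, and return the translated text."""
    if target_language not in INDIAN_LANGUAGE_CODES:
        raise ValueError(f"unsupported language: {target_language}")
    code = INDIAN_LANGUAGE_CODES[target_language]
    translated = translate(text, code)
    save_speech(translated, code, out_path)
    return translated
```

In a Streamlit app, the saved file could then be played back with `st.audio(out_path)`, and `translate` would be whichever Google Translate wrapper the PR ends up using.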

SaiNivedh26 commented 11 hours ago

@770navyasharma Go ahead with the PR. I'll review the files there and let you know about merging into the main branch. Make sure to include a README file that describes your project structure and the results you obtained in detail.