UppuluriKalyani / ML-Nexus

ML Nexus is an open-source collection of machine learning projects, covering topics like neural networks, computer vision, and NLP. Whether you're a beginner or expert, contribute, collaborate, and grow together in the world of AI. Join us to shape the future of machine learning!
https://ml-nexus.vercel.app/
MIT License

Voice-to-text conversion #617

Closed ananyag309 closed 1 month ago

ananyag309 commented 1 month ago

Description

This project aims to build a speech recognition model that can convert spoken language (audio input) into written text. The model uses techniques from Natural Language Processing (NLP) and deep learning to process audio data and predict corresponding text. It is based on the principles of speech-to-text algorithms and Recurrent Neural Networks (RNNs).

Model Architecture

The model consists of RNN layers (such as LSTM or GRU) for processing the sequence data. The final layer is a dense layer with a softmax activation that predicts a probability distribution over the vocabulary at each time step. A Connectionist Temporal Classification (CTC) loss function is used to handle the alignment between input audio sequences and output text sequences.
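To make the CTC part concrete, here is a minimal pure-Python sketch of how the loss sums over every alignment between the per-frame softmax outputs and a target sequence, plus a greedy decoder that merges repeats and drops blanks. This is only an illustration, not the proposed implementation: the function names, the choice of index 0 as the blank symbol, and the toy probabilities are all assumptions. A real model would more likely use a framework's built-in CTC loss, working in log space for numerical stability.

```python
import math

BLANK = 0  # assumption: index 0 is reserved for the CTC blank symbol


def ctc_greedy_decode(frame_probs, blank=BLANK):
    """Take the most likely symbol per frame, merge repeats, drop blanks."""
    best_path = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
    decoded, prev = [], None
    for idx in best_path:
        if idx != prev and idx != blank:
            decoded.append(idx)
        prev = idx
    return decoded


def ctc_neg_log_likelihood(frame_probs, target, blank=BLANK):
    """Negative log-probability of `target`, summed over all valid alignments."""
    # Interleave blanks around the target: [a, b] -> [blank, a, blank, b, blank]
    ext = [blank]
    for sym in target:
        ext += [sym, blank]
    size = len(ext)

    # alpha[s]: total probability of all partial alignments ending at ext[s]
    alpha = [0.0] * size
    alpha[0] = frame_probs[0][ext[0]]
    if size > 1:
        alpha[1] = frame_probs[0][ext[1]]

    for probs in frame_probs[1:]:
        new_alpha = [0.0] * size
        for s in range(size):
            total = alpha[s]                      # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]             # advance by one position
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]             # skip the blank in between
            new_alpha[s] = total * probs[ext[s]]
        alpha = new_alpha

    # A valid alignment ends on the last symbol or on the trailing blank.
    likelihood = alpha[-1] + (alpha[-2] if size > 1 else 0.0)
    return -math.log(likelihood) if likelihood > 0 else math.inf


if __name__ == "__main__":
    # Toy softmax outputs: 2 frames over the vocabulary {blank, "a"}.
    toy_probs = [[0.4, 0.6], [0.3, 0.7]]
    print(ctc_neg_log_likelihood(toy_probs, [1]))
    print(ctc_greedy_decode(toy_probs))
```

In the toy case, the alignments "aa", "a-", and "-a" (with "-" as blank) all collapse to "a", so the loss is the negative log of their summed probability. This is why CTC does not need frame-level labels for the audio.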

github-actions[bot] commented 1 month ago

Thanks for creating the issue in ML-Nexus! 🎉 Before you start working on your PR, please make sure to:

UppuluriKalyani commented 1 month ago

@ananyag309 Hey Ananya, you are raising issues that already exist. Please check before raising a new one.

github-actions[bot] commented 1 month ago

Hello @ananyag309! Your issue #617 has been closed. Thank you for your contribution!

ananyag309 commented 1 month ago

@UppuluriKalyani We have text-to-speech, but there is nothing for speech-to-text.