Avdhesh-Varshney / chanakya-niti

A digital platform leveraging AI to provide accessible and engaging content on the teachings of Chanakya, an ancient Indian philosopher.
https://chanakya-niti.vercel.app
MIT License
21 stars 51 forks

Multilingual Text-to-Speech and Speech-to-Text Functionality #85

Closed ShampaShaw closed 1 week ago

ShampaShaw commented 2 weeks ago

This feature enables users to interact with the website using text-to-speech and speech-to-text functionalities in multiple languages. Users can select their preferred language for both text-to-speech output and speech-to-text input, catering to a diverse user base and enhancing accessibility. This feature will integrate with the backend to support language-specific models and ensure accurate and efficient processing of text and speech data.

Use Case Problem Statement: Many users face language barriers and accessibility challenges when interacting with websites. Non-native speakers and users with disabilities may find it difficult to read or type in a single language interface, leading to frustration and decreased engagement.

Use Case Description:

Objective: To provide a personalized, accessible, and inclusive user experience by offering multilingual text-to-speech and speech-to-text capabilities.

Multilingual Text-to-Speech:

Current State: The website lacks the ability to read text aloud in different languages, limiting accessibility for users who prefer auditory content consumption.
Future State: Users can select their preferred language from a list of supported languages, and the website will read aloud the text in the selected language. This is useful for users with visual impairments, language learners, and those who prefer listening over reading.

Multilingual Speech-to-Text:

Current State: Users can only input text through typing, which may not be convenient for everyone, especially those with physical disabilities or those who prefer speaking over typing.
Future State: Users can speak into their device’s microphone, and the website will convert their speech into text in their selected language. This feature will support voice commands and dictation, enhancing usability and efficiency.

Benefits:

Enhanced Accessibility: Improves the experience for users with visual impairments or physical disabilities, making the website more inclusive.
User Convenience: Offers a more natural way for users to interact with the website, particularly for those who prefer auditory input/output.
Language Support: Accommodates a diverse user base by supporting multiple languages, making the website accessible to non-native speakers and language learners.
Improved Engagement: By catering to individual language preferences and accessibility needs, users are more likely to engage with the content and services provided by the website.

User Story:

As a user, I want to listen to the website content in my preferred language, so I can easily understand and consume information.
As a user, I want to use speech-to-text in my native language, so I can interact with the website more comfortably and efficiently.

Implementation Goals:

Language Selection: Provide an easy-to-use interface for users to select their preferred language for both text-to-speech and speech-to-text.
Backend Integration: Connect with backend services that support multiple languages for text-to-speech synthesis and speech-to-text recognition.
Accurate Processing: Ensure high accuracy in language processing to provide a smooth user experience.
Responsive Design: Make the feature accessible across various devices, including desktops, tablets, and mobile phones.
Security and Privacy: Implement robust security measures to protect user data during voice interactions.

By incorporating multilingual text-to-speech and speech-to-text functionalities, this feature aims to create a more inclusive, accessible, and user-friendly platform, ultimately enhancing user satisfaction and engagement.
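
To make the language selection goal more concrete, here is a rough sketch of how the client side could work. It assumes the browser's Web Speech API (`speechSynthesis`, `SpeechSynthesisUtterance`, `SpeechRecognition`) is used instead of a backend model, which is only one possible approach and not something this project has settled on. The names `VoiceLike`, `pickVoice`, `speak`, and `startDictation` are hypothetical helpers made up for illustration.

```typescript
// Minimal shape of the fields read from a browser SpeechSynthesisVoice.
// Declared locally (an assumption) so the sketch also compiles outside a browser.
interface VoiceLike {
  name: string;
  lang: string; // BCP 47 tag such as "hi-IN"; some browsers report "hi_IN"
}

// Normalize a language tag for comparison: lowercase, hyphen-separated.
function normalizeTag(tag: string): string {
  return tag.replace(/_/g, "-").toLowerCase();
}

// Pick a voice for the selected language: exact tag match first,
// then any voice that shares the primary language subtag.
function pickVoice<T extends VoiceLike>(voices: T[], lang: string): T | undefined {
  const wanted = normalizeTag(lang);
  const primary = wanted.split("-")[0];
  return (
    voices.find((v) => normalizeTag(v.lang) === wanted) ??
    voices.find((v) => normalizeTag(v.lang).split("-")[0] === primary)
  );
}

// Read text aloud in the selected language. Returns false when the
// browser (or a non-browser runtime) has no speech synthesis support.
function speak(text: string, lang: string): boolean {
  const g = globalThis as any;
  if (!g.speechSynthesis || !g.SpeechSynthesisUtterance) return false;
  const utterance = new g.SpeechSynthesisUtterance(text);
  utterance.lang = lang;
  // getVoices() can return an empty list until the "voiceschanged" event fires.
  const voice = pickVoice(g.speechSynthesis.getVoices() as VoiceLike[], lang);
  if (voice) utterance.voice = voice;
  g.speechSynthesis.speak(utterance);
  return true;
}

// Start one dictation pass in the selected language and hand the
// recognized text to a callback. Returns false when unsupported.
function startDictation(lang: string, onText: (text: string) => void): boolean {
  const g = globalThis as any;
  const Recognition = g.SpeechRecognition ?? g.webkitSpeechRecognition;
  if (!Recognition) return false;
  const recognition = new Recognition();
  recognition.lang = lang;
  recognition.interimResults = false; // deliver final results only
  recognition.onresult = (event: any) => {
    onText(event.results[0][0].transcript);
  };
  recognition.start();
  return true;
}
```

A language dropdown could then call something like `speak(selectedText, "hi-IN")` or `startDictation("hi-IN", setInputValue)`, where `selectedText` and `setInputValue` are placeholders for whatever the UI provides. Both functions return `false` instead of throwing when the API is missing, so the UI can hide the controls on unsupported browsers. Voice availability and recognition support vary by browser and device, so the supported language list would need to be checked at runtime.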

ShampaShaw commented 2 weeks ago

Assign me this issue under GSSoC'24

Avdhesh-Varshney commented 2 weeks ago

@ShampaShaw this project currently does not have any books for reading purposes. So, you will have to go with the language conversion model.