IDinsight / aaq-core

No-code, easy-to-setup, reliable content manager and RAG plugin for chatbots in social sector
https://idinsight.github.io/aaq-core/
BSD 3-Clause "New" or "Revised" License

[DMP 2024]: Voice API #128

Open suzinyou opened 3 months ago

suzinyou commented 3 months ago

Ticket Contents

Description

Ask A Question is a free and open-source tool created to help non-profit organizations, governments in developing nations, and social sector organizations utilize Large Language Models for responding to citizen inquiries in their native languages.

Create new voice response API: the API will allow users to send questions and receive responses from AAQ using voice notes. This will increase the accessibility of AAQ to users for whom speaking/listening is easier than writing/reading.

Goals & Mid-Point Milestone

Goals

By mid-point

By project end

For every goal listed, there will be a few rounds of design-feedback-implementation with support from the mentors and wider AAQ team.

Setup/Installation

AAQ contribution guide is here: https://idinsight.github.io/aaq-core/develop/contributing/

You will be given access to our testing environment on AWS.

Expected Outcome

  1. AAQ users can query the voice endpoints with voice questions and/or receive voice responses. These endpoints can be integrated into AAQ’s chat flow manager of choice, Typebot.io.
  2. AAQ users have an option to use an open-source TTS/STT model instead of an external API.
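
To make the two outcomes above concrete, here is a minimal sketch of a voice round trip with swappable STT/TTS backends. All names (`VoicePipeline`, `SpeechToText`, `TextToSpeech`, `EchoSTT`, `EchoTTS`) are hypothetical and not part of AAQ. The echo backends are placeholders that only stand in for a real open-source model or external API, and `answer` stands in for AAQ's existing text question-answering.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


class SpeechToText(Protocol):
    """Anything that can turn audio bytes into text (hypothetical interface)."""

    def transcribe(self, audio: bytes) -> str: ...


class TextToSpeech(Protocol):
    """Anything that can turn text into audio bytes (hypothetical interface)."""

    def synthesize(self, text: str) -> bytes: ...


@dataclass
class VoicePipeline:
    """Voice question in, voice answer out."""

    stt: SpeechToText
    tts: TextToSpeech
    answer: Callable[[str], str]  # assumption: stands in for AAQ's text Q&A

    def respond(self, audio: bytes) -> bytes:
        question = self.stt.transcribe(audio)  # speech -> text
        reply = self.answer(question)          # text question -> text answer
        return self.tts.synthesize(reply)      # text -> speech


class EchoSTT:
    """Placeholder backend: treats the audio bytes as UTF-8 text."""

    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")


class EchoTTS:
    """Placeholder backend: returns the text as UTF-8 bytes."""

    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")
```

Because the pipeline depends only on the two small interfaces, a backend that wraps an open-source model and one that wraps an external API could be swapped without touching the rest of the flow.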

Acceptance Criteria

No response

Implementation Details

You will build the APIs in our core_backend component, which is built in Python, using FastAPI.

Our database is PostgreSQL + pgvector for managing document embeddings (contents) as well as other transactional data.
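
As a rough illustration of what embedding-based content lookup does, the sketch below ranks contents by cosine distance in plain Python. This is the quantity pgvector's `<=>` operator computes inside PostgreSQL. The function names and the toy two-dimensional vectors are assumptions for illustration only; AAQ's actual schema and queries are not shown here.

```python
import math
from typing import List, Mapping, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    contents: Mapping[str, Sequence[float]],
    k: int = 1,
) -> List[str]:
    """Return the titles of the k contents closest to the query embedding."""
    ranked = sorted(contents, key=lambda title: cosine_distance(query, contents[title]))
    return ranked[:k]
```

In the voice flow, the query embedding would come from the transcribed question, so the retrieval step itself would not need to change.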

The TTS/STT service that serves open-source models should be as platform-agnostic as possible, which often means using Docker. The integration itself will target AWS, since our demo environment sits in AWS. You will be able to lead the architecture design for this service, and our mentors and the wider AAQ team will be available to support you and think it through together.
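
One possible shape for such a service, sketched with the Python standard library only so that it could run in any container: a small HTTP server with a single endpoint that accepts text and returns WAV audio. The `/synthesize` path, the port, and the function names are assumptions, and `synthesize` is a placeholder that returns silence where a real open-source TTS model would be called.

```python
import io
import wave
from http.server import BaseHTTPRequestHandler, HTTPServer


def synthesize(text: str, sample_rate: int = 16000) -> bytes:
    """Placeholder for a real TTS model.

    Returns silent 16-bit mono WAV audio, 0.1 seconds per input character.
    """
    n_frames = (sample_rate // 10) * len(text)
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * n_frames)
    return buffer.getvalue()


class TTSHandler(BaseHTTPRequestHandler):
    """Handles POST /synthesize with a UTF-8 text body (hypothetical endpoint)."""

    def do_POST(self):
        if self.path != "/synthesize":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        text = self.rfile.read(length).decode("utf-8")
        audio = synthesize(text)
        self.send_response(200)
        self.send_header("Content-Type", "audio/wav")
        self.send_header("Content-Length", str(len(audio)))
        self.end_headers()
        self.wfile.write(audio)

    def log_message(self, format, *args):
        pass  # suppress per-request logging in this sketch


def make_server(port: int = 8000) -> HTTPServer:
    """Bind on all interfaces so the port can be published from a container."""
    return HTTPServer(("0.0.0.0", port), TTSHandler)
```

Calling `make_server().serve_forever()` would start the service. Keeping the model behind a plain HTTP interface like this means the backend only needs a URL, whether the container runs on AWS or elsewhere.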

Mockups/Wireframes

No response

Product Name

Ask A Question

Organisation Name

IDinsight

Domain

Open Source Library

Tech Skills Needed

AWS, Database, Python

Mentor(s)

@amiraliemami @lickem22 are Data Scientists at IDinsight!

Category

API, Backend, Database, Deployment, AI

MustafaAkolawala commented 3 months ago

Hi @amiraliemami @lickem22 ,

I'm very interested in contributing to your project to add voice response capabilities to the Ask A Question (AAQ) chatbot. As a backend developer with internship experience at apnabot and expertise in integrating AI/ML models, databases, and cloud deployments, I believe I can help implement the text-to-speech, speech-to-text, and in-house TTS service you're looking to build. I'd welcome the chance to discuss how I can support the development and integration of these voice features into the existing AAQ infrastructure. Please let me know the best way for me to connect with your team and explore opportunities to collaborate on this exciting enhancement.

suzinyou commented 2 months ago

Thanks @MustafaAkolawala ! Would love to see any proposed approach. Feel free to continue on this issue thread.

Also, this project is in fact part of Code4GovTech's Dedicated Mentoring Program -- see here.

MustafaAkolawala commented 2 months ago

hello @suzinyou !

After thoroughly reviewing the various open-source TTS options, I'm convinced that ESPnet-TTS is the way to go for the AAQ voice response API project. You see, ESPnet-TTS is this super flexible, end-to-end speech processing toolkit that just fits the bill perfectly. Not only does it support the specific languages on AAQ's roadmap - Xhosa, Zulu, Hindi, and Igbo - but its modular architecture makes it easy to integrate and customize. That's crucial, given the project's need for a tailored TTS solution.

But what really seals the deal for me is that ESPnet-TTS is an actively maintained open-source project, backed by a strong community. That means you'll have ongoing improvements and the potential to expand language support down the line, as your user base grows. And the fact that it's Python-based, just like the AAQ backend, will make the integration process a breeze and reduce the learning curve for the dev team.

In short, ESPnet-TTS ticks all the boxes - from language support to technical alignment - to be the optimal TTS solution for this project. Although ESPnet-TTS requires solid technical knowledge to implement, I will dive deep into it and get familiar with it.

And yes, I will be writing a detailed project proposal on this project for this year's C4GT :)

ashuashutosh2211 commented 2 months ago

Hello @suzinyou, I'm Ashutosh, a prefinal year student at IIT Jodhpur, specializing in Artificial Intelligence and Data Science. I have a strong proficiency in programming languages such as Python and C++. My experience includes working on diverse projects spanning machine learning and deep learning, including endeavors like Stock Price Prediction and Speech-to-Text Transcription. In addition to my programming skills, I have hands-on experience with various databases such as SQL (MySQL), Document-Oriented Databases (MongoDB), and Graph Databases (Neo4j). One notable project where I applied these skills is the development of a Video Search Engine.

I'm keenly interested in contributing to projects within your domain.

MustafaAkolawala commented 2 months ago

hey! @suzinyou @amiraliemami @lickem22

I just wanted to know whether there is any way I can contribute to this project before C4GT starts, because I am really keen to start working on this project as soon as possible.

kannanb2745 commented 2 months ago

Hello @suzinyou

I'm KANNAN B, a second-year B.Tech student majoring in Information Technology at Veltech Hightech Engineering College. I'm thrilled about the opportunity to contribute to your project, particularly in enhancing the Ask A Question (AAQ) chatbot with voice response capabilities. With my experience in backend development and AI/ML integration, I'm confident in my ability to assist in implementing the text-to-speech, speech-to-text, and in-house TTS service. I'm particularly excited about leveraging ESPnet-TTS for its versatility and alignment with the project's goals. I'm eager to dive into the technical aspects and contribute to the project's success. Looking forward to collaborating with your team!

Best regards, KANNAN B

DhruvLamba commented 2 months ago

Hello @suzinyou , I am thrilled to have the opportunity to work on the Ask A Question (AAQ) project under your mentorship. As a cloud computing student with a passion for leveraging technology to address social challenges, I believe I bring a unique blend of skills and enthusiasm to the table.

Firstly, my academic background in cloud computing has equipped me with a solid understanding of AWS services, which will be crucial for integrating the voice response API into AAQ's infrastructure on AWS. I am confident in my ability to navigate AWS environments efficiently and effectively.

Moreover, my proficiency in Python aligns well with the project's tech stack, particularly in developing APIs using FastAPI and working with PostgreSQL databases. I have hands-on experience in building backend systems, which will be invaluable for implementing the API endpoints and integrating the TTS service seamlessly into AAQ's core_backend component.

vivekkumarsoni123 commented 2 months ago

Hello @suzinyou, I am thrilled to have the opportunity to work on the Ask A Question (AAQ) project under your mentorship. As a cloud computing student with a passion for leveraging technology to address social challenges, I believe I bring a unique blend of skills and enthusiasm to the table.

Firstly, my academic background in cloud computing has equipped me with a solid understanding of AWS services, which will be crucial for integrating the voice response API into AAQ's infrastructure on AWS. I am confident in my ability to navigate AWS environments efficiently and effectively.

Moreover, my proficiency in Python aligns well with the project's tech stack, particularly in developing APIs using FastAPI and working with PostgreSQL databases. I have hands-on experience in building backend systems, which will be invaluable for implementing the API endpoints and integrating the TTS service seamlessly into AAQ's core_backend component.

I have already developed a voice chatbot in Python using the Gemini API, so it could be useful as a reference too.

lickem22 commented 2 months ago

Hello KANNAN B, Thank you for your interest in AAQ. To be part of this project, you can apply to Code4GovTech's Dedicated Mentoring Program here. However, you are encouraged to raise a PR in addition to your official proposal for us to review.

Best, Carlos Samey

lickem22 commented 2 months ago

Hello Mustafa, Thank you for your interest in AAQ. To be part of this project, you can apply to Code4GovTech's Dedicated Mentoring Program here. However, you are encouraged to raise a PR in addition to your official proposal for us to review.

Best, Carlos Samey

lickem22 commented 2 months ago

Hello Ashutosh, Thank you for your interest in AAQ. To be part of this project, you can apply to Code4GovTech's Dedicated Mentoring Program here. However, you are encouraged to raise a PR in addition to your official proposal for us to review.

Best, Carlos Samey

AbhimanyuSamagra commented 2 months ago

Do not ask process-related questions about how to apply and who to contact in the above ticket. The only questions allowed are about technical aspects of the project itself. If you want help with the process, you can refer to the instructions listed on Unstop, and any further queries can be taken up on our Discord channel titled DMP queries. Here's a Video Tutorial on how to submit a proposal for a project.

Sunilstar-V commented 2 months ago

Hello @lickem22 @amiraliemami, I'm happy to contribute to this project as I have already done many projects on voice recognition and TTS (text-to-speech) libraries. I have also designed a voice assistant that helps resolve user queries on their laptop, like opening apps, browsing, etc. This project is a bit similar to what I did. Also, I think I'm really good at making AAQ applications. Based on my previous experience, I think I'm a good fit for this opportunity. I look forward to making this API useful for the organizations. If you're okay to continue, please assign me so that I can discuss it in detail.

LuciferMorningstar33 commented 2 months ago

Hello @suzinyou , I am thrilled to have the opportunity to work on the Ask A Question (AAQ) project under your mentorship. As a Python developer (DevOps) student with a passion for leveraging technology to address social challenges, I believe I bring a unique blend of skills and enthusiasm to the table.

Firstly, my academic background in cloud computing has equipped me with a solid understanding of AWS services, which will be crucial for integrating the voice response API into AAQ's infrastructure on AWS. I am confident in my ability to navigate AWS environments efficiently and effectively.

Moreover, my proficiency in Python aligns well with the project's tech stack, particularly in developing APIs using FastAPI and working with PostgreSQL databases. I have hands-on experience in building backend systems, which will be invaluable for implementing the API endpoints and integrating the TTS service seamlessly into AAQ's core_backend component.

nitish1804 commented 2 months ago

Hello @suzinyou ! I am thrilled to have the opportunity to work on the Ask A Question (AAQ) project under your mentorship. As a cloud computing student with a passion for leveraging technology to address social challenges, I believe I bring a unique blend of skills and enthusiasm to the table.

Firstly, my academic background in cloud computing has equipped me with a solid understanding of AWS services, which will be crucial for integrating the voice response API into AAQ's infrastructure on AWS. I am confident in my ability to navigate AWS environments efficiently and effectively.

Moreover, my proficiency with databases aligns well with the project's tech stack, particularly working with PostgreSQL databases. I have hands-on experience in building backend systems, which will be invaluable for implementing the API endpoints and integrating the TTS service seamlessly into AAQ's core_backend component.

lickem22 commented 2 months ago

Hello @Sunilstar-V , Thank you for your interest in AAQ. To be part of this project, you can apply to Code4GovTech's Dedicated Mentoring Program here. However, you are encouraged to raise a PR in addition to your official proposal for us to review.

lickem22 commented 2 months ago

Hello @nitish1804 , Thank you for your interest in AAQ. To be part of this project, you can apply to Code4GovTech's Dedicated Mentoring Program here. However, you are encouraged to raise a PR in addition to your official proposal for us to review.

lickem22 commented 2 months ago

Hello @LuciferMorningstar33 , Thank you for your interest in AAQ. To be part of this project, you can apply to Code4GovTech's Dedicated Mentoring Program here. However, you are encouraged to raise a PR in addition to your official proposal for us to review.

ThunderSmoker commented 2 months ago

Hello @suzinyou , I am Aradhya Pitlawar, a third-year undergraduate studying Computer Science and Engineering at Walchand College of Engineering Sangli. I am a proficient Python developer and have a lot of experience in developing APIs. I previously worked on text-to-speech functionality during an internship at a startup and have solid experience in it. Moreover, you can check out my project GIDEON, a rule-based C++ desktop assistant in which I used the eSpeak module for the text-to-speech and speech-to-text functions.

I believe that I can successfully complete this project:

  1. First milestone - Using the gTTS Python library due to its latest stable releases and good community support because of Google's support.
  2. Second milestone - We can use ESPnet-TTS or the eSpeak module itself, depending on how customized we want the responses to be.

I assure you that I will complete this project through C4GT if I get selected!

lickem22 commented 2 weeks ago

Weekly Goals

Week 1

Week 2

Week 3

Week 4

MustafaAkolawala commented 2 weeks ago

Weekly Learnings & Updates

Week 1

Week 2