adorosario / openai-realtime-with-customgpt-poc

POC Using OpenAI Realtime API with CustomGPT for RAG And Twilio Voice
MIT License
7 stars 5 forks

Realtime Voice RAG (via Browser) #3

Open adorosario opened 1 day ago

adorosario commented 1 day ago

The goal is to see whether this real-time Voice RAG can also be supported in the browser -- that is, the voice interaction that currently runs over a Twilio phone call should also work from a browser.

USE CASES

  1. People on smartphones, iPads, and kiosks can have a fully interactive voice experience.

TECH CONSIDERATIONS

  1. Check whether advanced voice mode (similar to our call center) can be done from a web browser. Can a person visit a web page and have this voice experience, similar to the ChatGPT app's advanced voice mode?

  2. This would need to work in any browser. What I am thinking is: if we give our customers a deploy option (like we do with sharable_link and the embed widget), they can deploy it on an iPad or as a button on any website to support voice conversations (e.g. a button that works in any browser).
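For the browser path, one piece that differs from the Twilio flow is audio encoding: the page has to turn microphone samples into whatever the Realtime API accepts. The sketch below is a minimal illustration, not this repo's implementation. It assumes the session uses 16-bit little-endian PCM sent as base64 in `input_audio_buffer.append` events (worth confirming against the current Realtime API reference, including the expected sample rate). The helper names `floatTo16BitPCM`, `toBase64`, and `buildAppendEvent` are hypothetical, and microphone capture (e.g. `getUserMedia` plus an `AudioWorklet`) and the WebSocket or WebRTC transport are left out.

```typescript
// Convert Web Audio float samples in [-1, 1] to 16-bit little-endian PCM bytes.
function floatTo16BitPCM(samples: Float32Array): Uint8Array {
  const bytes = new Uint8Array(samples.length * 2);
  const view = new DataView(bytes.buffer);
  for (let i = 0; i < samples.length; i++) {
    // Clamp first so out-of-range input cannot wrap around.
    const s = Math.max(-1, Math.min(1, samples[i]));
    const v = Math.round(s < 0 ? s * 0x8000 : s * 0x7fff);
    view.setInt16(i * 2, v, true); // true = little-endian
  }
  return bytes;
}

const B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Small standalone base64 encoder so the sketch runs the same in a browser or in Node.
function toBase64(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const has1 = i + 1 < bytes.length;
    const has2 = i + 2 < bytes.length;
    const b0 = bytes[i];
    const b1 = has1 ? bytes[i + 1] : 0;
    const b2 = has2 ? bytes[i + 2] : 0;
    out += B64.charAt(b0 >> 2);
    out += B64.charAt(((b0 & 3) << 4) | (b1 >> 4));
    out += has1 ? B64.charAt(((b1 & 15) << 2) | (b2 >> 6)) : "=";
    out += has2 ? B64.charAt(b2 & 63) : "=";
  }
  return out;
}

// Build the JSON event for one chunk of microphone audio (assumed event shape).
function buildAppendEvent(samples: Float32Array): string {
  return JSON.stringify({
    type: "input_audio_buffer.append",
    audio: toBase64(floatTo16BitPCM(samples)),
  });
}
```

In a page, each audio chunk delivered by the capture node would be passed to `buildAppendEvent` and the resulting string sent over the open connection. The API key should not be shipped to the browser, so a deploy option like the embed widget would most likely still need a small server-side relay or a short-lived token.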

NEXT STEPS

  1. POC for the Web deploy option.

nikhil-swamix commented 5 hours ago

RAG can be implemented in the browser without hitting a backend server (JS vector search). For voice you would have Twilio, or open-source WebRTC with a Whisper model. Our agency can deliver it production-ready in 3 weeks. Swamix global AI solutions, GST registered.
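To make the "JS vector search" idea concrete, here is a minimal sketch of in-browser retrieval by brute-force cosine similarity. It is not the commenter's or this repo's implementation (the repo uses CustomGPT for RAG). It assumes chunk embeddings were computed ahead of time and shipped to the page, and that the query embedding is obtained elsewhere. `Chunk`, `cosineSimilarity`, and `topK` are hypothetical names.

```typescript
// One retrievable unit of text with its precomputed embedding (assumed shape).
interface Chunk {
  text: string;
  embedding: number[];
}

// Cosine similarity between two vectors of equal length; 0 if either is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every chunk against the query and return the k best, highest score first.
function topK(
  query: number[],
  chunks: Chunk[],
  k: number
): Array<{ chunk: Chunk; score: number }> {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

A linear scan like this is usually fine for a few thousand chunks. Larger knowledge bases would need an approximate index or a server-side search, which is part of the latency, cost, and accuracy tradeoff.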

Ready to start today. There will be some tradeoffs between real-time latency, cost, and accuracy depending on the industry domain, to be discussed. Contact here.