AllYourBot / hostedgpt

An open version of ChatGPT you can host anywhere or run locally.
MIT License

Finish polishing up Voice mode so the feature can default to on #331

Open krschacht opened 2 months ago

krschacht commented 2 months ago
lumpidu commented 2 months ago

Maybe you could add a model like Silero VAD?

krschacht commented 2 months ago

@lumpidu That one is new to me. Thanks for the tip. Btw, if you want to try it out, my branch is working pretty well. I’m just going to add automated tests and do a little more polish before merging it in.

lumpidu commented 2 months ago

@krschacht You could integrate the model either into the backend via https://github.com/ankane/onnxruntime-ruby or into the frontend via https://onnxruntime.ai/docs/api/js/index.html. There is a browser demo here: https://github.com/ricky0123/vad. I will definitely try out your project!
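
For readers unfamiliar with how a VAD model plugs into a voice feature, here is a minimal sketch of the step that usually follows the model: turning per-frame speech probabilities into speech segments. It assumes the model (Silero VAD or similar) has already produced one probability per audio frame. The names `VadConfig` and `speech_segments`, and all threshold values, are hypothetical and chosen for illustration; they are not taken from hostedgpt, Silero VAD, or either ONNX Runtime binding.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class VadConfig:
    # Illustrative values only; not defaults from any real library.
    start_threshold: float = 0.5   # a frame at or above this opens a segment
    end_threshold: float = 0.35    # a frame below this counts as silence
    min_silence_frames: int = 3    # silent frames in a row needed to close a segment


def speech_segments(probs: List[float], cfg: VadConfig = VadConfig()) -> List[Tuple[int, int]]:
    """Return (start, end) frame index pairs for detected speech, end exclusive."""
    segments: List[Tuple[int, int]] = []
    start = None      # index where the current segment began, or None
    silent_run = 0    # consecutive silent frames seen inside the current segment

    for i, p in enumerate(probs):
        if start is None:
            if p >= cfg.start_threshold:
                start = i
                silent_run = 0
        elif p < cfg.end_threshold:
            silent_run += 1
            if silent_run >= cfg.min_silence_frames:
                # Close the segment at the first frame of the silent run.
                segments.append((start, i - silent_run + 1))
                start = None
                silent_run = 0
        else:
            # Speech resumed before the silence lasted long enough.
            silent_run = 0

    if start is not None:
        # Audio ended mid-segment; drop any trailing silent frames.
        segments.append((start, len(probs) - silent_run))
    return segments


if __name__ == "__main__":
    frame_probs = [0.1, 0.2, 0.9, 0.8, 0.4, 0.9, 0.1, 0.1, 0.1, 0.2, 0.7, 0.6]
    print(speech_segments(frame_probs))
```

Using a lower threshold to end a segment than to start one, plus a minimum silence length, keeps a short pause or a single low-probability frame from cutting the speaker off mid-sentence.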

krschacht commented 2 months ago

@lumpidu This is really cool. I wasn't aware that voice-detection models like this could run client-side. I wonder if it's using WebAssembly under the hood.

I don't think I'd prioritize this in the near term. In case you haven't seen, yesterday I merged a v1 of the voice mode: https://github.com/allyourbot/hostedgpt/discussions/348

I just updated my "voice polish" to-do list at the top of this task based on where I left off yesterday. One notable thing: OpenAI just announced an incredible new voice model that is going to be released "soon". I'm not sure if soon is a couple of weeks or a couple of months :) but I will probably defer some of these tasks on purpose until I can evaluate it. However, I'm using this voice mode daily myself now, so I'm going to keep polishing it so that I can enjoy using it while I wait.

If you're interested in helping with any of this, let me know! I can suggest good tasks, and I can help you ramp up on the implementation. I welcome help! :)