nicolamassarenti / meta-assistant

An MVP that uses Google STT, OpenAI LLM, Nvidia Audio2Face


[Feature] - Record audio with streaming approach #1

Open nicolamassarenti opened 1 year ago

nicolamassarenti commented 1 year ago

Description

Audio is currently recorded until one of these two conditions is true:

The length is MAX seconds
The user pressed SPECIAL_KEY + ENTER

The desired behavior is that the audio is recorded in a streaming fashion.
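A minimal sketch of what "streaming" could mean here, assuming audio arrives as an iterable of samples. The names `stream_audio`, `chunk_size`, `max_samples`, and `should_stop` are hypothetical and not part of meta-assistant; `max_samples` stands in for the MAX seconds limit and `should_stop` for the SPECIAL_KEY + ENTER check. Instead of returning one recording at the end, chunks are yielded as soon as they fill up, so a downstream speech-to-text step could start working before the user finishes talking.

```python
from typing import Callable, Iterable, Iterator, List


def stream_audio(
    source: Iterable[int],
    chunk_size: int,
    max_samples: int,
    should_stop: Callable[[], bool] = lambda: False,
) -> Iterator[List[int]]:
    """Yield fixed-size chunks of samples as soon as each chunk is full.

    Recording ends when `max_samples` samples have been consumed, when
    `should_stop()` returns True, or when `source` is exhausted. Any
    partial chunk left at that point is yielded last.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")

    buffer: List[int] = []
    seen = 0
    for sample in source:
        # Same two stop conditions as the current, non-streaming recorder.
        if seen >= max_samples or should_stop():
            break
        buffer.append(sample)
        seen += 1
        if len(buffer) == chunk_size:
            yield buffer
            buffer = []

    # Flush whatever is left so no audio is dropped.
    if buffer:
        yield buffer


# Usage: a fake source of 100 samples, capped at 10 samples, 4 per chunk.
chunks = list(stream_audio(range(100), chunk_size=4, max_samples=10))
```

In a real recorder the `source` would be a microphone callback or queue rather than a `range`, but the consumer side would look the same: iterate over chunks and forward each one as it arrives.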

AsyncSan commented 1 year ago

How would you handle multiple calls one after another? If there were background voices, or the user took a "too long" thinking pause before talking again, wouldn't the messages overlap, so that both the assistant and the user lose track of what is being replied to?

nicolamassarenti commented 1 year ago

Hey @AsyncSan ,

meta-assistant with streaming capabilities could handle a more linear conversation, with fewer thinking pauses from the assistant. It's also true that the current implementation requires the user and the assistant to sync.

As for background voices, the risk is indeed there. However, a good Speech-to-Text model could definitely reduce the impact of noise.

Let me know if this helps.
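For the "too long" thinking pause raised earlier in the thread, one common approach (not something the project is confirmed to use) is to split the stream into utterances whenever enough consecutive chunks are quiet. The sketch below is a naive energy-based version; `split_utterances`, `silence_threshold`, and `max_silent_chunks` are hypothetical names, and the threshold values are arbitrary. It does not address background voices, which would need a proper voice-activity or speech-to-text model.

```python
from typing import Iterable, Iterator, List, Sequence


def split_utterances(
    chunks: Iterable[Sequence[int]],
    silence_threshold: float,
    max_silent_chunks: int,
) -> Iterator[List[Sequence[int]]]:
    """Group audio chunks into utterances separated by long pauses.

    A chunk counts as silent when its mean absolute amplitude is below
    `silence_threshold`. An utterance is closed once `max_silent_chunks`
    silent chunks occur in a row. Silent chunks themselves are dropped,
    so short pauses inside an utterance do not split it.
    """
    current: List[Sequence[int]] = []
    silent_run = 0
    for chunk in chunks:
        energy = sum(abs(s) for s in chunk) / len(chunk) if chunk else 0.0
        if energy < silence_threshold:
            silent_run += 1
            # A long enough pause ends the utterance in progress.
            if current and silent_run >= max_silent_chunks:
                yield current
                current = []
        else:
            silent_run = 0
            current.append(chunk)

    # Emit the last utterance if the stream ended mid-speech.
    if current:
        yield current


# Usage: a short pause is tolerated, a two-chunk pause splits the stream.
loud, quiet = [100, -100], [1, -1]
utterances = list(
    split_utterances(
        [loud, quiet, loud, quiet, quiet, loud],
        silence_threshold=10,
        max_silent_chunks=2,
    )
)
```

Each yielded utterance could then be sent to the assistant as one message, so a reply is always tied to a single, clearly bounded piece of speech.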