When the user hits the record button, show the transcribed text in the UI. We can show it in the input field, and disable the input field while the user is talking, much like in this demo.
The backend already gets the speech-to-text result and sends it off to eventually get a response from OpenAI. We just need to show that text in the UI.
The record button currently records for 4 seconds, and then we get the whole text result (unlike the demo above, which streams the text piece by piece). For now, we can just show the whole text result at once. Then, after 5, we will likely be able to stream the text as the user talks, to make it more real-time.
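
To make the intended behavior concrete, here is a rough sketch of the input field's state changes around a recording. It is not taken from the existing code: `InputState`, `RecordEvent`, and `nextInputState` are hypothetical names, and it assumes the frontend gets notified when recording starts and when the backend's whole text result arrives.

```typescript
// Hypothetical model of the text input; names are illustrative, not from the codebase.
interface InputState {
  value: string;
  disabled: boolean;
}

// Assumed events: recording begins, the whole transcript arrives, or recording fails.
type RecordEvent =
  | { type: "recordingStarted" }
  | { type: "transcriptReceived"; text: string }
  | { type: "recordingFailed" };

function nextInputState(state: InputState, event: RecordEvent): InputState {
  switch (event.type) {
    case "recordingStarted":
      // Lock the field while the user is talking; keep whatever it currently shows.
      return { ...state, disabled: true };
    case "transcriptReceived":
      // Show the whole speech-to-text result at once and unlock the field.
      return { value: event.text, disabled: false };
    case "recordingFailed":
      // Unlock the field without changing its contents.
      return { ...state, disabled: false };
  }
}
```

Keeping this as a pure function means a later streaming version could probably add a partial-transcript event without touching the rendering code, though that depends on how the streaming results end up being delivered.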