daily-demos / llm-talk

Talk to GPT-4 and create a story together.

client-side conversation flow logic #12

Closed: kylemcdonald closed this issue 10 months ago

kylemcdonald commented 11 months ago

Hi! The writeup on this app says, "On the client side, we're using JavaScript audio APIs to monitor the input level of the microphone to determine when the user has started and stopped talking." But I looked through the codebase and can't find anything like that.

I also checked whether the speaker object is evaluated for audio levels, but that doesn't seem to be the case either.

As far as I can tell, there are two conditions that trigger an endpoint (sketched in code below):

  1. `re.search(r'[\.\!\?]$', self.transcription)` matches, i.e. the transcription ends in sentence-final punctuation.
  2. More than 5 seconds have passed since the last transcription fragment.

Can you confirm that I'm not missing anything? If there is another demo somewhere that shows an example of monitoring input level, that would be very helpful. Thanks.
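
To make that concrete, here is a minimal sketch of those two conditions as a single endpoint check; the function shape, the `last_fragment_time` bookkeeping, and the timeout constant are my own framing rather than the repo's actual code:

```python
import re
import time

FRAGMENT_TIMEOUT_S = 5.0  # assumed: seconds of no new fragments before endpointing

def should_endpoint(transcription: str, last_fragment_time: float) -> bool:
    # Condition 1: the transcription ends in sentence-final punctuation.
    if re.search(r'[\.\!\?]$', transcription):
        return True
    # Condition 2: it has been 5 seconds since the last fragment arrived.
    if time.monotonic() - last_fragment_time >= FRAGMENT_TIMEOUT_S:
        return True
    return False
```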

kylemcdonald commented 10 months ago

For anyone else looking into this, I ended up implementing silero-vad on the backend. It's a little tricky to get everything right, because the timing of Deepgram results and Silero can fall out of sync due to various buffering, resampling, and latency effects. It's also not ideal to have a whole neural network running just to get reliable endpointing. But I realize this is a complex problem with domain-dependent solutions; maybe once https://github.com/daily-co/daily-python/issues/11 is closed it will be slightly easier.
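
As a rough illustration of that approach, here is a minimal streaming sketch built on the public silero-vad torch.hub interface; the chunk size, speech threshold, and silence window are illustrative assumptions showing the bookkeeping, not the exact values or structure used in this project:

```python
import torch

# Load the pretrained Silero VAD model via torch.hub (downloads on first run).
model, utils = torch.hub.load('snakers4/silero-vad', 'silero_vad')

SAMPLE_RATE = 16000
CHUNK_SAMPLES = 512        # assumed: Silero expects 512-sample chunks at 16 kHz
SPEECH_THRESHOLD = 0.5     # assumed: probability above which a chunk counts as speech
ENDPOINT_SILENCE_S = 0.8   # assumed: trailing silence that ends an utterance

class Endpointer:
    def __init__(self):
        self.speaking = False
        self.silence_s = 0.0

    def process(self, chunk: torch.Tensor) -> bool:
        """Feed one mono float32 chunk; return True when an endpoint is detected."""
        prob = model(chunk, SAMPLE_RATE).item()
        if prob >= SPEECH_THRESHOLD:
            # Speech resets the silence counter.
            self.speaking = True
            self.silence_s = 0.0
            return False
        if self.speaking:
            # Accumulate silence only after speech has started.
            self.silence_s += CHUNK_SAMPLES / SAMPLE_RATE
            if self.silence_s >= ENDPOINT_SILENCE_S:
                self.speaking = False
                self.silence_s = 0.0
                return True
        return False
```

The hard part mentioned above remains: aligning these VAD endpoints with Deepgram's transcription results, since each pipeline buffers and resamples on its own schedule.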