Which package is this bug report for?
voice
Issue description
Hi everyone, thank you for your time.
I am building a Discord bot that processes and outputs audio in real-time. I take a speaking user's input stream, process it through a WebSocket, and then send the output stream to be played by the Discord bot. However, I'm running into a bug where if I stop speaking, the bot stops outputting audio. Since there is a lag between when I say something and when I receive it from the WebSocket, my desired output never plays fully.
As an example of where I am right now:
user speaks --> input stream is sent to WebSocket --> bot successfully receives processed packets from the WebSocket --> audio output begins to play --> user stops speaking --> audio output immediately ends
When the user continues to speak after that, the bot continues to output from where it left off (so as an example: if the bot is supposed to say "hey what's up" and I stop speaking when the bot says "what's", then when I start speaking again it will make sure to say "up" before continuing).
I can confirm a few things from my testing:
- The input audio stream is never destroyed.
- The speaking user has one WebSocket connection established, and it is never closed until the user leaves the channel.
- The WebSocket is receiving all of the user’s input data (when you stop talking and start again, the bot picks up the translated audio from where it left off, which implies that everything the user said has been processed and sent back in some way).
- My output queue for playing audio goes empty when I stop speaking. When I continue speaking, it is populated again with the correct stream (the leftover from the previous output).
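To make the expected behavior concrete, here is a minimal, self-contained sketch. It is not my actual bot code and does not use @discordjs/voice; `PlaybackQueue`, `enqueue`, `drain`, and `simulate` are hypothetical names made up for illustration. It models what I would expect: chunks that arrive from the WebSocket after the user stops speaking should still be played.

```javascript
// Hypothetical model of the expected behavior; not the bot's real code.
class PlaybackQueue {
  constructor() {
    this.chunks = []; // processed audio chunks waiting to be played
    this.played = []; // chunks that have been "played" so far
  }

  // Called whenever the WebSocket returns a processed chunk.
  enqueue(chunk) {
    this.chunks.push(chunk);
  }

  // Plays everything currently queued, regardless of speaking state.
  drain() {
    while (this.chunks.length > 0) {
      this.played.push(this.chunks.shift());
    }
    return this.played;
  }
}

// Simulate the scenario: the processed output lags behind the input, so the
// last chunk arrives only after the user has already stopped speaking.
function simulate() {
  const queue = new PlaybackQueue();
  let userSpeaking = true;

  queue.enqueue("hey");
  queue.enqueue("what's");
  queue.drain(); // "hey" and "what's" play while the user is speaking

  userSpeaking = false; // the user stops speaking
  queue.enqueue("up"); // the late chunk still arrives from the WebSocket
  const played = queue.drain(); // expected: "up" plays anyway

  return { userSpeaking, played };
}
```

In this model, "up" is played even though the user is no longer speaking. In my bot, that last part only plays once I start speaking again.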
When the user starts speaking I subscribe like so:
Here is my function for processing the audio stream:
My AudioQueue handling looks like this:
Thank you for your time, and forgive me if I have missed something, as I am new to both Discord and JavaScript.
Code sample
No response
Versions
- discord.js: 14.15.3
- node: v22.2.0
- OS: macOS Ventura 13.3
Issue priority
Medium (should be fixed soon)
Which partials do you have configured?
Not applicable
Which gateway intents are you subscribing to?
Guilds, GuildMembers, GuildVoiceStates, GuildMessages, DirectMessages
I have tested this issue on a development release
No response