Closed winstxnhdw closed 9 months ago
Discord's voice infra requires you to send an event over the gateway, then receive two back before you can actually connect to the voice server and start communicating. That would require a persistent gateway connection which doesn't sound serverless.
Agreed, you need a persistent gateway connection to connect to a voice server in the first place. This appears to be incompatible with webhook-style Interactions, and I don't think you can access your voice session ID and token via REST API. That being said, you could probably shoehorn a time-limited voice connection into a Lambda.
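To make the handshake above concrete, here is a minimal sketch in plain Rust with no Discord library involved. It assumes the two replies are the Voice State Update event (which carries the session ID) and the Voice Server Update event (which carries the token and endpoint). `GatewayEvent`, `PendingHandshake` and `VoiceCredentials` are hypothetical names for illustration, not songbird or Discord types.

```rust
/// Hypothetical bundle of what a voice connection needs once the
/// gateway handshake has completed.
#[derive(Debug, Clone, PartialEq)]
pub struct VoiceCredentials {
    pub endpoint: String,
    pub session_id: String,
    pub token: String,
}

/// The two gateway events that answer a voice state update request.
pub enum GatewayEvent {
    /// Carries the session ID for our own voice state.
    VoiceStateUpdate { session_id: String },
    /// Carries the voice server endpoint and the connection token.
    VoiceServerUpdate { endpoint: String, token: String },
}

/// Accumulates events until both halves of the handshake have arrived.
#[derive(Default)]
pub struct PendingHandshake {
    session_id: Option<String>,
    server: Option<(String, String)>,
}

impl PendingHandshake {
    /// Records one event; returns the credentials once both have been seen.
    /// The events may arrive in either order.
    pub fn observe(&mut self, event: GatewayEvent) -> Option<VoiceCredentials> {
        match event {
            GatewayEvent::VoiceStateUpdate { session_id } => {
                self.session_id = Some(session_id);
            }
            GatewayEvent::VoiceServerUpdate { endpoint, token } => {
                self.server = Some((endpoint, token));
            }
        }
        match (&self.session_id, &self.server) {
            (Some(session_id), Some((endpoint, token))) => Some(VoiceCredentials {
                endpoint: endpoint.clone(),
                session_id: session_id.clone(),
                token: token.clone(),
            }),
            _ => None,
        }
    }
}
```

Something has to stay connected to the gateway long enough to feed both events into `observe`, which is the part that does not fit the webhook model.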
The next branch makes the 'AWS Lambda' part a little more technically feasible by removing the use of the ffmpeg binary, and opus is never called as a binary so that's a non-issue. yt-dlp or youtube-dl are still essential in turning most user-given links into a usable audio URL. I've never run AWS Lambdas, but you might be able to move yt-dlp in via a container image. The main network limitation is that it disallows inbound network connections, which shouldn't break songbird voice connections. If you write a program which just creates a Driver (using all fields in ConnectionInfo) and takes a list of URLs, that should work. You will likely need a VPS-style bot to do that, though -- if you prod around the REST API and can access voice-state and voice-server somehow, then feel free to update us.
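The "takes a list of URLs" program could be shaped roughly like this. It is a sketch only: the `Player` trait is a hypothetical stand-in for whatever actually plays audio (a songbird Driver, for instance) and is not part of songbird's API. The time budget is an assumption added here to reflect a Lambda-style execution limit.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the component that plays audio.
/// This trait is NOT part of songbird's API.
pub trait Player {
    /// Plays one URL to completion, or stops early once `limit` elapses.
    fn play(&mut self, url: &str, limit: Duration);
}

/// Plays URLs in order until the list or the time budget is exhausted.
/// Returns how many tracks were started.
pub fn play_within_budget<P: Player>(
    player: &mut P,
    urls: &[&str],
    budget: Duration,
) -> usize {
    let deadline = Instant::now() + budget;
    let mut started = 0;
    for url in urls {
        // Time left before the deadline; zero once it has passed.
        let remaining = deadline.saturating_duration_since(Instant::now());
        if remaining.is_zero() {
            break;
        }
        player.play(url, remaining);
        started += 1;
    }
    started
}
```

Passing the remaining time into `play` lets the last track be cut off cleanly instead of the whole process being killed partway through.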
I am thinking of a VPS-style bot that wakes up through Discord interactions and lives for 15 minutes at most.
> Discord's voice infra requires you to send an event over the gateway, then receive two back before you can actually connect to the voice server and start communicating. That would require a persistent gateway connection which doesn't sound serverless.
If this is over REST, Lambda can definitely block the main thread and wait for the response before connecting to the voice server.
> I've never run AWS Lambdas, but you might be able to move yt-dlp in via a container image. The main network limitation is that it disallows inbound network connections, which shouldn't break songbird voice connections.
I will look into this but I suspect that this may be difficult because a Rust function has to reside in their special Amazon Linux 2 containers.
> The main network limitation is that it disallows inbound network connections, which shouldn't break songbird voice connections.
This should not be a problem. There is an option to block all concurrent requests to the Lambda function during its execution.
> If this is over REST
The main problem is that, at least using docs, this is not over REST and requires a full websocket, which will stop webhook Interactions from being fired as far as I can tell.
> The main problem is that, at least using docs, this is not over REST and requires a full websocket, which will stop webhook Interactions from being fired as far as I can tell.
I could use two Lambda functions. One to handle the webhook interactions and the other to maintain the websocket.
If I recall correctly, webhook interactions aren't sent if there is an active websocket connection, but that might have changed
> If I recall correctly, webhook interactions aren't sent if there is an active websocket connection, but that might have changed
If that's the case, maybe setting up a second 'helper' bot that runs on a separate Lambda instance might help.
I think this is/was an interesting thought experiment, but is probably beyond what we want to work on.
Not too long ago, Discord released their Interactions API, which led to the birth of serverless Discord bots for performing simple tasks. This allowed developers to offload the maintenance cost of running a home server or a VPS to AWS's generous free tier.
I would like to take a step further and integrate songbird into my AWS Lambda function. However, the library depends on opus, ffmpeg and youtube-dl. Is it possible for songbird to ship a binary with the above dependencies? If not, do you know of a direction I can look at to do it myself?
AWS Lambda functions have a maximum compute time of 15 minutes. That's enough to play about 5 songs of average length in a single execution. Implementing this feature would drastically reduce the cost of owning a Discord bot, and for most private servers, the maintenance cost would be nothing.
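The "5 songs in 15 minutes" estimate works out if a song averages about 3 minutes, which is an assumption here, not a measured figure. A small sketch of the arithmetic follows. `overhead_secs` is a hypothetical allowance for startup and connecting to voice.

```rust
/// Lambda's maximum execution time as stated above: 15 minutes.
pub const LAMBDA_LIMIT_SECS: u64 = 15 * 60;

/// Counts how many tracks, played back to back, finish within the limit.
/// `overhead_secs` is a hypothetical allowance for startup and connecting.
pub fn tracks_that_fit(track_secs: &[u64], overhead_secs: u64) -> usize {
    let mut used = overhead_secs;
    let mut count = 0;
    for &secs in track_secs {
        // Stop at the first track that would run past the limit.
        if used.saturating_add(secs) > LAMBDA_LIMIT_SECS {
            break;
        }
        used += secs;
        count += 1;
    }
    count
}
```

Five 180-second tracks fill the 900-second limit exactly, so any connection overhead at all drops the count to four.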