dank074 / Discord-video-stream

Experiment for making video streaming work for discord selfbots.

Implement Audio and Video End-to-End Encryption (DAVE) Protocol #102

Open dank074 opened 2 months ago

dank074 commented 2 months ago

Discord announcement:

Last year, we announced that we were experimenting with new encryption protocols and technologies for audio and video calls on Discord. After extensive experimenting, designing, developing, and auditing, we’re excited to announce Discord’s audio and video end-to-end encryption (“E2EE A/V” or “E2EE” for short), which we like to refer to as our DAVE protocol.

Developer Impact

Starting September 2024, Discord is migrating voice and video in DMs, Group DMs, voice channels, and Go Live streams to use end-to-end encryption (E2EE).

Who this affects: Any libraries or apps that support Discord Voice connections.

You are not immediately required to support the E2EE protocol, as calls will automatically upgrade/downgrade to/from E2EE depending on the support of clients in the call.

Implementing E2EE Voice

We have added high-level documentation for Discord's Audio and Video End-to-End Encryption (DAVE) protocol, and the new voice gateway opcodes required to support it.

The most thorough documentation on the DAVE protocol is found in the Protocol Whitepaper. You can also use our open-source library libdave to assist with your implementation. The exact format of the DAVE protocol opcodes is detailed in the Voice Gateway Opcodes section of the protocol whitepaper.

Future Deprecation and Discontinuation of Non-E2EE Voice

Non-E2EE connections to voice in DMs, Group DMs, voice channels, and Go Live streams will eventually be deprecated and discontinued.

In 2025, all official Discord clients will support the protocol and it will be an enforced requirement to connect to the end-to-end encryption-eligible audio/video session types listed above.

Once a timeline for deprecation and discontinuation is finalized, we will share details and developers will have at least six months to implement before we sunset non-E2EE voice connections.

Read more about Discord's Audio and Video End-to-End Encryption (DAVE) protocol:

  - Discord Developer Docs Change Log
  - Meet DAVE: Discord's New End-to-End Encryption for Audio & Video
  - DAVE protocol whitepaper
  - libdave open-source library on GitHub

This can stay on the back burner for now, since it looks like they won't force it until sometime in 2025. Interesting that Stage Channel voice connections aren't mentioned in the encryption-eligible audio/video session types. I guess those are the only ones that won't support E2EE.

longnguyen2004 commented 1 month ago

From what I've seen, the encryption seems to be between the frame/NALU splitter and packetizer, so theoretically nothing needs to be changed in the packetizer (thank god). That introduces another layer of coupling between the WebSocket side and media handling side though, which could become ugly...

If a start code sequence is encountered the nonce is incremented and encryption is re-attempted. This process can repeat up to 10 times until a start code sequence is not encountered in the ciphertext and supplemental protocol data. It must be impossible for a start code to consistently appear in the protocol supplemental data section. In the unlikely event that the maximum number of attempts is reached the frame is dropped and a failure is returned.

This looks very scuffed...let's see how it goes.
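
The retry procedure quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions, not libdave's actual code: `EncryptFn`, `containsStartCode`, and `encryptAvoidingStartCodes` are hypothetical names, the real AEAD cipher is abstracted behind a callback, the callback's output is assumed to already include the supplemental protocol data, and "up to 10 times" is read here as 10 attempts in total.

```typescript
// Hypothetical callback standing in for the real AEAD encryption step.
// Assumption: the returned bytes are ciphertext plus supplemental data.
type EncryptFn = (frame: Uint8Array, nonce: number) => Uint8Array;

interface EncryptResult {
  payload: Uint8Array;
  nonce: number; // the nonce that finally produced a start-code-free payload
}

// Assumption: "up to 10 times" means 10 attempts in total.
const MAX_ATTEMPTS = 10;

// Looks for the 3-byte Annex B start code 00 00 01.
// The 4-byte form 00 00 00 01 ends with the same 3 bytes, so it is covered too.
function containsStartCode(buf: Uint8Array): boolean {
  for (let i = 0; i + 2 < buf.length; i++) {
    if (buf[i] === 0 && buf[i + 1] === 0 && buf[i + 2] === 1) {
      return true;
    }
  }
  return false;
}

// Encrypts a frame, bumping the nonce and retrying whenever the output
// happens to contain a start code. Returns null if every attempt fails,
// in which case the caller is expected to drop the frame.
function encryptAvoidingStartCodes(
  frame: Uint8Array,
  startNonce: number,
  encrypt: EncryptFn,
): EncryptResult | null {
  let nonce = startNonce;
  for (let attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
    const payload = encrypt(frame, nonce);
    if (!containsStartCode(payload)) {
      return { payload, nonce };
    }
    nonce++; // try again with the next nonce
  }
  return null;
}
```

Since the nonce may advance by more than one per frame, whatever component owns the nonce counter would have to continue from the returned `nonce` rather than from `startNonce + 1`.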

longnguyen2004 commented 1 month ago

Unfortunately we don't have much option here...

dank074 commented 1 month ago

OpenMLS is working on adding WASM bindings https://github.com/openmls/openmls/pull/1525

It will be interesting to see what Discord.js does, since they'll also need a TS solution for their voice package

longnguyen2004 commented 4 weeks ago

From the changelog, we'll have at least 6 months to implement the protocol once the deprecation timeline is finalized. Let's hope either OpenMLS or mls-rs will have an npm package by then

See https://discord.com/developers/docs/change-log#future-deprecation-and-discontinuation-of-none2ee-voice

DataM0del commented 4 weeks ago

@dank074 Discord.JS implemented DAVE a while ago. Useful references:

longnguyen2004 commented 4 weeks ago

That's not DAVE, just regular transport security that we're already doing

DataM0del commented 4 weeks ago

Oh....

DataM0del commented 4 weeks ago

> That's not DAVE, just regular transport security that we're already doing

Anyways, there IS a library for DAVE, and it's provided by Discord. Not sure what's stopping you from using it :|

https://github.com/discord/libdave/tree/main/js

DataM0del commented 4 weeks ago

Well, it's not published but...

longnguyen2004 commented 4 weeks ago
  1. It uses native code (C++), and we don't want to have to maintain a native package ourselves (Node native modules are hard)
  2. The code is designed to work together with the voice module in the Discord app, and wouldn't be compatible with this library (or at least would require extensive modifications)

The plan now (for me at least) is to wait for OpenMLS or mls-rs to have an official npm package that we can use. They're both Rust libraries that can be compiled to WASM relatively easily.

DataM0del commented 4 weeks ago
> 1. It uses native code (C++), and we don't want to have to maintain a native package ourselves (Node native modules are hard)
> 2. The code is designed to work together with the voice module in the Discord app, and wouldn't be compatible with this library (or at least would require extensive modifications)
>
> The plan now (for me at least) is to wait for OpenMLS or mls-rs to have an official npm package that we can use. They're both Rust libraries that can be compiled to WASM relatively easily.

@longnguyen2004 No? I linked to the js version of the library, not the C/C++ version of the library. It doesn't link to any native libraries. I went through every file in the JS version, doesn't look like it's linking to anything, just base64-js & @noble/hashes.

longnguyen2004 commented 4 weeks ago

Those are utility functions only; it doesn't contain any actual encryption functions. All of that is done in native code (trust me, I've read through them all)