security-union / videocall-rs

teleconference system written in rust
MIT License

Thoughts about future communication protocol #140

Open ronen opened 10 months ago

ronen commented 10 months ago

Following up from #118, here are some thoughts about things to put into a future version of the protocol. Putting them out here for discussion...

Explicit "join" and "leave" messages for each participant:

Currently the app only realizes that there's a new participant because messages start coming in, and the app never knows when a participant leaves. Having explicit "join" and "leave" messages would make it easier to have apps handle participants' coming and going with a nice UX.

The "join" message could potentially include useful arbitrary metadata, such as the participant's full name, location, time zone, avatar, etc.
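A rough sketch of how explicit join/leave handling could look on the receiving side. Everything here is hypothetical: `ParticipantEvent`, `Roster`, and the string-keyed metadata map are illustration names, not existing videocall-rs types, and the wire encoding is left out entirely.

```rust
use std::collections::HashMap;

/// Hypothetical control messages (assumption: not part of the current protocol).
#[derive(Debug, Clone, PartialEq)]
pub enum ParticipantEvent {
    /// Sent once when a participant enters; metadata is free-form key/value pairs.
    Join { peer_id: String, metadata: HashMap<String, String> },
    /// Sent once when a participant goes away.
    Leave { peer_id: String },
}

/// Tracks who is in the call, driven only by explicit events.
#[derive(Debug, Default)]
pub struct Roster {
    peers: HashMap<String, HashMap<String, String>>,
}

impl Roster {
    /// Applies one event. Returns true if a participant was added or removed,
    /// so the UI knows when to redraw. A repeated Join only refreshes metadata.
    pub fn apply(&mut self, event: ParticipantEvent) -> bool {
        match event {
            ParticipantEvent::Join { peer_id, metadata } => {
                self.peers.insert(peer_id, metadata).is_none()
            }
            ParticipantEvent::Leave { peer_id } => self.peers.remove(&peer_id).is_some(),
        }
    }

    /// True while the participant has joined and not yet left.
    pub fn is_present(&self, peer_id: &str) -> bool {
        self.peers.contains_key(peer_id)
    }

    /// Looks up one metadata value (e.g. "name") for a present participant.
    pub fn metadata(&self, peer_id: &str, key: &str) -> Option<&str> {
        self.peers.get(peer_id)?.get(key).map(String::as_str)
    }
}
```

With something like this, the app would add a tile on `Join` and remove it on `Leave`, instead of inferring presence from incoming media.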

Arbitrary data sources or "channels" (better name?) for each participant:

Currently the code is hardwired for each participant to have a video channel, an audio channel, and a screen-share channel. There's some awkwardness around the screen-share channel not necessarily being there, and there's no cleanup when a channel goes away.

Instead, what if the protocol allowed arbitrary data channels, so that different applications could use different data channels? Some applications might not do screen sharing; some might allow a user to share more than one window simultaneously; some might allow a user to play a video; etc.

Each channel could have explicit "start", "pause", and "end" messages. The "start" message would include an identifier, the data type (audio, video, etc.; see below), and potentially other useful arbitrary metadata like codec, resolution, etc. The "pause" message would allow the app to have UX showing when remote peers have paused. The "end" message would allow the app to clean up.

As far as the protocol is concerned, the identifier could be an arbitrary string. Any given app would use whatever identifiers made sense to it. And the app would likewise know what metadata it was interested in sending/receiving.
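To make the channel lifecycle concrete, here is a minimal sketch of the three messages and the per-peer bookkeeping they would drive. The names `ChannelMessage`, `ChannelState`, and `ChannelTable` are made up for illustration, and treating a second "start" as a resume is an assumption, since the proposal doesn't say how a paused channel continues.

```rust
use std::collections::HashMap;

/// Hypothetical lifecycle messages for one channel of one participant.
#[derive(Debug, Clone, PartialEq)]
pub enum ChannelMessage {
    /// `id` and `data_type` are opaque strings as far as the protocol cares;
    /// `metadata` carries app-defined extras such as codec or resolution.
    Start { id: String, data_type: String, metadata: HashMap<String, String> },
    Pause { id: String },
    End { id: String },
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ChannelState {
    Active,
    Paused,
}

/// Channels currently known for a single remote participant.
#[derive(Debug, Default)]
pub struct ChannelTable {
    // channel id -> (data type, current state)
    channels: HashMap<String, (String, ChannelState)>,
}

impl ChannelTable {
    pub fn apply(&mut self, msg: ChannelMessage) {
        match msg {
            // Assumption: a Start for an existing id (re)activates it.
            ChannelMessage::Start { id, data_type, .. } => {
                self.channels.insert(id, (data_type, ChannelState::Active));
            }
            // Pause on an unknown id is ignored.
            ChannelMessage::Pause { id } => {
                if let Some(entry) = self.channels.get_mut(&id) {
                    entry.1 = ChannelState::Paused;
                }
            }
            // End drops the entry so the app can release decoders, UI, etc.
            ChannelMessage::End { id } => {
                self.channels.remove(&id);
            }
        }
    }

    /// Current state of a channel, or None if it never started or has ended.
    pub fn state(&self, id: &str) -> Option<ChannelState> {
        self.channels.get(id).map(|(_, state)| *state)
    }

    /// Number of channels that have started and not yet ended.
    pub fn len(&self) -> usize {
        self.channels.len()
    }
}
```

Because ids are arbitrary strings, an app that shares two windows at once could simply start two channels, say "screen-1" and "screen-2", without any protocol change.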

Other data types for channels:

In addition to audio & video channels, I could imagine supporting/allowing other channel data types. Such as:

Actually, does the protocol even care what the data types are? Or is the data type really just an arbitrary string to be interpreted by the app? As long as an app knows how to serialize/encode the data into MediaPackets to send and how to parse/decode them when receiving, it should be fine, I think? That said, the ecosystem would presumably include standard common data types with encoders/decoders for each.
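One way to picture "the data type is just a string the app interprets" is a decoder registry keyed by that string. This is only a sketch: `Decoder`, `DecoderRegistry`, and `decode_text` are hypothetical names, the payload is assumed to be the raw bytes carried by a packet, and decoding to a `String` stands in for whatever typed output a real decoder would produce.

```rust
use std::collections::HashMap;

/// Hypothetical decoder signature: raw payload bytes in, decoded value out.
pub type Decoder = fn(&[u8]) -> Result<String, String>;

/// Example decoder for a "text" data type: the payload must be valid UTF-8.
pub fn decode_text(payload: &[u8]) -> Result<String, String> {
    String::from_utf8(payload.to_vec()).map_err(|e| e.to_string())
}

/// Maps data-type strings to decoders. The protocol layer never inspects the
/// string; only the app (or a shared library of common types) registers meaning.
#[derive(Default)]
pub struct DecoderRegistry {
    decoders: HashMap<String, Decoder>,
}

impl DecoderRegistry {
    /// Registers (or replaces) the decoder for a data type.
    pub fn register(&mut self, data_type: &str, decoder: Decoder) {
        self.decoders.insert(data_type.to_string(), decoder);
    }

    /// Decodes a payload, or reports that this app doesn't know the type.
    pub fn decode(&self, data_type: &str, payload: &[u8]) -> Result<String, String> {
        match self.decoders.get(data_type) {
            Some(decoder) => decoder(payload),
            None => Err(format!("no decoder registered for data type '{data_type}'")),
        }
    }
}
```

An app that only cares about text would register just `"text"` and get a clear error for any channel type it doesn't understand, which it could then ignore.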

...

What do you think?

darioalessandro commented 10 months ago

"Explicit "join" and "leave" messages for each participant:" this is a must!!

darioalessandro commented 10 months ago

files, text etc

I would love that, just today I needed text :(

darioalessandro commented 10 months ago

@ronen @griffobeid is pivoting to focus on this for the next week.

Can you guys connect? We can divvy up the work here.

darioalessandro commented 10 months ago

We strongly feel that we are on to something big here!!