jitsi / jitsi-videobridge

Jitsi Videobridge is a WebRTC-compatible video router, or SFU, that lets you build highly scalable video conferencing infrastructure (i.e., up to hundreds of conferences per server).
https://jitsi.org/jitsi-videobridge
Apache License 2.0

Question about the SFU functionality #651

Open Metsovone opened 6 years ago

Metsovone commented 6 years ago

Is there a way to externally control the selective forwarding functionality? i.e. which streams are forwarded to which clients via XMPP or something similar?

I know there is the last-N functionality, but what we are trying to achieve is to have a controller application that communicates with the videobridge via XMPP and the focus user, and which tells the videobridge which streams to forward to whom.

So, if for instance, you have Alice, Bob, Carol and Daniel connected, Alice might be receiving Bob's and Carol's stream, Bob will receive Alice's only, Carol might see Alice and Daniel, etc.

The main point of this is to be able to:

  a. Quickly switch which streams a specific client is receiving
  b. Have different streams forwarded to different clients
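A minimal sketch of the per-client forwarding table described above, as a controller application might hold it. The names `set_forwarding` and `receivers_of` are hypothetical and nothing here is part of the videobridge; it only illustrates the data structure.

```python
from typing import Dict, Iterable, Set

# Hypothetical controller-side table: receiver -> senders whose streams it should get.
ForwardingTable = Dict[str, Set[str]]


def set_forwarding(table: ForwardingTable, receiver: str, senders: Iterable[str]) -> None:
    """Replace the whole set of senders forwarded to one receiver (a quick switch)."""
    # A client is never forwarded its own stream.
    table[receiver] = set(senders) - {receiver}


def receivers_of(table: ForwardingTable, sender: str) -> Set[str]:
    """Return every receiver that currently gets this sender's stream."""
    return {receiver for receiver, senders in table.items() if sender in senders}


table: ForwardingTable = {}
set_forwarding(table, "alice", {"bob", "carol"})
set_forwarding(table, "bob", {"alice"})
set_forwarding(table, "carol", {"alice", "daniel"})
```

Switching what Alice receives is then a single `set_forwarding(table, "alice", ...)` call; how that decision reaches the bridge is the open question of this issue.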

bbaldino commented 6 years ago

This isn't currently possible in the bridge. There is a 'follow me' mode in the jitsi meet web ui which lets the moderator select specific streams to pin to the main stage and propagates that choice to other participants in the call, but that's about as close to your use case as jitsi currently gets. I do think the explicit control you talk about would be interesting and I do think we'd be open to a PR which added that functionality.

Metsovone commented 6 years ago

We've been looking at how this functionality may be implemented. It would be good to get your feedback.

We're using our own UWP client prototype. Eventually we'll port this to JS. We have complete control over the Jingle and colibriClass messages, and over how the SDP is processed. Our goal is to have the JVB forward only a personalised subset of media streams from the moment the client is active, and to subsequently be able to change that subset during the course of the client session. We can control the message flow to the client, so it knows which streams to request.

In terms of the JVB code, it looks like what we're trying to achieve is to get BitrateController::prioritise to allocate only those streams. We're not concerned with Last-N, only the creation of an array of received streams for each EndPoint.

The strategies we're considering are:

  1. Using colibriChannel to set the selectedEndpoints property. We're not sure if this can even be done, but it can't be done at all until the WebRTC data channel is established, which is too late for our purposes, as we want the Endpoint to start up with a limited selection of media streams.
  2. Dynamically editing the Jingle/SDP, so that on session-initiate and source-add, we manipulate the SDP so it receives only the nominated streams (dynamically creating the appropriate or a=ssrc elements). If we did this, we'd need to know that only selected media streams are being forwarded.
  3. Making a PR that implements a new class of Jingle messages to control how the JVB allocates streams in BitrateController.

Would any of these be preferable, or possible? It would be helpful to know any blockers in advance.

bbaldino commented 6 years ago

> We've been looking at how this functionality may be implemented. It would be good to get your feedback.
>
> We're using our own UWP client prototype. Eventually we'll port this to JS. We have complete control over the Jingle and colibriClass messages, and over how the SDP is processed. Our goal is to have the JVB forward only a personalised subset of media streams from the moment the client is active, and to subsequently be able to change that subset during the course of the client session. We can control the message flow to the client, so it knows which streams to request.
>
> In terms of the JVB code, it looks like what we're trying to achieve is to get BitrateController::prioritise to allocate only those streams. We're not concerned with Last-N, only the creation of an array of received streams for each EndPoint.

Yes I do think this is the right track. Though you may need to use 'pinnedEndpoints' instead of selected (or even both)--I'd have to take a closer look to remind myself of the interactions/implications of using one vs the other.

> The strategies we're considering are:
>
> Using colibriChannel to set the selectedEndpoints property. We're not sure if this can even be done, but it can't be done at all until the WebRTC data channel is established, which is too late for our purposes, as we want the Endpoint to start up with a limited selection of media streams.

It's unfortunate this won't work, as I believe it would allow this to be done without any changes at all. Could you just not render any video locally until the connection is established and you can issue the command? What is actually critical here: that the receiver must not RECEIVE any data from a sender it doesn't want to show, or merely that it must not RENDER any data from a sender it doesn't want to show? Another option would be to join in audio only, so no video streams are received, and then switch to receiving video once you are able to send this message.

> Dynamically editing the Jingle/SDP, so that on session-initiate and source-add, we manipulate the SDP so it receives only the nominated streams (dynamically creating the appropriate or a=ssrc elements). If we did this, we'd need to know that only selected media streams are being forwarded.

Doing this at the SDP/Jingle level would be tough and would definitely require something completely custom.

> Making a PR that implements a new class of Jingle messages to control how the JVB allocates streams in BitrateController.

Again I think doing this over Jingle would not be desirable, as the datachannel/websocket control channel would be the desired use for this sort of stuff.

> Would any of these be preferable, or possible? It would be helpful to know any blockers in advance.

Metsovone commented 6 years ago

After some thought, we're going to try the SDP editing at the client level, as it is important for us that clients do NOT RECEIVE all streams, rather than that they do not display/hear them.

The adding/removing sources dynamically seems to work, but we still haven't verified that removed sources actually stop being forwarded to clients. If it turns out they are still forwarded, we will have to revisit one of the other options.

To explain a bit further, here's a version of our scenario: a large number of people are connected, and not everyone has a very good connection. They are split into groups, where they only see and hear the other members of their group. However, a moderator should be able to do the following via the controller application:

  1. Move people between groups and
  2. Broadcast one or more specific streams to everyone.

As you can tell, therefore, we don't want a client to receive all streams, when they only need to receive a small subset of those. We cannot use different conferences for each group either, as then sending some streams to everyone would be very complicated and so would moving people between groups.
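The group-plus-broadcast scenario above can be sketched as a small function that computes which senders each client should receive. The function names and the group and participant names are hypothetical; this models only the controller's bookkeeping, not anything the bridge does.

```python
from typing import Dict, Set


def streams_for(receiver: str, groups: Dict[str, Set[str]], broadcast: Set[str]) -> Set[str]:
    """Senders a receiver should get: its own group plus anything broadcast to everyone."""
    # Find the (first) group containing the receiver; an ungrouped client gets broadcasts only.
    own_group = next((members for members in groups.values() if receiver in members), set())
    return (set(own_group) | set(broadcast)) - {receiver}


def move(groups: Dict[str, Set[str]], participant: str, target: str) -> None:
    """Move a participant into the target group, removing it from any other group."""
    for members in groups.values():
        members.discard(participant)
    groups.setdefault(target, set()).add(participant)
```

Because every client stays in one conference, moving someone between groups or broadcasting a stream is just a change to `groups` or `broadcast`, after which the per-client sets are recomputed.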

If you have any feedback or advice, please let me know. Otherwise, I will update this issue once we know if our approach works, in case other people are interested in something similar.

jitsi-developers commented 6 years ago

On 25/04/2018 05:34, Metsovone wrote:

> After some thought, we're going to try the SDP editing at the client level, as it is important for us that clients do NOT RECEIVE all streams, rather than that they do not display/hear them.

Note that client side SDP changes alone will NOT change the streams that the client receives.

Boris

Metsovone commented 6 years ago

Thank you. That's good to know.

However, that still leaves us with the question of the best way to proceed. Given that we will be using our own client and can also have separate communication between the clients, is there a built-in way to control what streams each client receives (we would like to have complete control over that)? Or is the only way to achieve that to delve into the code and then do a pull request?

bbaldino commented 6 years ago

@bgrozev could they set lastN to 0 to receive no video, and then manually pin the endpoints they want? That may work for video. But from your description it looks like you effectively want "breakout meetings" (totally separate conferences within a single conference)? If so, audio is going to be a bigger problem, since presumably you want that totally separate as well and we always blindly forward all audio to all participants.
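A minimal sketch of the "lastN to 0, then pin" idea as the JSON messages a client might send over the data channel once it is open. The `colibriClass` values and field names used here (`LastNChangedEvent`, `lastN`, `PinnedEndpointsChangedEvent`, `pinnedEndpoints`) are assumptions about the bridge's message format and should be checked against the bridge version in use. Whether pinned endpoints are actually forwarded when lastN is 0 is not confirmed in this thread.

```python
import json
from typing import Iterable, List


def last_n_message(last_n: int) -> str:
    """Build a last-N change message (message and field names are assumed, see above)."""
    return json.dumps({"colibriClass": "LastNChangedEvent", "lastN": last_n})


def pinned_endpoints_message(endpoint_ids: Iterable[str]) -> str:
    """Build a pinned-endpoints change message (names are assumed, see above)."""
    return json.dumps({
        "colibriClass": "PinnedEndpointsChangedEvent",
        "pinnedEndpoints": sorted(endpoint_ids),  # sorted only to keep output deterministic
    })


def startup_messages(wanted_endpoint_ids: Iterable[str]) -> List[str]:
    """Messages to send in order: first request no video, then pin the wanted endpoints."""
    return [last_n_message(0), pinned_endpoints_message(wanted_endpoint_ids)]
```

This covers video only; as noted above, audio is forwarded to all participants regardless.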

Metsovone commented 6 years ago

Ah good idea. We did consider that but thought that maybe lastN limits the total number of streams that a client can receive. We haven't really studied the code well enough yet.

Your point about the audio is also valid, we'll have to think about that.

bbaldino commented 6 years ago

Yeah, I can't remember if pinned endpoints are allowed to override the last-n value or not. But, either way, if audio is a problem you're definitely looking at bigger changes. I think the idea of being able to do 'last-n' for audio would be interesting. And at that point, presumably 'pinning' a source could apply to both audio and video...then maybe you'd be able to get what you want with a combination of last-n and pinning manipulation.

soq2000 commented 6 years ago

Dear SFU hackers :-) I am also interested in this kind of control, where only a subset of streams is sent to some predefined clients. Please keep me up to date with your progress. Concerning Boris's answer about SDP: as I understand it, in the case of RTSP, if the SDP is modified (the number of stream IDs reduced), then the client receives fewer streams (for lack of information). Or are the streams here pushed to the client, not at all in the manner of RTSP? Thanks

bgrozev commented 6 years ago

> Concerning Boris's answer about SDP: as I understand it, in the case of RTSP, if the SDP is modified (the number of stream IDs reduced), then the client receives fewer streams (for lack of information). Or are the streams here pushed to the client, not at all in the manner of RTSP?

In WebRTC in general, and with the jitsi system in particular, manipulating the SDP on a receiver will have no direct effect on the sender. If you manipulate the SDP on the receiver side to not include certain streams, the client will not render them, but they will still go over the network.
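To make the receiver-side edit concrete, here is a minimal sketch of stripping unwanted `a=ssrc` lines from an SDP string. The function name `keep_ssrcs` and the sample SDP are hypothetical, and the sketch ignores related lines such as `a=ssrc-group`. As stated above, applying such an edit on the receiver changes only what the client knows about and renders; it does not stop the packets from being sent to it.

```python
import re
from typing import Set

# Matches source attribute lines such as "a=ssrc:1111 cname:bob".
SSRC_LINE = re.compile(r"^a=ssrc:(\d+)\s")


def keep_ssrcs(sdp: str, wanted: Set[int]) -> str:
    """Return the SDP with a=ssrc lines removed for every SSRC not in `wanted`."""
    kept = []
    for line in sdp.splitlines():
        match = SSRC_LINE.match(line)
        if match and int(match.group(1)) not in wanted:
            continue  # drop the source line the receiver should not know about
        kept.append(line)
    return "\r\n".join(kept) + "\r\n"


sample_sdp = (
    "m=video 9 UDP/TLS/RTP/SAVPF 100\r\n"
    "a=ssrc:1111 cname:bob\r\n"
    "a=ssrc:2222 cname:carol\r\n"
)
filtered_sdp = keep_ssrcs(sample_sdp, {1111})
```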

a3c commented 4 years ago

@Metsovone were you able to achieve this functionality? We are looking to do something similar.

Metsovone commented 4 years ago

@a3c Not really, although we've found a workaround to fit our use case. Among other things, we wanted to be able to have people split into teams, with the coaches being able to visit the teams.

So, what we are doing is moving people between jitsi 'rooms' - our server app will tell clients which room to connect to. So, we might start off with 12 people and 2 presenters in one room, then move groups of 4 people into their own 'team' rooms and have the presenters move between those rooms. We are keeping the connection to jvb so the switch is quite fast.
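A minimal sketch of the room-assignment side of this workaround, as a server app might compute it. The function names and the room naming scheme are hypothetical; how clients are told to switch and how they reconnect is outside the sketch.

```python
from typing import Dict, List


def split_into_team_rooms(participants: List[str], team_size: int,
                          prefix: str = "team") -> Dict[str, List[str]]:
    """Split participants, in order, into rooms of at most `team_size` people."""
    rooms: Dict[str, List[str]] = {}
    for start in range(0, len(participants), team_size):
        room_name = f"{prefix}-{start // team_size + 1}"
        rooms[room_name] = participants[start:start + team_size]
    return rooms


def room_instructions(rooms: Dict[str, List[str]]) -> Dict[str, str]:
    """Invert the room map into participant -> room, i.e. what each client is told to join."""
    return {person: room for room, members in rooms.items() for person in members}
```

With 12 participants and `team_size=4` this yields three team rooms; presenters are handled separately by telling them which team room to visit next.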

We did consider at some point whether clients could be connected to two rooms at the same time (on the same client), but we didn't really investigate the option, so we're not sure it would work. If it did, though, you could do something a lot more complex, which would still be a workaround.

a3c commented 4 years ago

> In WebRTC in general, and with the jitsi system in particular, manipulating the SDP on a receiver will have no direct effect on the sender. If you manipulate the SDP on the receiver side to not include certain streams, the client will not render them, but they will still go over the network.

@bgrozev what would be the way to handle audio streams as well? In our scenario for school classes, we have observed that users connecting through mobile get disconnected frequently. So we were thinking of reducing the streams sent to mobile clients to only the teacher's. What would you suggest to handle this scenario?