meetecho / janus-gateway

Janus WebRTC Server
https://janus.conf.meetecho.com
GNU General Public License v3.0
8.22k stars 2.48k forks source link

nack counts all audio for videoroom plugin listener handles #258

Closed ploxiln closed 9 years ago

ploxiln commented 9 years ago

For the videoroom plugin, for listener handles, all nacks are counted as audio packet NACKs, even though most of them are video packet NACKs.

You can see these counts in janus admin (trimmed):

{
    "session_id": 3756909664,
    "handle_id": 373788390,
    "plugin": "janus.plugin.videoroom",
    "plugin_specific": {
        "type": "listener",
        "room": 1234,
        "feed_id": 361429649,
        "feed_display": "plo2",
        "destroyed": 0
    },
    "streams": [
        {
            "id": 1,
            "ready": -1,
            "disabled": "false",
            "ssrc": {
                "audio": 1267177442,
                "video": 3383906181
            },
            "components": [
                {
                    "id": 1,
                    "state": "ready",
                    "in_stats": {
                        "audio_bytes": 0,
                        "video_bytes": 0,
                        "data_bytes": 974,
                        "audio_nacks": 103,
                        "video_nacks": 0,
                        "audio_bytes_lastsec": 0,
                        "video_bytes_lastsec": 0
                    },
                    "out_stats": {
                        "audio_bytes": 1254718,
                        "video_bytes": 4051952,
                        "data_bytes": 1261,
                        "audio_nacks": 0,
                        "video_nacks": 0
                    }
                }
            ]
        }
    ]
}

publisher handles don't have this problem:

{
    "session_id": 414341886,
    "handle_id": 2733785551,
    "plugin": "janus.plugin.videoroom",
    "plugin_specific": {
        "type": "publisher",
        "room": 1234,
        "id": 361429649,
        "display": "plo2",
        "viewers": 1,
        "destroyed": 0
    },
    "streams": [
        {
            "id": 1,
            "ready": -1,
            "disabled": "false",
            "ssrc": {
                "audio": 4250683611,
                "video": 1177463177,
                "audio-peer": 2584759378,
                "video-peer": 3035943213
            },
            "components": [
                {
                    "id": 1,
                    "state": "ready",
                    "in_stats": {
                        "audio_bytes": 2549276,
                        "video_bytes": 8994731,
                        "data_bytes": 1734,
                        "audio_nacks": 0,
                        "video_nacks": 0,
                        "audio_bytes_lastsec": 4165,
                        "video_bytes_lastsec": 15634
                    },
                    "out_stats": {
                        "audio_bytes": 0,
                        "video_bytes": 0,
                        "data_bytes": 1409,
                        "audio_nacks": 54,
                        "video_nacks": 13
                    }
                }
            ]
        }
    ]
}

That was probably excessive quoting... just trying to be thorough... anyway, the problem in the code is here: https://github.com/meetecho/janus-gateway/blob/master/ice.c#L1421

                                /* Is this audio or video? */                                       
                                int video = 0;                                                      
                                if(!janus_flags_is_set(&handle->webrtc_flags, JANUS_ICE_HANDLE_WEBRTC_BUNDLE)) {
                                        /* Easy enough */                                           
                                        video = (stream->stream_id == handle->video_id ? 1 : 0);    
                                } else {                                                            
                                        /* TODO Bundled streams, check SSRC */                      
                                        guint32 rtcp_ssrc = janus_rtcp_get_sender_ssrc(buf, len);   
                                        video = (stream->video_ssrc_peer == rtcp_ssrc ? 1 : 0);     
                                        //~ JANUS_LOG(LOG_VERB, "[RTCP] Bundling: this is %s (video=%"SCNu64", audio=%"SCNu64", got
                                                //~ video ? "video" : "audio", stream->video_ssrc_peer, stream->audio_ssrc_peer, rt
                                }

For bundled streams, janus looks at the ssrc to categorize the stream. But the videoroom plugin re-writes SSRCs for viewers. I actually tried digging further into the rtcp packet, to find the RTCP-FB, and the "media" field within it, and comparing it to the stream->video_ssrc. Neither the outside rtcp ssrc, nor the inside rtcp-fb media ssrc, correspond to audio_ssrc, video_ssrc, audio_ssrc_peer, or video_ssrc_peer. This isn't surprising, because of the ssrc re-writing which the plugin does.

It looks like it will take a bit of extra plumbing or some other strategy entirely, if we really want to get these statistics to work for videoroom listener handles (and my custom plugin). Thoughts?

lminiero commented 9 years ago

The plugin can put whatever SSRC it wants, but this is then overwritten by the core with its own audio_ssrc or video_ssrc (and, in case of RTCP messages, the peer SSRCs too in the messages). This is needed to keep things coherent on the user's side, e.g., when switching source for a viewer in the streaming or videoroom plugin. Since the plugin overwrites those and the same SSRCs are advertized in the SDP, they are the only SSRCs both client and Janus are aware of. The detection usually works fine, since that same code branch leads to incoming_rtcp callbacks for plugins, which AFAICT are correctly marked as video for FIR and REMB packets, for instance.

Have you checked if that method always fails, instead, meaning I'm wrong there? I'll try and replicate this myself too to see what may be going wrong.

lminiero commented 9 years ago

Thinking about it, since you're experiencing this only with listeners, it may have something to do with the fact those PeerConnections are sendonly, as listeners don't send anything but only receive media from Janus. As such, they never actually send any packet, so the remote SSRC is only known to Janus if it is advertized in the SDP. If not, the remote video SSRC is always 0 and so the check will always fail. I'll have a look and in case change that check in order to have it look for either the local or remote SSRC in the packet.

lminiero commented 9 years ago

Just verified that that's the cause... if you look at the video m-line in a listener SDP answer, for instance:

a=mid:video
a=recvonly
a=rtcp-mux
a=rtpmap:100 VP8/90000
a=rtcp-fb:100 ccm fir
a=rtcp-fb:100 nack
a=rtcp-fb:100 nack pli
a=rtcp-fb:100 goog-remb

No SSRC is there, and in fact the check in line 1430 of ice.c confirms that both remote SSRCs are zero, as far as Janus is concerned. Handling incoming PLI, FIR, REMB and the like in the videoroom plugin works because there's no actual check on the video flag: methods like janus_rtcp_has_fir are used directly to inspect the payload, whatever it is supposed to refer to.

I'll fix the check we have to make it inspect the recipient SSRC instead of the sender one, as that's always known (set by Janus and always advertized).

lminiero commented 9 years ago

PS: your dumps from the admin API also confirm that, as the remote SSRCs are missing there... I should have noticed sooner :blush:

lminiero commented 9 years ago

Turns out that this was discussed, as it apparently is an issue in Chrome. You should always allocate SSRCs even when you're not going to send, but instead they always set 1 for recvonly streams (and in fact 1 is what you get out of a janus_rtcp_get_sender_ssrc):

https://groups.google.com/forum/#!topic/discuss-webrtc/5yuZjV7lkNc

Newer versions of Chrome will fix that. In the meantime I'll work on a workaround until that happens.

ploxiln commented 9 years ago

I do recall seeing the zeroes when I was briefly trying to fix this myself. Thanks for looking into it.

I just started a short vacation, so I won't be very responsive for a few days.

lminiero commented 9 years ago

I just prepared a patch, I'll commit it right now. I verified it doesn't seem to break any existing behaviour and also seems to fix the issue in the detection. I couldn't verify whether NACKs are correctly indexed as a consequence, but I see no reason why it shouldn't: I'll try with some remote servers instead of the local ones I'm using now. I'll keep the issue open until you have a chance to confirm it works for you.

Enjoy your vacation!

ploxiln commented 9 years ago

Now, for a view-only stream, all nacks are considered video nacks:

                    "in_stats": {
                        "audio_bytes": 0,
                        "video_bytes": 0,
                        "data_bytes": 907,
                        "audio_nacks": 0,
                        "video_nacks": 132,
                        "audio_bytes_lastsec": 0,
                        "video_bytes_lastsec": 0
                    },
                    "out_stats": {
                        "audio_bytes": 954805,
                        "video_bytes": 7916783,
                        "data_bytes": 1806,
                        "audio_nacks": 0,
                        "video_nacks": 0
                    }

This is true for view-only stream in both Chrome and Firefox. (Interesting note, Firefox sends many more NACKs for the same loss rate).

Audio/video NACK distinction still works for nacks sent by janus to stream producers:

                    "in_stats": {
                        "audio_bytes": 2477970,
                        "video_bytes": 22393924,
                        "data_bytes": 1584,
                        "audio_nacks": 0,
                        "video_nacks": 0,
                        "audio_bytes_lastsec": 4319,
                        "video_bytes_lastsec": 40826
                    },
                    "out_stats": {
                        "audio_bytes": 0,
                        "video_bytes": 0,
                        "data_bytes": 1343,
                        "audio_nacks": 102,
                        "video_nacks": 51
                    }
lminiero commented 9 years ago

May it be that just video packets are being dropped on the viewer side?

ploxiln commented 9 years ago

If packets are dropped randomly, it's pretty unlikely that hundreds are video, and zero are audio, even if there are 10 times as many video packets as audio packets. (it looks like there are 8 times as many bytes in the video stream as the audio stream, but nacks to the stream producer suggest there are actually more (smaller) audio packets than video packets)

ploxiln commented 9 years ago

I can confirm with Chrome graphs :)

                    "in_stats": {
                        "audio_bytes": 0,
                        "video_bytes": 0,
                        "data_bytes": 907,
                        "audio_nacks": 0,
                        "video_nacks": 26,
                        "audio_bytes_lastsec": 0,
                        "video_bytes_lastsec": 0
                    },
                    "out_stats": {
                        "audio_bytes": 388268,
                        "video_bytes": 2691348,
                        "data_bytes": 1806,
                        "audio_nacks": 0,
                        "video_nacks": 0
                    }

screen shot 2015-06-17 at 17 28 35

screen shot 2015-06-17 at 17 28 47

ploxiln commented 9 years ago

hmm maybe it just doesn't nack audio packets. no point getting them late I guess

ploxiln commented 9 years ago

yup... echotest confirms; chrome has a video "nacks sent" graph, but doesn't even have an audio "nacks sent" graph. Firefox doesn't expose as many stats conveniently but it looks like it does the same thing.

Also... I guess the fact that the original NACK logic in janus only nacked video packets makes sense, it's what browsers do. (But for my purposes I like that it now also nacks audio packets.)

lminiero commented 9 years ago

Yep, I think the idea is to eventually rely on FEC and the like for audio. Besides, Opus has some built in mechanism to cope with packet loss, which they're probably exploiting.