meetecho / janus-gateway

Janus WebRTC Server
https://janus.conf.meetecho.com
GNU General Public License v3.0
[1.x] crack sounds on audiobridge, then publisher has drops on output stream #3297

Open spscream opened 7 months ago

spscream commented 7 months ago

What version of Janus is this happening on? 1.2.1 and master

Have you tested a more recent version of Janus too? Yes, tested on master and on the rnnoise branch.

Was this working before? We have always had this issue; I reported it on the Discourse group before.

Is there a gdb or libasan trace of the issue? no

Additional context: I simulated packet drops using the clumsy utility. Listeners start to hear crack sounds once drops reach 5%. I also tried setting expectedloss values for the AudioBridge (AB), but it doesn't help. On the VideoRoom (VR) we don't hear the same cracks.

spscream commented 7 months ago

I tried adding a log line where FEC is used (https://github.com/meetecho/janus-gateway/blob/master/src/plugins/janus_audiobridge.c#L8271), but it doesn't trigger any logs (only a single occurrence when the participant starts the stream).

spscream commented 7 months ago

Looks like it was my logging issue: the FEC occurrences are only logged after the publisher has been destroyed.

lminiero commented 7 months ago

I think you may be a bit confused as to how FEC works. There's FEC on the way in, and there's FEC on the way out.

Our code to process FEC on the way in (the line you reference in your comment) depends on whether or not the browser is actually adding any FEC to the packets; if they aren't, there won't be any FEC to take advantage of when decoding the media. Browsers won't send any FEC unless they receive reports via RTCP that a lot of packets are being lost (can't remember what the threshold is).

FEC on the way out is whether or not the AudioBridge adds FEC to outgoing packets, so when encoding Opus. That's what the expectedloss property is for: you're telling Janus you expect X% of packets to be lost in delivery, and so telling the plugin to add FEC when encoding media. It has nothing to do with FEC on the way in.

spscream commented 7 months ago

@lminiero thanks for the explanation. We checked this on the demo (https://janus.conf.meetecho.com/audiobridgetest.html) and it reproduces there. Steps:

These crack sounds don't appear in the VideoRoom demo.

lminiero commented 7 months ago

That it doesn't appear in the VideoRoom is irrelevant: the VR only relays media, which means it's the browser that terminates audio doing its magic there. The AudioBridge has to terminate stuff. Again, if the browser is not sending FEC, then we can't use it. Try checking the webrtc-internals to see if FEC is actually being added to packets, and how much.

spscream commented 7 months ago

Okay, we checked again whether the browser sends FEC. We can't see it in webrtc-internals on the sender (it doesn't have such info), but we can see it on the receiving side. I checked it on the VideoRoom, and the browser (Chrome 119.0.6045.160) starts adding FEC right at the moment the drops begin.

So can you confirm that the problem is in the AudioBridge and FEC? How can I help to debug it? Do you need any additional info?

lminiero commented 7 months ago

There's nothing I can do until browsers send FEC to the AB too: I seem to remember there was a graph for that on the sender side too. If it works with the VideoRoom, you should check if there are maybe differences in the negotiation, in the RTCP feedback that is exchanged, or in the loss reported (which you can inspect via Admin API, for instance).

spscream commented 7 months ago

Checked it: the negotiation for VR and AB looks the same. The useinbandfec=1 flag is present in both cases, in offers and in answers. The Janus Admin API shows in-link-quality matching the drop percentages, and RTCP reports are exchanged correctly.

atoppi commented 7 months ago

@spscream when enabling FEC in traffic from Audiobridge room to participants, don't forget to also set the default_bitrate parameter for that room (e.g. 64000). If the libopus encoder does not have enough "room" for adding FEC packets, it will just skip sending those.

So please try with a large enough value (e.g. 64000) and bear in mind that doing so will increase the bitrate towards your clients, regardless of what the current loss is.

To check whether a Chromium browser is receiving FEC packets from the AudioBridge, open webrtc-internals and head to the inbound-rtp section, where you will find the fecPacketsReceived metric.
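As a rough illustration of the room settings discussed here, the body of an AudioBridge `create` request could be assembled as below. The parameter names `sampling_rate`, `default_bitrate` and `default_expectedloss` are the ones used in this thread; the room ID, the helper name `build_create_request` and the range check are assumptions made up for the sketch and are not part of Janus.

```python
import json


def build_create_request(room, sampling_rate=16000, bitrate=64000, expected_loss=5):
    """Build the body of an AudioBridge 'create' request (illustrative only).

    The range check below is a sanity check for this sketch; it is not
    something Janus is claimed to do.
    """
    if not 0 <= expected_loss <= 100:
        raise ValueError("expected_loss is a percentage")
    return {
        "request": "create",
        "room": room,  # hypothetical room ID
        "sampling_rate": sampling_rate,
        # Leave the encoder enough bitrate budget to carry FEC data.
        "default_bitrate": bitrate,
        "default_expectedloss": expected_loss,
    }


body = build_create_request(room=1234)
print(json.dumps(body, indent=2))
```

Raising `bitrate` raises the downstream bitrate for every participant, so it is a trade-off rather than a free fix.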

spscream commented 7 months ago

I tried setting fec=true and expected_loss=20 for all participants, and default_expectedloss=20 and default_bitrate=64000 as room defaults. The fecPacketsReceived graph on recipients stays at zero. I don't really think the problem is in FEC.

atoppi commented 7 months ago

If fecPacketsReceived is 0 in the inbound-rtp coming from the Janus AudioBridge, then FEC is not being used at all.

spscream commented 7 months ago

We got incoming FEC to work: with default_bitrate set to 128000, fecPacketsReceived is now > 0, but the crack noise stays the same.

I think this issue isn't FEC related at all and is related to how the decode/encode process is done in the AB, since we don't hear the same cracks with the VR (we can hear subtle interruptions if the loss rate is high, but without cracks).

atoppi commented 7 months ago

I tested a 16khz room, with 64kbps bitrate and 5% of expected loss. Everything seems to work, even in 10% loss scenarios the audio is still fine and perfectly audible.

spscream commented 7 months ago

We use 48khz now; could that be the problem? We will check at 16khz. Do you add losses on the publisher's outgoing link or on the subscriber's incoming link?

atoppi commented 7 months ago

I add losses on both links. I'd say checking with 16khz is worth a try.

spscream commented 7 months ago

How do you change the sampling rate to 16khz? Is it enough to set the sampling_rate parameter to 16000 when creating the AB room? I see opus/48000/2 is still in the SDP offer/answer regardless of the sampling_rate value.

By the way, we still have crack sounds even with the 16000 setting...

atoppi commented 7 months ago

Just set sampling_rate in the room configuration. The SDP refers to the RTP clock rate and not to the codec sampling rate. It will always be opus/48000/2, regardless of the actual encoding parameters.
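A small sketch of that distinction: the number of samples the codec handles per frame follows the room's sampling rate, while the RTP timestamp always advances on the 48 kHz clock. The 20 ms frame duration and both function names are assumptions chosen for the example.

```python
RTP_CLOCK_RATE_HZ = 48000  # fixed for Opus over RTP, whatever the encoder uses


def samples_per_frame(sampling_rate_hz, frame_ms=20):
    """PCM samples per channel the codec handles for one frame."""
    return sampling_rate_hz * frame_ms // 1000


def rtp_timestamp_step(frame_ms=20):
    """RTP timestamp increment per packet, counted on the 48 kHz clock."""
    return RTP_CLOCK_RATE_HZ * frame_ms // 1000


for rate in (16000, 48000):
    # The codec-side sample count changes with the room rate,
    # the RTP timestamp step does not.
    print(rate, samples_per_frame(rate), rtp_timestamp_step())
```

For a 16 kHz room this gives 320 samples per frame but an RTP step of 960, the same step as a 48 kHz room.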


brave44 commented 7 months ago

@atoppi does totalSamplesReceived/s in webrtc-internals show the room's sampling rate or the RTP clock rate? We see 48000 even if we set sampling_rate in the room configuration. Is it normal?

lminiero commented 7 months ago

Is it normal?

Yes.

brave44 commented 6 months ago

@atoppi we tried a 16000 sampling rate with a 64kbps bitrate and still have this problem. The problem reproduces in our dev environment with 5/10% loss, and in production we hear cracks too, usually from people with a less than ideal connection. I also noticed more cracks when somebody just turns their mic on and starts talking. Over time the cracks reduce a little, but they are still there.

Any ideas on how we can debug the problem further?

atoppi commented 6 months ago

I don't know which kind of losses you are experimenting with, but bear in mind that Opus in-band FEC is able to recover just a single packet (e.g. a single miss in a sequence), so if you have problematic links with bursts of losses, in-band FEC would not help.

I can not do much more without a reproducible case.
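To make the single-packet limit concrete, here is a minimal sketch that splits lost packets into those in-band FEC could rebuild and those it could not. It relies on the fact that the redundant copy of a frame travels in the packet right after it, so a lost packet is only recoverable when its successor arrives. The function name and the sample loss patterns are invented for the example.

```python
def classify_losses(received):
    """Count lost packets that in-band FEC could rebuild, and the rest.

    `received` holds one boolean per packet, in sequence order.
    """
    fec_recoverable = 0
    unrecoverable = 0
    for i, ok in enumerate(received):
        if ok:
            continue
        # The redundant copy of packet i is carried by packet i + 1.
        next_arrived = i + 1 < len(received) and received[i + 1]
        if next_arrived:
            fec_recoverable += 1
        else:
            unrecoverable += 1
    return fec_recoverable, unrecoverable


isolated = [True, False, True, True, False, True]
burst = [True, False, False, False, True, True]
print(classify_losses(isolated))  # (2, 0)
print(classify_losses(burst))     # (1, 2)
```

Both patterns lose packets, but in the burst only the last missing packet has a successor that can carry its redundant copy.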

spscream commented 6 months ago

Hi, I was able to figure out that the cracks appear when more than 1 packet is lost. I made some changes to check whether they fix the issue: https://github.com/meetecho/janus-gateway/commit/905e6e5f76bd0995fcb24cc83d2229d14fc9d2e7

For me it totally fixes the cracks even at 70% loss (the voice sounds robotic, but without cracks). The only case where cracks appear again is when queue-in drops packets over QUEUE_IN_MAX_PACKETS, so I increased it to 20, but maybe that should be fixed another way.

lminiero commented 6 months ago

@spscream ok, so you're basically enabling PLC (packet loss concealment), which might indeed be helpful. I see several things that look broken/weird to me, though, especially in the fact that it seems like you're breaking FEC (plus a few other things). Could you submit this as a PR, so that it can be discussed and reviewed there, in order to figure out how to properly incorporate PLC in the code?
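One way to picture how PLC and FEC could coexist on the receive side is sketched below. This is not the code from the linked commit nor from the AudioBridge; it only plans, for each arriving packet, which decode steps a receiver might run based on the RTP sequence gap. The function name and step labels are hypothetical.

```python
def plan_decode(prev_seq, seq):
    """Return the decode steps for packet `seq`, given the last decoded `prev_seq`.

    Sequence numbers are 16-bit RTP values, so the gap is computed modulo 65536.
    Steps, in libopus terms:
      "plc"    -> opus_decode() with NULL data (conceal a missing frame)
      "fec"    -> opus_decode() on the new packet with decode_fec=1
      "normal" -> opus_decode() on the new packet with decode_fec=0
    """
    gap = (seq - prev_seq) & 0xFFFF
    if gap == 0 or gap > 0x8000:
        return []  # duplicate or late packet: nothing to decode
    missing = gap - 1
    steps = []
    if missing > 0:
        # Only the frame right before the new packet can come from FEC;
        # every earlier missing frame has to be concealed.
        steps += ["plc"] * (missing - 1)
        steps.append("fec")
    steps.append("normal")
    return steps


print(plan_decode(10, 11))  # ['normal']
print(plan_decode(10, 12))  # ['fec', 'normal']
print(plan_decode(10, 14))  # ['plc', 'plc', 'fec', 'normal']
```

Keeping the FEC step for the last missing frame is the part that avoids trading FEC away for PLC.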

spscream commented 6 months ago

@lminiero @atoppi I updated some code and opened a PR. I'd appreciate any feedback.