meetecho / janus-gateway

Janus WebRTC Server
https://janus.conf.meetecho.com
GNU General Public License v3.0
8.13k stars 2.47k forks source link

[1.x] databuffermsg/buffermsg not working for streaming plugin #3411

Closed ares1977 closed 3 days ago

ares1977 commented 1 month ago

What version of Janus is this happening on? 1.2.3

Have you tested a more recent version of Janus too? No, currently this is the latest version.

Was this working before? I do not know

Is there a gdb or libasan trace of the issue? No

Additional context Last message is never sent, even when enabling buffermsg on data stream.

I believe this bug is has two different causes:

https://github.com/meetecho/janus-gateway/blob/0d1da91daebb1afcc500b4fff63bef8fb846dfbf/src/plugins/janus_streaming.c#L2182-L2187

I think buffermsg container should be m instead of cat. This is not a big issue because there is a workaround: to set buffermsg = true in the root of the mountpoint in the configuration file (dataype is also affected by the same problem, and it should be type instead)

https://github.com/meetecho/janus-gateway/blob/0d1da91daebb1afcc500b4fff63bef8fb846dfbf/src/plugins/janus_streaming.c#L9949-L9961

pkt is created and filled with the contents of the last relay packet, but then it is not used for anything else (causing, I believe, a memory leak) I suppose the idea was to copy the data in last_msg (after freeing the previous last_msg). Finally, I think that packet in the lines 9955 to 9958 should be pkt

atoppi commented 1 month ago

@ares1977 thanks for pointing this out, please try the patch above.

dataype is also affected by the same problem, and it should be type instead)

I think that would collide with the type of the media element (audio, video, data).

ares1977 commented 1 month ago

I think that would collide with the type of the media element (audio, video, data).

Ah, yes, you are right.

ares1977 commented 1 month ago

@atoppi I tested it and it seems more changes are needed to fix this issue.

https://github.com/meetecho/janus-gateway/blob/0d1da91daebb1afcc500b4fff63bef8fb846dfbf/src/plugins/janus_streaming.c#L5590-L5598

That piece of code, that handles the delivery of last_msg, is executed in the callback janus_streaming_setup_media, which is too early, because the data channel is not ready yet. I suppose the best place is in the callback janus_streaming_data_ready (right after setting session->dataready)

https://github.com/meetecho/janus-gateway/blob/0d1da91daebb1afcc500b4fff63bef8fb846dfbf/src/plugins/janus_streaming.c#L10080-L10092

Then we have a problem inside the function janus_streaming_relay_rtp_packet. The data package is not a keyframe, so is going to leave the function in line 10091. I think a !packet->is_data condition should be added.

lminiero commented 1 month ago

I wouldn't rely on buffermsg too much. We added the buffering of keyframes ages ago as an experiment, and it never worked as expected. Decoders don't like receiving a full frame, and then diffs that come after a ton of missing diffs (since those are not buffered). The end result was a frozen frame for many seconds until the next one. As such, I'm not ruling out getting rid of that functionality entirely, since it's a pain to mantain for little to no benefit.

ares1977 commented 1 month ago

@lminiero but buffermsg is unrelated to keyframes, isn't it? IMHO, it looks like saving the last data message and sending it to the connecting peers is not as problematic and convoluted as the code handling keyframes.

lminiero commented 1 month ago

Sorry, right, I was thinking of bufferkf, so this was me being dumb! :grin:

atoppi commented 1 month ago

@ares1977 we updated the PR. Can you test it please? I'd like to hear some feedback from @lminiero too, since it touched some critical parts of the plugin.

ares1977 commented 1 month ago

@atoppi I am not getting data messages anymore, plus I got a segmentation fault. I am going to debug it later on, but in the meantime I had a look at the commit and I think the issues may be related to this cast:

https://github.com/meetecho/janus-gateway/blob/eed71dcb2da1aa616edb27dfb0057ebb89ab2a25/src/plugins/janus_streaming.c#L5682

should it not be cast to janus_streaming_session_stream and then the member stream cast to janus_streaming_rtp_source_stream?

atoppi commented 1 month ago

@ares1977

should it not be cast to janus_streaming_session_stream and then the member stream cast to janus_streaming_rtp_source_stream?

Thanks, fixed.

ares1977 commented 1 month ago

@atoppi I am getting the last message! Unfortunately I am getting it multiple times. A constant stream of packages containing the last message. So this issue sure has something to do with what you said in the PR:

I'm not sure if that callback might be issued multiple times, in particular when multistream and renegotiation come into play.

I thought about it and I tested the value of &session->dataready before it is changed in line 5676. It looks like it is equal to 0 when the data channel is created and it is equal to 1 in further executions. Could you use that as a filter? I did not run very extensive tests, so I am not sure if there may be more cases where &session->dataready is equal to zero.

atoppi commented 1 month ago

It should work by putting the sending routine inside the compare and exchange block of the data ready callback.

atoppi commented 1 month ago

@ares1977 I have updated the PR to send the buffered message only the first time data_ready is called. However I'm not sure about that, maybe there exist scenarios when you need the sending of the buffered message again on the same PC.

ares1977 commented 1 month ago

@atoppi I have been running some tests for the last hour and everything is working fine. A single message is received while connecting and I did not see any repetitions.

My use case requires to send short messages from time to time (that is why I needed the last saved message) but I am going to try running some extra tests this weekend, increasing both the size and the frequency. I will let you know if I find any bugs.

Many thanks

atoppi commented 1 month ago

@ares1977 I'll assume you are testing the PR updated to the latest commit. Anyway since it seems to work I'll remove the draft status to the PR.

ares1977 commented 1 month ago

@atoppi yes, I made the exact same change when you suggested it. There are no differences in between the code I am testing and the one in the PR (including the fourth commit).

atoppi commented 3 days ago

@ares1977 the patch has been merged. Thanks for reporting.