meetecho / janus-gateway

Janus WebRTC Server
https://janus.conf.meetecho.com
GNU General Public License v3.0

[1.x] Inconsistency happens when more than 6 users join at the same time with different browsers (web and mobile) #3363

Closed venkateshcontus closed 1 week ago

venkateshcontus commented 2 months ago

What version of Janus is this happening on? 1.2.3

Have you tested a more recent version of Janus too? Yes.

Was this working before? No

Is there a gdb or libasan trace of the issue? When more than 6 users join at the same time, you can see the inconsistency. We are not testing in a single browser: I have a team of 10 members who tried it on the web and in the mobile app. Our application has a scheduled-meeting feature that creates a meeting link, which I share with my team members so that we all join at the same time.

Additional context: Not all users receive every member's stream; users get only some of the members' streams. Joining time also increases with the member count: the last users took 10 to 20 seconds to join the room.

Could you share the best configuration for 8 members?

atoppi commented 1 month ago

Not all users receive every member's stream; users get only some of the members' streams.

Please share a Janus message/event log for a client with this issue. I suspect this is something related to client code.

Joining time also increases with the member count: the last users took 10 to 20 seconds to join the room.

Ten users joining at the same time is quite a common scenario. I have never experienced a 20-second wait; again, this smells like an issue with the application code.

Could you share the best configuration for 8 members?

No idea what the question is.

lminiero commented 1 month ago

@venkateshcontus any update on what Alessandro asked?

venkateshcontus commented 1 month ago

Hi,

Please wait, I will share the logs.

venkateshcontus commented 1 month ago

Hi,

This is the Janus server configuration we are using. With this configuration, we face the issue mentioned earlier when 8 members join a room.

CPU cores: 1
vCPU: 2
Memory: 6 GB
Download: 1020.72 Mbit/s
Upload: 1004.01 Mbit/s
Ping: 64 bytes from 8.8.8.8: icmp_seq=1 ttl=60 time=11.4 ms
Ping: 64 bytes from 1.1.1.1: icmp_seq=3 ttl=60 time=1.34 ms

https://mf-janus-test.mirrorfly.com/?room=1 - when more than 6 members join the same room, each from their own system, we see the latency.

lminiero commented 1 month ago

This is not the information Alessandro asked for. Please share the message flow for a client with issues.

atoppi commented 1 month ago

On top of that, even though it mimics the layout, that web app is not the one we have in our demos.

venkateshcontus commented 1 month ago

Hi,

Yes, we changed the limit from 6 to 10 on your default demo page. Please find the event log from one of the clients below. It took almost 1.5 minutes to render the video of all 9 users.

websocket_messages.json

lminiero commented 1 month ago

This log seems to suggest some incorrect or broken network configuration you may have. We see, for instance, that it takes close to a minute for an attached event to be sent back in response to a new subscription. This could happen if a push_event with an SDP offer takes a long time to return, which can happen if Janus is configured to do half-trickle (the default) and it takes Janus a long time to gather all the candidates.
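
For anyone wanting to check a similar capture, here is a rough Python sketch of how the delay between a subscriber `join` request and its `attached` event could be measured. It assumes each log entry is a dict with a `ts` timestamp in seconds and a `msg` field holding the Janus JSON message; both field names are hypothetical, since the actual layout of websocket_messages.json is not shown in this thread, and the sample values are invented for illustration.

```python
def attach_delays(entries):
    """Map each transaction ID to the seconds between a VideoRoom
    'join' request and the matching 'attached' event."""
    sent = {}    # transaction -> time the join request was sent
    delays = {}  # transaction -> seconds until 'attached' came back
    for entry in entries:
        ts = entry["ts"]    # hypothetical field: capture time in seconds
        msg = entry["msg"]  # hypothetical field: the Janus JSON message
        txn = msg.get("transaction")
        if txn is None:
            continue
        if msg.get("janus") == "message" and msg.get("body", {}).get("request") == "join":
            sent[txn] = ts
        elif msg.get("janus") == "event" and txn in sent:
            data = msg.get("plugindata", {}).get("data", {})
            if data.get("videoroom") == "attached":
                delays[txn] = ts - sent[txn]
    return delays


# Made-up sample: one subscription whose 'attached' event arrives late.
sample = [
    {"ts": 0.0, "msg": {"janus": "message", "transaction": "t1",
                        "body": {"request": "join", "ptype": "subscriber"}}},
    {"ts": 0.1, "msg": {"janus": "ack", "transaction": "t1"}},
    {"ts": 55.0, "msg": {"janus": "event", "transaction": "t1",
                         "plugindata": {"data": {"videoroom": "attached"}}}},
]
print(attach_delays(sample))
```

Delays of tens of seconds here would point at the offer being held back on the server side rather than at the client.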

Apparently you're using some sort of docker or k8s environment (at least looking at the host address), and you have STUN enabled in Janus (which is usually not recommended, but that's a separate matter). If gathering the host candidate takes a long time, or the STUN server takes a long time to respond, push_event will sit there waiting until all candidates have been gathered before preparing the final SDP offer and sending it. For host candidates, the cause may be a network interface that is not working properly or is not configured correctly. You can check if that's the case by enabling full_trickle = true in janus.jcfg (the setting is in the nat section). You can also try disabling STUN in Janus to see if that speeds up signalling too (in which case the cause of the problem would be Janus waiting for the STUN server to respond). If this is docker/k8s, try configuring those instances to use host networking if you can.
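
As a rough illustration of the two changes suggested above, the nat section of janus.jcfg might look something like the fragment below. This is only a sketch: the STUN server name is a placeholder and is not taken from this thread, and any other settings already in your nat section should be left as they are.

```
nat: {
	# Send the SDP offer right away and trickle candidates afterwards,
	# instead of waiting for gathering to complete (half-trickle, the default).
	full_trickle = true

	# To test whether STUN is the slow part, comment these out.
	# The server name is a placeholder, not a value from this thread.
	#stun_server = "stun.example.org"
	#stun_port = 3478
}
```

Janus needs to be restarted for the changes to take effect.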

Please let us know if anything changes with that configuration. My assumption is that signalling would not be blocked as it is now, but you'd still have long setup times (caused by the very slow gathering of candidates). If that's the case, then it would be up to you to fix the network configuration.

lminiero commented 1 week ago

@venkateshcontus any update on this? If not, we'll just close this, as there is no issue from our perspective.

venkateshcontus commented 1 week ago

@lminiero Thanks for your support. After enabling full_trickle = true in janus.jcfg, it works as expected. All 8 to 10 members now join quickly. Please close this ticket.