meetecho / janus-gateway

Janus WebRTC Server
https://janus.conf.meetecho.com
GNU General Public License v3.0

[1.x] #3223

Closed xunlujiehuo closed 1 year ago

xunlujiehuo commented 1 year ago

What version of Janus is this happening on? 1.13

Have you tested a more recent version of Janus too? no

Is there a gdb or libasan trace of the issue?

[screenshot attachment; no textual gdb/libasan trace provided]

Additional context: I used a stress-testing tool to put load on the videoroom plugin. With 1 push stream and 500 push streams, the Janus CPU shot up to about 1500% (my environment has 16 cores); doing the same thing with 400 pull streams, the CPU was only a bit over 200%. The screenshot above is what I captured with the perf tool. I don't understand why the CPU rises so much. Janus is deployed in a public-network environment; in the same scenario, with Janus on the internal network, the pull streams can reach more than a thousand.

xunlujiehuo commented 1 year ago

I also encountered CPU surges when I ran other combined push-stream and pull-stream tests on Janus. I don't understand why such a small difference in numbers leads to such a large difference in CPU usage; it leaves me confused and frustrated.

lminiero commented 1 year ago

This is the second issue you've opened without a title and without providing useful information. I'm sorry you feel frustrated, but what makes ME feel frustrated is people who just attach screenshots instead of data we can use to find out what's wrong. I asked what the tool was and you didn't answer me in the other ticket either. If you want our help figuring out what's wrong, please take the time to give us more data to start from, and data we can parse (no images).

lminiero commented 1 year ago

That said, the only thing we could understand from your image is that it looks like Janus is spending a lot of time in g_slist_last, so something that has to do with a GSList. We have one in the core for NACKs, so if the sender sends a ton of NACKs (again, we have no idea what the pressure tool is, so we can only speculate) this can cause a lot of appends, each of which has to traverse the whole list. I pushed a commit that should address this, so if you run other tests please use that commit or a later one, not a previous version.
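For reference, below is a minimal standalone sketch (in C, using GLib) of why that pattern gets expensive: g_slist_append() walks to the tail of the list on every call, which is the g_slist_last traversal showing up in the perf profile, so queuing N items this way costs O(N²), while the usual prepend-then-reverse workaround stays linear. This is not the actual Janus NACK code or the commit mentioned above, and the list size below is an arbitrary assumption; it only illustrates the general GSList behaviour. It can be built with `gcc sketch.c $(pkg-config --cflags --libs glib-2.0)`.

```c
#include <glib.h>

#define N 50000 /* hypothetical number of queued items, for illustration only */

int main(void) {
	GSList *slow = NULL, *fast = NULL;
	GTimer *timer = g_timer_new();

	/* Slow pattern: each g_slist_append() walks the whole list to find the
	 * tail (via g_slist_last), so N appends cost O(N^2) overall. */
	for(guint i = 0; i < N; i++)
		slow = g_slist_append(slow, GUINT_TO_POINTER(i));
	g_print("append:          %.3f s for %d items\n", g_timer_elapsed(timer, NULL), N);

	/* Workaround: O(1) prepends, then a single g_slist_reverse() to restore order. */
	g_timer_reset(timer);
	for(guint i = 0; i < N; i++)
		fast = g_slist_prepend(fast, GUINT_TO_POINTER(i));
	fast = g_slist_reverse(fast);
	g_print("prepend+reverse: %.3f s for %d items\n", g_timer_elapsed(timer, NULL), N);

	g_slist_free(slow);
	g_slist_free(fast);
	g_timer_destroy(timer);
	return 0;
}
```

Another common alternative is a GQueue, which keeps a tail pointer so g_queue_push_tail() stays O(1) without needing the final reversal.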