jitsi / jitsi-videobridge

Jitsi Videobridge is a WebRTC-compatible video router or SFU that lets you build highly scalable video conferencing infrastructure (i.e., up to hundreds of conferences per server).
https://jitsi.org/jitsi-videobridge
Apache License 2.0

CPU usage jumps to 50% for just three users #1240

Open steebchen opened 4 years ago

steebchen commented 4 years ago

Description

I have deployed the Jitsi Videobridge according to the install docs for Jitsi Meet.

Current behavior

The CPU usage is low (<1%) when there are no meetings, as expected. However, when just 3 people join a single meeting, the CPU usage immediately jumps to 50%. This seems abnormally high, considering that only a few more people would reach the maximum capacity of 400% CPU usage (with 4 cores). The CPU usage increases further when more meetings are created or more people join the call.

Screenshot 2020-05-03 at 02 21 26 Screenshot 2020-05-03 at 02 21 13

Expected Behavior

A lower CPU usage.

Possible Solution

Steps to reproduce

1) Install the Videobridge. I used a Hetzner 15€/month machine (CPX31) at https://www.hetzner.de/cloud.
2) Run a meeting with 3 people/devices.

Environment details

Jitsi Meet videobridge: 2.1-183-gdbddd169-1
Ubuntu 18.04.4 LTS (GNU/Linux 4.15.0-91-generic x86_64)
A Hetzner 15€/month machine (CPX31) https://www.hetzner.de/cloud

awlx commented 4 years ago

Those machines can be heavily oversubscribed, so this could just be a symptom of that. We don't see those spikes on our setup, and we can run 100 - 120 users without a problem on a 4-core machine, as long as it is a reserved instance rather than a shared VM.

15 Euro is still on the lower end of the price range.

steebchen commented 4 years ago

I'll try with a dedicated server at some point, but usually Hetzner's VPS are pretty robust – I have tons of stuff running on these.

Maybe a more general question: Why does it need so much CPU in the first place? In theory, doesn't the server just need to forward video from one client to another client, and if there are two clients, forwarding the video to two clients? Where does the CPU usage exactly go?

damencho commented 4 years ago

Every packet is being decrypted and encrypted again.

532910 commented 4 years ago

10 people take about 90% of one Xeon core (E3-1240v2 3.40GHz) in the KVM guest on my own server. SSL is handled by nginx outside this guest.

awlx commented 4 years ago

@532910 Video traffic still gets encrypted and decrypted on the videobridge.

532910 commented 4 years ago

It means that jitsi has MCU topology, am I right?

Are there any real expected values? (For example, meet.jit.si statistics: CPU load per number of peers)

Is there a way to offload this, for example with GPU?

What about SFU topology?

bgrozev commented 4 years ago

It means that jitsi has MCU topology, am I right?

No, it is an SFU. It does not decode/encode, but it does decrypt/encrypt.
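
To make the decrypt/encrypt point concrete, here is a minimal sketch of how the number of SRTP operations could grow with conference size. It assumes every participant sends one stream and receives all the others, and that each forwarded copy is encrypted separately per receiver; the function name and the packet rate of 100 packets per second are illustrative assumptions, not measurements from Jitsi.

```python
def sfu_crypto_ops(participants: int, packets_per_second_per_sender: int) -> dict:
    """Estimate SRTP operations per second on an SFU for one conference.

    Assumption: every participant sends one stream and receives all others
    (no simulcast, no last-N). Real deployments will differ.
    """
    if participants < 2:
        return {"decrypt": 0, "encrypt": 0}
    # Each incoming packet is decrypted once ...
    decrypt = participants * packets_per_second_per_sender
    # ... and encrypted again once for every other participant.
    encrypt = decrypt * (participants - 1)
    return {"decrypt": decrypt, "encrypt": encrypt}


# Hypothetical packet rate of 100 packets/s per sender, for illustration only.
for n in (3, 10, 26):
    print(n, sfu_crypto_ops(n, packets_per_second_per_sender=100))
```

Under these assumptions the encrypt count grows roughly with the square of the number of participants, which is one reason an SFU does more work than a plain one-to-one relay.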

532910 commented 4 years ago

But what does it decrypt and encrypt? SSL is handled by external nginx. E2E encryption is not used.

steebchen commented 4 years ago

It also seems weird to me. I didn't deploy Jitsi to a dedicated server yet, but I have servers for other projects which are constantly running at 500 Mbit/s inbound and outbound (serving large media files). That traffic is also decrypted when received and encrypted when sent, yet the CPU usage is at a few percent. This seems normal; I wouldn't expect SSL to use a lot of CPU with just 500 Mbit/s throughput.

bgrozev commented 4 years ago

But what does it decrypt and encrypt? SSL is handled by external nginx. E2E encryption is not used.

SRTP. (E2E encryption has no relevance here; its very purpose is to prevent the SFU from doing this.)

532910 commented 4 years ago

Yep, E2E encryption has no relevance here, that's my mistake. And I should have guessed that this is SRTP.

So it looks like it doesn't use AES-NI, does it?

532910 commented 4 years ago

real values:

% openssl speed -elapsed aes-128-cbc
The 'numbers' are in 1000s of bytes per second processed.
...
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-128 cbc     202124.05k   207632.98k   210351.19k   211711.66k   217721.51k   211823.27k

% openssl speed -elapsed -evp aes-128-cbc
...
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-128-cbc     519162.55k   643730.20k   679249.66k   689552.73k   694296.58k   693217.96k

% cryptsetup benchmark 
...
#     Algorithm |       Key |      Encryption |      Decryption
        aes-cbc        128b       570.1 MiB/s      1861.3 MiB/s
...
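
As a rough reading of the numbers above, the sketch below converts the two openssl figures for 16384-byte blocks into Gbit/s and computes their ratio. The values are copied from the output in this comment. The assumption that the `-evp` run reflects hardware AES, and that AES-CBC is a fair proxy for the cipher SRTP actually uses, is mine and is not confirmed in the thread.

```python
# Throughput figures copied from the openssl output above (16384-byte blocks),
# in OpenSSL's unit of "1000s of bytes per second".
LOW_LEVEL_KBYTES_PER_S = 211823.27  # openssl speed -elapsed aes-128-cbc
EVP_KBYTES_PER_S = 693217.96        # openssl speed -elapsed -evp aes-128-cbc


def to_gbit_per_s(kbytes_per_s: float) -> float:
    """Convert '1000s of bytes per second' into Gbit/s."""
    return kbytes_per_s * 1000 * 8 / 1e9


speedup = EVP_KBYTES_PER_S / LOW_LEVEL_KBYTES_PER_S

print(f"low-level API: {to_gbit_per_s(LOW_LEVEL_KBYTES_PER_S):.2f} Gbit/s")
print(f"EVP API:       {to_gbit_per_s(EVP_KBYTES_PER_S):.2f} Gbit/s")
print(f"ratio:         {speedup:.2f}x")
```

Either figure is well above the bandwidth of a small conference, which suggests that raw AES throughput on large blocks is unlikely to be the whole story.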

532910 commented 4 years ago

hm, but #67 says it should use hardware aes

532910 commented 4 years ago

As I've found, it's just not enabled in OpenJDK on Debian Buster:

% java -XX:+PrintFlagsFinal -version | grep -i UseAES
     bool UseAES                                   = true                                      {product} {default}
openjdk version "11.0.7" 2020-04-14
OpenJDK Runtime Environment (build 11.0.7+10-post-Debian-3deb10u1)
OpenJDK 64-Bit Server VM (build 11.0.7+10-post-Debian-3deb10u1, mixed mode, sharing)

% _JAVA_OPTIONS='-XX:+UnlockDiagnosticVMOptions -XX:+UseAESIntrinsics' java -XX:+PrintFlagsFinal -version | grep -i UseAES
Picked up _JAVA_OPTIONS: -XX:+UnlockDiagnosticVMOptions -XX:+UseAESIntrinsics
     bool UseAES                                   = true                                      {product} {default}
     bool UseAESCTRIntrinsics                      = true                                   {diagnostic} {default}
     bool UseAESIntrinsics                         = true                                   {diagnostic} {environment}
openjdk version "11.0.7" 2020-04-14
OpenJDK Runtime Environment (build 11.0.7+10-post-Debian-3deb10u1)
OpenJDK 64-Bit Server VM (build 11.0.7+10-post-Debian-3deb10u1, mixed mode, sharing)

  1. I added it to /etc/jitsi/videobridge/config. Is this the proper place?
  2. ps aux | grep jvb shows that jvb runs java with a lot of flags that are specified neither in /etc/jitsi/videobridge/config nor in /etc/init.d/jitsi-videobridge2. Where do they come from?
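
For anyone who wants to compare these flags across machines, here is a small sketch that extracts the AES-related entries from `-XX:+PrintFlagsFinal` output. The regular expression and the function name `parse_jvm_flags` are my own; the sample text is shortened from the output above. The column layout is assumed to match what is shown in this comment and may differ on other JVM versions.

```python
import re

# Matches lines such as:
#   "     bool UseAES      = true      {product} {default}"
# Groups: type, flag name, value. ":=" is accepted as well as "=".
FLAG_LINE = re.compile(r"^\s*(\w+)\s+(\w+)\s+:?=\s+(\S*)\s+\{")

# Shortened from the output shown above.
SAMPLE_OUTPUT = """\
Picked up _JAVA_OPTIONS: -XX:+UnlockDiagnosticVMOptions -XX:+UseAESIntrinsics
     bool UseAES                                   = true                                      {product} {default}
     bool UseAESCTRIntrinsics                      = true                                   {diagnostic} {default}
     bool UseAESIntrinsics                         = true                                   {diagnostic} {environment}
openjdk version "11.0.7" 2020-04-14
"""


def parse_jvm_flags(output: str, name_filter: str = "aes") -> dict:
    """Return {flag_name: value} for flags whose name contains name_filter."""
    flags = {}
    for line in output.splitlines():
        match = FLAG_LINE.match(line)
        if match and name_filter.lower() in match.group(2).lower():
            flags[match.group(2)] = match.group(3)
    return flags


print(parse_jvm_flags(SAMPLE_OUTPUT))
```

To use it on a live system, the text could come from running `java -XX:+PrintFlagsFinal -version` and capturing standard output.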

bgrozev commented 4 years ago

As I've found, it's just not enabled in OpenJDK on Debian Buster:

It uses OpenSSL. See jitsi-srtp and SrtpPerfTest in particular.

JonathanLennox commented 4 years ago

Yes, x86_64 Linux platforms should be using OpenSSL's libcrypto for their SRTP cryptography. If you see

INFO: OpenSslWrapperLoader.<clinit>#46: jitsisrtp successfully loaded

in your logs, it was loaded successfully.

(Note the library is loaded on-demand when SRTP is used, so this log message will only be emitted when you start an actual conference.)
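
A minimal sketch of that check, for anyone who wants to script it. The marker string is taken from the log line quoted above; the function name is hypothetical, and the log path in the comment is the one mentioned elsewhere in this thread and may differ on your installation.

```python
# Substring of the log line quoted above; the class name and line number
# in front of it may change between versions, so only the message is matched.
MARKER = "jitsisrtp successfully loaded"


def openssl_srtp_loaded(log_lines) -> bool:
    """Return True if any log line reports that the native SRTP library loaded."""
    return any(MARKER in line for line in log_lines)


# Example with an in-memory log; on a real system you might instead pass
# an open file such as /var/log/jitsi/jvb.log (path is an assumption).
example_log = [
    "INFO: some unrelated startup message",
    "INFO: OpenSslWrapperLoader.<clinit>#46: jitsisrtp successfully loaded",
]
print(openssl_srtp_loaded(example_log))
```

Remember that the message only appears once a conference has actually started, so an empty result on an idle bridge does not prove anything.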

532910 commented 4 years ago

I see INFO: OpenSslWrapperLoader.<clinit>#46: jitsisrtp successfully loaded in /var/log/jitsi/jvb.log regardless of whether -XX:+UseAESIntrinsics is added to /etc/jitsi/videobridge/config or not.

So it either does not use HW AES (via OpenSSL) or the CPU load is not due to encryption.

532910 commented 4 years ago

-XX:+UseAESIntrinsics changes nothing

robojones commented 4 years ago

So it's not caused by the encryption?

robojones commented 4 years ago

This must be some kind of bug.

532910 commented 4 years ago

Is there any way to clarify what's really happening and why it eats CPU?

saghul commented 4 years ago

Yes, you can use the performance monitor in the dev tools. We know the audio levels are one of the offenders and will be working to make it better.

532910 commented 4 years ago

performance monitor in the dev tools

Saúl, sorry, I can't find it, could you point me to it?

But I found Jitsi Videobridge Performance Evaluation, that says

On a plain Xeon server (like this one) that you can rent for about a hundred dollars, for about 20% CPU you will be able to run 1000+ video streams using an average of 550 Mbps! Check the graph below!

So this issue is definitely a bug.
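
The gap between the quoted evaluation and the report at the top of this issue can be put into numbers. This is only a back-of-the-envelope sketch: it assumes that "video streams" in the evaluation means forwarded streams, that a 3-person meeting forwards 3 x 2 = 6 streams, and that the two CPU percentages are measured on the same scale, none of which is confirmed here.

```python
def cpu_per_stream(cpu_percent: float, streams: int) -> float:
    """CPU percentage spent per forwarded stream."""
    return cpu_percent / streams


# Figures quoted from the performance evaluation: 20% CPU for 1000 streams.
quoted = cpu_per_stream(20.0, 1000)

# Figures from this issue: 50% CPU with 3 participants.
# Assumption: each participant receives the other two, so 6 forwarded streams.
participants = 3
reported = cpu_per_stream(50.0, participants * (participants - 1))

print(f"evaluation: {quoted:.3f}% per stream")
print(f"this issue: {reported:.3f}% per stream")
print(f"ratio:      {reported / quoted:.0f}x")
```

Even if the assumptions are off by a wide margin, a difference of this size is hard to explain by ordinary variation between servers.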

saghul commented 4 years ago

Oh, sorry, I'm an idiot, I thought this was the electron app repo :-(

532910 commented 4 years ago

Could you help me to troubleshoot this please?

awlx commented 4 years ago

You could generate a flamegraph with perf and look where the CPU time is wasted.

For us it looks like this and most time is spent on Network_RX.

Screenshot 2020-06-24 at 10 10 03

532910 commented 4 years ago

Where can perf be found?

awlx commented 4 years ago

It's a Linux tool to do exactly this job. https://medium.com/@maheshsenni/java-performance-profiling-using-flame-graphs-e29238130375
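
As a rough outline of what such a profiling run involves, the sketch below only builds the command lines and does not execute anything. It assumes `perf` is installed and that the `stackcollapse-perf.pl` and `flamegraph.pl` scripts from the FlameGraph project are on the PATH; the file names, the 99 Hz sampling rate and the helper name are illustrative choices. Resolving Java method names usually needs extra setup that is not covered here.

```python
import shlex


def flamegraph_commands(pid: int, seconds: int = 30, hz: int = 99) -> list:
    """Return shell command strings for a basic perf-based flame graph run."""
    # Sample stacks of the given process at `hz` Hz for `seconds` seconds.
    record = [
        "perf", "record", "-F", str(hz), "-g", "-p", str(pid),
        "-o", "perf.data", "--", "sleep", str(seconds),
    ]
    return [
        shlex.join(record),
        # Dump the recorded samples as text.
        "perf script -i perf.data > out.perf",
        # Fold the stacks and render the SVG (FlameGraph project scripts).
        "stackcollapse-perf.pl out.perf > out.folded",
        "flamegraph.pl out.folded > graph.svg",
    ]


# 1234 is a placeholder PID; use the PID of the videobridge java process.
for command in flamegraph_commands(pid=1234):
    print(command)
```

The printed commands can be reviewed and then run by hand, typically as root, while a conference is active.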

awlx commented 4 years ago

Here is a flame graph for a videobridge with 26 users and a load of 0.99 on an 8-core machine:

graph.svg.zip

awlx commented 4 years ago

So if we zoom in, we see that most time is consumed by packet processing:

Screenshot 2020-06-24 at 10 38 29

robojones commented 4 years ago

@awlx So, what does this mean?

awlx commented 4 years ago

It means that most time is spent in packet processing, as expected. You could in principle rewrite that part to use a more efficient stack such as XDP, but that would be a huge effort, and your system would still be constrained by network hardware and interrupt performance.

So basically the advice is: use dedicated hardware or VMs that are not overbooked, and it performs well. The same goes for everything that deals with high packet rates.

https://developers.redhat.com/blog/2018/12/06/achieving-high-performance-low-latency-networking-with-xdp-part-1/
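
The point about packet rates can be illustrated with simple arithmetic: at the same bandwidth, smaller packets mean many more packets per second, and each packet carries a fixed processing cost. The packet sizes below are illustrative assumptions (1500 bytes for bulk transfers with full-size packets, 200 bytes as a stand-in for small media packets), not values measured on a videobridge.

```python
def packets_per_second(bitrate_bits_per_s: float, packet_bytes: int) -> float:
    """Number of packets per second needed to carry a bitrate at a packet size."""
    return bitrate_bits_per_s / (8 * packet_bytes)


BITRATE = 500e6  # 500 Mbit/s, the file-serving figure mentioned earlier in the thread

# Packet sizes are assumptions for illustration only.
for label, size in (("bulk transfer, 1500-byte packets", 1500),
                    ("small media packets, 200 bytes", 200)):
    print(f"{label}: {packets_per_second(BITRATE, size):,.0f} packets/s")
```

With these assumed sizes the same 500 Mbit/s needs about 7.5 times as many packets, which is why throughput alone is a poor predictor of CPU load for media forwarding.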

pdarcos commented 4 years ago

Is this the bug that was mentioned in the community call last Monday or is this something different? https://www.youtube.com/watch?v=XjHKp_rxAd0

tom666-debug commented 3 years ago

Same observation here. We are running Jitsi in a dockerized environment behind an Apache reverse proxy.

With just 2 users, the CPU load for the jvb container is around 2%. As soon as a 3rd user connects, the CPU load jumps to about 50%.

Looking at the connection details, I can see that the connection uses UDP port 10000 as soon as more than 2 users are connected. With just 2 users, some other high UDP ports (5xxxx or 6xxxx) are used. I'm wondering whether this is connected to the CPU load.

Has anyone already identified the culprit or a solution for this and can point me in the right direction?

532910 commented 2 years ago

Finally, I did it. The instructions on Medium from the comment above were too complicated, so I found another guide that needs only 5 commands to get an SVG flame graph:

532910 commented 2 years ago

perf_flame