jitsi / jibri

Jitsi Broadcasting Infrastructure
Apache License 2.0

Ffmpeg eats all the memory and crashes within a minute - recording or streaming #269

Closed Arzar closed 2 years ago

Arzar commented 4 years ago

Description

When I start a recording or a streaming session on Jitsi, the recording/stream stops in less than a minute and my whole server becomes slow and unresponsive.

With top, I could pinpoint the culprit: ffmpeg. It eats memory away very quickly; in less than a minute my 8GB are filled.

Attached is the jibri log from when I tried a streaming session. Nothing stands out to me. I stopped the stream after 15 seconds and ffmpeg was already at 40% memory.

Also, if I stop prosody, jicofo, jvb and jibri completely, log in as the jibri user and start ffmpeg myself using the command I found in log.0.txt, I get the same issue: the CPU shoots to 150% and the memory keeps growing. I have to kill ffmpeg before it saturates the memory.

ffmpeg -y -v info -f x11grab -draw_mouse 0 -r 30 -s 1280x720 -thread_queue_size 4096 -i :0.0+0,0 -f alsa -thread_queue_size 4096 -i plug:bsnoop -acodec aac -strict -2 -ar 44100 -c:v libx264 -preset veryfast -maxrate 2976k -bufsize 5952k -pix_fmt yuv420p -r 30 -crf 25 -g 60 -tune zerolatency -f flv rtmp://a.rtmp.youtube.com/live2/aaa

If I remove every parameter related to sound from this ffmpeg command line, i.e. -f alsa -thread_queue_size 4096 -i plug:cloop -acodec aac, the memory saturation issue goes away and memory usage is stable. So it clearly seems to be related to the sound. How can I debug this kind of issue?
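One way to observe the leak while reproducing it (a generic sketch, not jibri-specific; the helper names and log path are made up) is to sample ffmpeg's resident set size over time:

```shell
#!/bin/bash
# sample_rss PID: print the resident set size (RSS, in KB) of a process
sample_rss() {
    ps -o rss= -p "$1" | tr -d ' '
}

# watch_rss PID: append "timestamp rss" to a log once per second until the
# process exits, so the growth curve can be inspected afterwards
watch_rss() {
    local pid=$1
    while kill -0 "$pid" 2>/dev/null; do
        echo "$(date +%s) $(sample_rss "$pid")" >> /tmp/ffmpeg-rss.log
        sleep 1
    done
}

# Example: watch_rss "$(pgrep -o ffmpeg)"
```

A steadily increasing RSS column points at a real leak or unbounded buffering, as opposed to kernel page cache growth.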

Possible Solution


Steps to reproduce


Environment details

Ubuntu 16, followed the instructions on GitHub

lsmod | grep snd_aloop
snd_aloop              24576  0
snd_pcm               106496  1 snd_aloop
snd                    81920  3 snd_aloop,snd_timer,snd_pcm

jibri@JibriTestSrv:/root$ aplay -l
**** List of PLAYBACK Hardware Devices ****
card 0: Loopback [Loopback], device 0: Loopback PCM [Loopback PCM]
  Subdevices: 8/8
  Subdevice #0: subdevice #0
  Subdevice #1: subdevice #1
  Subdevice #2: subdevice #2
  Subdevice #3: subdevice #3
  Subdevice #4: subdevice #4
  Subdevice #5: subdevice #5
  Subdevice #6: subdevice #6
  Subdevice #7: subdevice #7
card 0: Loopback [Loopback], device 1: Loopback PCM [Loopback PCM]
  Subdevices: 8/8
  Subdevice #0: subdevice #0
  Subdevice #1: subdevice #1
  Subdevice #2: subdevice #2
  Subdevice #3: subdevice #3
  Subdevice #4: subdevice #4
  Subdevice #5: subdevice #5
  Subdevice #6: subdevice #6
  Subdevice #7: subdevice #7

browser.0.txt log.0.txt ffmpeg.0.txt asoundrc.txt

rfdparker commented 4 years ago

We seem to have hit the same (or a similar) issue with Jitsi/Jibri on Debian 10: ffmpeg's memory usage increases indefinitely until all real memory and swap are exhausted and the kernel OOM killer kills the ffmpeg processes.

That said, I've not yet tried running without passing ffmpeg the arguments -f alsa -thread_queue_size 4096 -i plug:cloop -acodec aac as suggested. How does one override the arguments Jibri uses to start ffmpeg?

Arzar commented 4 years ago

memory usage of ffmpeg increasing indefinitely until all real memory and swap is used until the kernel OOM killer kills the ffmpeg processes.

That seems to be it! Also, a third person reported the same on the forum.

How does one override what arguments Jibri uses to start ffmpeg?

In the general case, I think you need to modify the jibri source code. In my case, just for testing purposes, I did the following quick hack: rename /usr/bin/ffmpeg to /usr/bin/ffmpeg-original and then create the following /usr/bin/ffmpeg script

#!/bin/bash
# Log the original arguments, strip the ALSA/audio options, then run the real ffmpeg
PARAMS="$@"
echo "$PARAMS" > /tmp/param.txt
PARAMNOSOUND="$(echo "$PARAMS" | sed 's/-f alsa.*aac//' | sed 's/ffmpeg//g')"
echo "$PARAMNOSOUND" >> /tmp/param.txt
# Intentionally unquoted so the remaining arguments are word-split
ffmpeg-original $PARAMNOSOUND

So it's not a solution by any means (jibri can't even stop ffmpeg at the end of a session any more), just a quick check to confirm that sound is the issue.

bbaldino commented 4 years ago

~Do you have an ffmpeg log of when this happens? We might be seeing some instances of this as well.~ Oops, read right past the attachments. The failure mode in those logs is different from what I was thinking of, though (and I definitely haven't seen anything like this after such a short amount of time).

ec-blaster commented 4 years ago

I have the exact same issue. We are using an independent machine for Jibri. Debian 10, 2 vCPUs and 8GB RAM. It eats up all the memory and all the swap, and ffmpeg crashes a while after all the memory is taken.

NBoESFWbVaf commented 4 years ago

Same here. Virtual server with Ubuntu 18.04, 4 CPUs, 8GB RAM. Very interestingly, if I set "disableThirdPartyRequests: true," (Gravatar) in /etc/jitsi/meet/meet.mydomain.com-config.js, my memory usage is stable.

Can anybody confirm this?

ec-blaster commented 4 years ago

I can confirm that, when you disable third party requests, the memory usage seems to be stable. I did a test for about 10 minutes and it stayed below 1GB. Thanks!

NBoESFWbVaf commented 4 years ago

OK, but that seems to have been only the beginning. After 26-27 min, the memory went through the roof, from 1.5GB to 8GB plus swap. Same stream, no interaction, hardly any sound.

rfdparker commented 4 years ago

Same here. Virtual server with Ubuntu 18.04, 4 CPUs, 8GB RAM. Very interestingly, if I set "disableThirdPartyRequests: true," (Gravatar) in /etc/jitsi/meet/meet.mydomain.com-config.js, my memory usage is stable.

Can anybody confirm this?

We have tried setting disableThirdPartyRequests: true, however it did not seem to resolve the issue, unfortunately.

rfdparker commented 4 years ago

memory usage of ffmpeg increasing indefinitely until all real memory and swap is used until the kernel OOM killer kills the ffmpeg processes.

That seems to be it! Also, a third person reported the same on the forum.

How does one override what arguments Jibri uses to start ffmpeg?

In the general case, I think you need to modify the jibri source code. In my case, just for testing purposes, I did the following quick hack: rename /usr/bin/ffmpeg to /usr/bin/ffmpeg-original and then create the following /usr/bin/ffmpeg script

#!/bin/bash
# Log the original arguments, strip the ALSA/audio options, then run the real ffmpeg
PARAMS="$@"
echo "$PARAMS" > /tmp/param.txt
PARAMNOSOUND="$(echo "$PARAMS" | sed 's/-f alsa.*aac//' | sed 's/ffmpeg//g')"
echo "$PARAMNOSOUND" >> /tmp/param.txt
# Intentionally unquoted so the remaining arguments are word-split
ffmpeg-original $PARAMNOSOUND

So it's not a solution by any means (jibri can't even stop ffmpeg at the end of a session any more), just a quick check to confirm that sound is the issue.

Thanks for the suggestion. We have tried a similar script (in /usr/local/bin) to remove the unwanted ffmpeg parameters. That does seem to stop the memory usage growing, for both recording and streaming. However, with that in place YouTube does not report receiving a stream (of course a stream with no sound would not be very useful anyhow).

NBoESFWbVaf commented 4 years ago

OK, here's another try. I've upgraded the ffmpeg version that ships in the Ubuntu 18.04 repo, from

ffmpeg version 3.4.6-0ubuntu0.18.04.1 Copyright (c) 2000-2019 the FFmpeg developers built with gcc 7 (Ubuntu 7.3.0-16ubuntu3)

up to...

ffmpeg version 4.2.2-1ubuntu1~18.04.york0 Copyright (c) 2000-2019 the FFmpeg developers built with gcc 7 (Ubuntu 7.5.0-3ubuntu1~18.04)

with

sudo apt install software-properties-common
sudo add-apt-repository --yes ppa:jonathonf/ffmpeg-4
sudo apt update
sudo apt install ffmpeg

Memory stays around 1.7GB most of the time. Sometimes it goes up to 3GB for no apparent reason, but quickly returns to 1.7GB.

Edit: CPU load is 5-10% lower.

NBoESFWbVaf commented 4 years ago

After 30 min, out of memory and swap for no reason. Incredible.

NBoESFWbVaf commented 4 years ago

OK, I'm giving up for now. Video recording is a great feature, but not particularly important for me at the moment. A 50-minute stream was no problem, but after restarting the server it no longer works properly. I've set it back up as a working video chat without recording/streaming. Bye guys, and good luck.

ec-blaster commented 4 years ago

For us, it's especially important. Today I tried the parameter you sent yesterday and yes, it's stable for about 25-30 min and then crashes again with full memory. Today I'll try to compile my own ffmpeg...

NBoESFWbVaf commented 4 years ago

I hadn't thought of compiling it myself. That way you should also be able to set your own parameters. Definitely exciting. I'll keep my fingers crossed.

snoopytfr commented 4 years ago

Hi,

I have the same problem: after a while the memory fills up and ffmpeg crashes.

ec-blaster commented 4 years ago

I finally compiled the latest version of ffmpeg and now everything is fine. I made a test recording of a whole hour and the memory usage stayed at about 3GB, with no other incident nor any warning or error in the logs. I think the combination of Jibri and the ffmpeg version bundled with Debian Buster is the culprit.

rfdparker commented 4 years ago

I finally compiled the latest version of ffmpeg and now everything is fine. I made a test recording of a whole hour and the memory usage stayed at about 3GB, with no other incident nor any warning or error in the logs. I think the combination of Jibri and the ffmpeg version bundled with Debian Buster is the culprit.

It's good to hear you found a way to resolve this, although we haven't been able to replicate your fix so far.

Which version of ffmpeg did you compile? Is this still on Debian 10?

We too are running Debian 10 (buster). Rather than actually compiling ffmpeg, we tried the FFmpeg static builds, placing the static binaries (ffmpeg, ffprobe and qt-faststart) into /usr/local/bin so that they take precedence over those of the ffmpeg Debian package (which is installed as a dependency of the jibri package).

Firstly we tried release 4.2.2, which seemed to result in the same behaviour as before: the ffmpeg process would use increasingly more memory until both real memory and swap were exhausted and the kernel OOM killer killed ffmpeg.

Secondly we tried the git master build dated 20200324 (ffmpeg -version reports its version as N-52056-ge5d25d1147-static). This appeared to stabilise the memory usage of the ffmpeg process and stop it growing indefinitely. However, we seemed to get a new issue where the memory usage of the Xorg process (belonging to jibri-xorg.service) grows indefinitely, akin to what the ffmpeg process was previously doing. Ultimately the kernel OOM killer kills Xorg.

As an aside, before your last comment I'd thought the ffmpeg version was unlikely to be the cause, given that we seem to have a variety of versions floating around, including those in the repos of Ubuntu 16.04, Ubuntu 18.04 and Debian 10 as well as some from third-party repos. In addition, the install guide at https://github.com/jitsi/jibri refers to adding a repo to get a newer ffmpeg version when using Ubuntu 14.04 (which has been EOL since last year), but that only provides ffmpeg version 3.4.0, which is older than the versions in the repos of Ubuntu 18.04 and Debian 10.

ec-blaster commented 4 years ago

Which version of ffmpeg did you compile? Is this still on Debian 10?

I compiled the latest ffmpeg available as of yesterday, following the guide at:

https://trac.ffmpeg.org/wiki/CompilationGuide/Ubuntu

We are on Debian Buster, and I used the following settings:

The resulting ffmpeg version is N-97465-g4e81324

I just recorded a session for an entire hour, and it was very stable. Used memory stayed at 3.03 GB the whole time (the machine has 8GB) and the recording was fine. Hope it helps.

ec-blaster commented 4 years ago

I have now tested streaming to YouTube, and it crashed again. Memory full.

pdarcos commented 4 years ago

Has this been fixed?

I was about to install jibri, but after reading this I'm a bit wary of how much memory is actually needed.

Anyone have a working setup? If so, what hardware specs?

Thanks

pdarcos commented 4 years ago

To answer my own question in order to help others, no it hasn't been fixed.

I recorded a 1.5 hour long conference yesterday and I had a bunch of errors like the ones described here. I'll see if I can dig into the logs when I have some time and post them here.

This is very disappointing as there doesn't seem to be any solution available.

igorayres-falepaco commented 4 years ago

I'm having this same problem...

VengefulAncient commented 4 years ago

We have the same issue on Kubernetes (GKE), but strangely enough, not on Docker. The Kubernetes Jibri deployment managed to OOM a node with 6.5 GB RAM with a single livestream, while the Docker deployment is running at 1.15 GB. Both use exactly the same ffmpeg version, since they use the same Jibri docker image.

rfdparker commented 4 years ago

We may have found a solution/workaround for this, although it's a surprising one.

In a nutshell, it's using a Java 8 JRE instead of a Java 11 JRE.

This despite the high memory usage we've seen being in either FFmpeg or Xorg, neither of which is a Java process. The Jibri service runs in a JRE, though.

We came across this when looking for differences between the Jibri Docker image and our 'native' (i.e. not container-based) installation on Debian 10. The Jibri Docker image is based on Debian 9, where default-jre pulls in openjdk-8-jre (an OpenJDK 8 JRE), whereas on Debian 10 it pulls in openjdk-11-jre (an OpenJDK 11 JRE).

Another clue was Woodworker_Life's "How-to to setup integrated Jitsi and Jibri for dummies, my comprehensive tutorial for the beginner" thread on the Jitsi Community Forum. There Woodworker_Life states outright that Jibri won't work with a Java 11 JRE, although not specifically how or why.

There is no OpenJDK 8 JRE in the Debian 10 repos, so, as in Woodworker_Life's thread, we used the OpenJDK 8 JRE package for Debian from the AdoptOpenJDK project.

So far we've made several long attempts at recording and streaming (to YouTube), albeit mostly with only 2 participants. At least one ran for over an hour. In none of those cases have we had memory issues or crashes. Memory seems to hold steady, sometimes as low as 6xx MB, and always below 2 GB so far. That said, for real-world use where we'll have more participants (albeit < 10) we won't skimp on memory for the VM (we'll have 6 GB).

We'll shortly be building and working with the server for 'real world' use, so we'll update if we again encounter issues ‒ but, fingers crossed, it's looking good!

sanvila commented 4 years ago

My case: Debian 10 with Java 8. Test: youtube live streaming. Machines from GCE.

2 vCPUs and 6GB RAM -> eventual crash
4 vCPUs and 4GB RAM -> smooth and low memory usage (below 1GB as reported by grafana)

My theory is that with a low number of CPUs, ffmpeg cannot process data as fast as it needs to.

ec-blaster commented 4 years ago

We had a very frustrating experience this week with Jibri.

We had tested it with our latest specifications (Debian 10, Jre 8, 4 vCPUs and 12GB RAM) for several recordings that went well, so we decided to go on and scheduled a very important meeting to stream it to YouTube.

The meeting had 27 participants, 14 of them visible (LastN=14), and was going to be very long (about 5 hours). As the meeting was heading into its first hour, we had to stop Jibri because all the memory (12GB) and swap (8GB) had been eaten. After several attempts at stopping and relaunching the streaming (we have 2 cloned jibris), we had to go to 24GB and 6 vCPUs, and then everything went smoothly... But the damage to our reputation was already done...

bbaldino commented 4 years ago

We had a very frustrating experience this week with Jibri.

Have you searched around on the ffmpeg side for this? It seems like such an odd bug. I don't think it can be a fundamental parameter issue, as we're not seeing this consistently; maybe it's a combination of some setting we pass to ffmpeg and some network condition?

agustinramirez commented 4 years ago

We have the same issue. Our configuration is Ubuntu Server 16.04, ffmpeg 2.8.15-0ubuntu0.16 and jibri 8.0.30. Any solution? By the way, in another environment with the same ffmpeg and Ubuntu versions this does not happen; the difference is that the Jibri version in that environment is 7.2.71-1.

kpeiruza commented 4 years ago

We've found the same issue on a local Kubernetes cluster.

It's an Ubuntu 18.04-based cluster, with Jibri compiled 6 weeks ago from the testing release.

Jibri was recording directly into a NFS folder.

In our case, the kernel wasn't flushing cache memory quickly enough, so the OOM killer got triggered. We fixed that by moving the recordings folder to a local folder on the node and then moving finished recordings to NFS.

This way the kernel behaves properly and never fills the machine, even though RAM consumption is still huge anyway (cache, not RSS).

What's weird is that the exact same Jibri deployment works fine on other Kubernetes clusters, so maybe it's related to something else (base kernel, CPU power...).

Still investigating.

In any case, you can try increasing the cache pressure in your kernels to avoid filling up your memory: vm.vfs_cache_pressure=150 or 200.
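That sysctl suggestion can be applied like this (standard sysctl usage; the drop-in filename is just an example):

```shell
# Make the kernel reclaim dentry/inode caches more aggressively
# (the default vm.vfs_cache_pressure is 100; 150 or 200 as suggested above)
sudo sysctl -w vm.vfs_cache_pressure=150

# Persist across reboots (example filename)
echo 'vm.vfs_cache_pressure=150' | sudo tee /etc/sysctl.d/99-cache-pressure.conf
sudo sysctl --system
```

Note this only helps when the pressure comes from page/dentry cache, not from an ffmpeg process whose RSS itself is growing.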

agustinramirez commented 4 years ago

@kpeiruza I increased the cache pressure in my Ubuntu kernels and it didn't work. This problem only occurs when using a virtual machine locally on our server; we have tried EC2 instances and the problem does not happen there. With the same versions of jibri, ffmpeg and Ubuntu Server both locally and in AWS, jibri crashes on local virtual machines but works fine on EC2 instances.

nguyentthai96 commented 4 years ago

I have the same issue; the server stops a few minutes into a YouTube live stream. I installed Jitsi (all on one server) in the Azure cloud.

igorayres-falepaco commented 4 years ago

Hello! Is there any solution for this problem?

VengefulAncient commented 4 years ago

I can confirm that setting disableThirdPartyRequests: true is the only thing that fixed this for us. Running a livestream on a Jibri with 2 vCPUs and 4 GB RAM results in 60-90% CPU usage on all cores and 1.2-1.5 GB RAM usage; 4 vCPUs and 4 or more GB RAM gives 30-45% CPU usage on all cores, with RAM usage the same as in the previous configuration.

Without this option, it eats up all RAM and crashes no matter the hardware. Curiously, this seems to only happen on Kubernetes (GKE) and not pure Docker on a VPS. (Jitsi team: you can't dismiss Kubernetes as "community supported" forever. People are moving on from pure Docker. You should too.)

agustinramirez commented 4 years ago

I can confirm that setting disableThirdPartyRequests: true is the only thing that fixed this for us.

I already tried this solution for recording the video conference and it still does not solve the problem. My configuration is Ubuntu 16.04 in a virtual machine with 4 cores and 8 GB RAM. Any other solution?

pdarcos commented 4 years ago

Has there been any real update to this issue?

It seems that it affects everyone, including the jitsi-meet deployment. Just watch the video of last week's Jitsi community call and you'll see how painful it is to watch, with the recorder constantly crashing. This is exactly what we all seem to be experiencing, yet there's still no real solution, or even a good idea of what the root of the problem actually is.

Has the jitsi team managed to find a fix for their own deployment yet?

kpeiruza commented 4 years ago

Hi @pdarcos

I'm not from the Jitsi team but, TBH, it's quite difficult to reproduce across different environments. We've been using our own compiled Jibri without issues in K8s on Scaleway and Azure (to be tested this week on GKE), but on vSphere the same Docker image eats all the RAM and crashes the pod.

I'm going to try some tweaks to the ffmpeg settings as well as /dev/shm, because the first thing we noticed is the memory leak.

I hope we can find a solution that works across all environments. In the meantime, feel free to use my Docker image, kpeiruza/jibri:v12, built with ffmpeg 2 months ago, and tell me if you experience the same issues.

Here I attach a simple deployment.yaml for K8s, so you can see which variables you need to provide to the Docker image.

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: jibri
    flavour: allinone
  name: jibri
  namespace: jitsi
spec:
  progressDeadlineSeconds: 600
  replicas: 4
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: jibri
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: jibri
      name: jibri
    spec:
      containers:

Regards,

Kenneth

kpeiruza commented 4 years ago

PS: working against a local volume improved RAM consumption. We gained about 20 minutes.

PPS: Scaleway uses KVM virtualization and I guess Azure uses Hyper-V, so it looks like the problem arises on Oracle Cloud or AWS (has meet.jit.si already been migrated to Oracle?) and vSphere.

pdarcos commented 4 years ago

Hi @kpeiruza, thanks for the share. I'm using jibri on a Hetzner CX31 cloud server (KVM) with 2 vCPUs and 8GB RAM running Debian 10, and the behavior is very inconsistent (not using the Docker images in my case). Sometimes I'm able to record meetings that last for hours without any problems, but often it keeps crashing randomly and repeatedly and I can't even record a few minutes. Very frustrating, as it is very hard to pinpoint the source of the problem.

Cheers

agustinramirez commented 4 years ago

any solution?

ec-blaster commented 4 years ago

Our only real solution was to increase the number of vCPUs. Going from 4 to 8 CPUs was the fix we used last time. It seems as if ffmpeg begins eating memory when the CPUs aren't giving it enough power...

agustinramirez commented 4 years ago

Our only real solution was to increase the number of vCPUs. Going from 4 to 8 CPUs was the fix we used last time. It seems as if ffmpeg begins eating memory when the CPUs aren't giving it enough power...

A question: how much RAM do you use with the configuration you mention?

pdarcos commented 4 years ago

Our only real solution was to increase the number of vCPUs. Going from 4 to 8 CPUs was the fix we used last time. It seems as if ffmpeg begins eating memory when the CPUs aren't giving it enough power...

A question: how much RAM do you use with the configuration you mention?

Good question. I'd also like to know how much RAM @ec-blaster is using on his deployment.

starkwiz commented 4 years ago

Our only real solution was to increase the number of vCPUs. Going from 4 to 8 CPUs was the fix we used last time. It seems as if ffmpeg begins eating memory when the CPUs aren't giving it enough power...

Wow, thank you so much for pointing that out: it's the CPU that causes FFMPEG to use more RAM. It completely makes sense as well: if the CPU is insufficient, FFMPEG just keeps putting frames into a queue to process, buffering them, and as the queue grows, so does the RAM usage; finally it breaks when it reaches the maximum memory capacity.

I've been struggling to solve this issue for my setup. Video quality isn't a concern, so I tweaked the ffmpeg encoding options until CPU usage no longer consistently hit above 95%. My setup has to work with just 2 vCPUs, as more cores increase AWS costs significantly; on top of that, my region doesn't have access to c5a instances.

I played around with many settings and noticed the following:

  1. Worked: tried 720p HD recording with the ultrafast preset for x264, which is lighter on CPU, but file sizes are literally double those of the veryfast preset, around 10 to 12 MB/minute. This did help with getting a stable recording, but at a very high storage cost. If you have enough resources to separately re-encode the MP4 files to a smaller size, you can still get great quality.
  2. Partially worked: the blurred video background shown when a participant is using a phone in portrait mode appeared to put a lot of pressure on FFMPEG and the CPU, so I disabled the blurred video background for portrait-mode participants.
  3. Not sufficient for 2 vCPUs: tried reducing the frame rate from 30 to 24 for 1280x720, but it doesn't seem to help much with lowering CPU usage.
  4. Best: added a scaling video filter with resolution 854x480, which keeps the 16:9 ratio so there is no image cropping. This helped the most: even with full-screen video, the max CPU utilisation is between 65% and 85%, which I think is awesome. I also consistently noticed that memory usage doesn't even go above 750 MB for the whole jibri instance.
  5. Important: use Google Chrome 78, as it's much lighter on CPU and RAM than newer versions. On Ubuntu 20.04, Google Chrome 78 (and indeed any browser version short of the latest) crashes because of some changes in Ubuntu 20.04, so sticking to Ubuntu 18.04 is a good idea, at least for jibri recording.

I am able to record with this configuration on a t3a.small AWS instance, which has just 2 vCPUs and 2 GB RAM. I don't think video recording can go any cheaper than this while maintaining the 16:9 ratio. And if you need HD or Full HD recording, 4 vCPUs are required; there is no way around it unless you go with the ultrafast preset plus a lot of storage. A summary of the above, for a very stable Jibri recording setup that maintains a 16:9 aspect ratio:

  1. AWS instance with just 2 vCPUs and 2 GB RAM, or any equivalent, should do.
  2. OS: Ubuntu 18.04 x64
  3. Software: Google Chrome 78 + ChromeDriver 78 + JRE 8
  4. Disable the video background for portrait mode in the Jitsi Meet configuration.
  5. Add a scaling video filter in FFMPEG, scaling to 854x480.

I didn't re-compile the jar file to modify the ffmpeg settings; instead I modified the parameters on the fly by creating an ffmpeg wrapper script, which seems to work flawlessly.

I hope this helps anyone looking for a solution to this issue.

Let me know if you have queries.

pdarcos commented 4 years ago

Thanks for the detailed explanation @starkwiz Very helpful.

Do you mind sharing your ffmpeg script? I'd like to try out your suggestions on a 2 vCPU Hetzner instance.

starkwiz commented 4 years ago

Thanks for the detailed explanation @starkwiz Very helpful.

Do you mind sharing your ffmpeg script? I'd like to try out your suggestions on a 2 vCPU Hetzner instance.

/usr/local/bin/ffmpeg

#!/bin/bash
echo ffmpeg in $0 # Comment this line out after making sure that running ffmpeg points to this script.
ARGS="$@"
# Inject a scale filter after -tune zerolatency (the scale filter expects w:h syntax)
ARGS=$(echo "$ARGS" | sed 's/-tune zerolatency/-tune zerolatency -vf scale=854:480/') # Scale video to 854x480
exec /usr/bin/ffmpeg $ARGS

Make sure to set the permissions of the script: chmod 755 /usr/local/bin/ffmpeg

Restart the jibri services so that they pick up the ffmpeg command from the new location.

systemctl stop jibri
systemctl stop jibri-xorg
systemctl start jibri

pdarcos commented 4 years ago

Thanks @starkwiz Much appreciated!

igorayres-falepaco commented 4 years ago

I honestly don't know what else to do...

As soon as I start recording, the processing and memory consumption of my machine goes up!!

Before: (screenshot)

After: (screenshot)

Could someone help me? I can pay to fix this

ec-blaster commented 4 years ago

@pdarcos A lot. I don't remember exactly, but something around 12GB. Really, though, after increasing the number of CPUs it only used about 2-3GB.

bbaldino commented 4 years ago

Nice work @starkwiz... I think that probably also explains why we haven't seen this issue but others have (since it comes down to the hardware being used to run Jibri). I'm sure there's also a parameter we could pass to limit that queue size, though that would likely trade off against other problems; still, it could be interesting to experiment with.
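One knob along those lines is -thread_queue_size, which Jibri already passes (4096 in the command logged at the top of this thread). An untested sketch in the spirit of the wrapper scripts shared earlier: shrink that queue so the input side stays small and bounded instead of buffering when the encoder falls behind, at the likely cost of dropped frames (512 here is an arbitrary guess, not a tested value):

```shell
#!/bin/bash
# Untested sketch: rewrite the argument string Jibri passes to ffmpeg,
# shrinking -thread_queue_size from 4096 to a smaller, bounded value
rewrite_queue_size() {
    echo "$1" | sed 's/-thread_queue_size 4096/-thread_queue_size 512/g'
}

# Example use as a /usr/local/bin/ffmpeg wrapper:
# exec /usr/bin/ffmpeg $(rewrite_queue_size "$*")
```

Whether a smaller queue actually caps the growth (rather than just changing where frames pile up) would need to be verified experimentally.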

agustinramirez commented 4 years ago

  2. Partially worked: the blurred video background shown when a participant is using a phone in portrait mode appeared to put a lot of pressure on FFMPEG and the CPU, so I disabled the blurred video background for portrait-mode participants.

@starkwiz, how do I disable what you mention in point 2, the blurred video background?