AirenSoft / OvenMediaEngine

OvenMediaEngine (OME) is a Sub-Second Latency Live Streaming Server with Large-Scale and High-Definition. #WebRTC #LLHLS
https://OvenMediaEngine.com/ome
GNU Affero General Public License v3.0

High CPU usage #468

Closed fcqpl closed 2 years ago

fcqpl commented 3 years ago

Hello, I'm worried: is this CPU usage normal? OME has only 5 incoming streams and 1 outgoing. I'm afraid to use OME in production... Other solutions use ~20% CPU at most.

There is only one output profile with Hardware Acceleration enabled.

MobaXterm_KZwTsE7esF OME_config.md

fcqpl commented 3 years ago

And also 30% CPU on the edge... which has only one incoming/outgoing WebRTC stream :/ MobaXterm_h89zMv4jfR

getroot commented 3 years ago

If you upload the ovenmediaengine.log file, I can check if the hardware encoder is working properly. What docker image did you use? And OPUS is not supported by hardware encoders. This is software encoded and costs more than you think.

In the case of Edge, CPU usage probably does not increase significantly even if the number of sessions increases. Please check while increasing the session.

getroot commented 3 years ago

Other solutions use ~20% CPU at most.

@fcqpl Is it true that other solutions use less than 20% of CPU to encode 5 streams in Full HD? (If you encode five streams with ffmpeg, you can see that they use a lot of CPU.)

Bansikov commented 3 years ago

Hello.

Thanks for your work. It's great. I don't want to create a new issue, but I have a question about CPU usage too. I currently use mediasoup for WebRTC; 1 incoming VGA (640x480) connection needs 1% CPU and 1 outgoing connection needs <1% CPU (usage of 1 core of an E5-1620 v2 @ 3.70GHz).

I want to switch to OvenMediaEngine, so I removed all Providers and Publishers from the default configuration and left just WebRTC with bypass encodes. That should work without re-encoding, shouldn't it? It is configured with WebRTC over TCP. But CPU usage is 7-15% for 1 incoming stream and 1-2% for 1 outgoing stream. Why is it so high, and can I optimize it further?

The logs are attached as a file, captured after connecting a stream via https://demo.ovenplayer.com/demo_input.html. OME was installed manually on Debian 11.

logs.txt

fcqpl commented 3 years ago

Hi, please check my results. All tests ran on the same machine, just with different Docker containers.

7 input RTMP streams, transcoding only to 1080p(!), 1 output HLS stream.
40% CPU usage by the OME process.
HLS playback is getting errors:

Provider : setState() stalled
Provider.js:168 Provider : triggerSatatus stalled

https://user-images.githubusercontent.com/46460263/129461030-a8dd3b88-83ca-429e-8aa5-05c93257e4b7.png

OME has only the RTMP provider and the HLS publisher enabled. The OutputProfile uses audio bypass and video transcoded to 1080p by hardware encoding.


7 input RTMP streams, transcoding to 1080p, 720p, 480p, 1 output HLS stream.
70% CPU usage by the OME process and huge RAM usage... 6GB...


7 input RTMP streams to nginx_rtmp, transcoding to 1080p, 720p and 480p (by ffmpeg), 1 output HLS stream (or SLDP stream) with nimble.
25% CPU usage in total for nginx_rtmp, ffmpeg with nvenc and AAC encoding, and nimble_streamer.

fcqpl commented 3 years ago

In the case of Edge, CPU usage probably does not increase significantly even if the number of sessions increases. Please check while increasing the session.

15 clients on the edge cause 65-90% CPU usage :/ 21 clients cause very unstable playback for other users and at least 80% CPU usage. Setup: 1 vCPU DigitalOcean droplet, OVT as input, WebRTC as output. MobaXterm_syeacQtgf4

In my opinion, OME should use as little CPU power as possible to serve as many clients as possible on the weakest hardware server.

getroot commented 3 years ago

Maybe there is a problem with the latest patch. I will look into it and fix the problem.

fcqpl commented 3 years ago

Maybe there is a problem with the latest patch. I will look into it and fix the problem.

Which commit? On the Origin I'm using a self-compiled docker container (Dockerfile with nvenc). On the Edge I'm using docker image v0.12.1.

getroot commented 3 years ago

I do not know yet. I'm running various tests, because 21 clients using almost 100% of the CPU is obviously a problem.

getroot commented 3 years ago

I first tested it on my development server. In order to remove other variables, Video/Audio was bypassed in Origin and received with WebRTC Input over TCP. Edge ran the default docker and tested with WebRTC Streaming over TCP with 32 players attached.

As captured, CPU and memory utilization are within the expected range. image

I'm going to test it out soon to see what happens on a low-performing instance. Is it possible that writing the logs to storage is expensive, or causing memory swap issues etc.? I would be very grateful if you could share any other information you know.

getroot commented 3 years ago

image

This is a capture of CPU/Memory usage from https://space.ovenplayer.com/. This is 6 inputs and 30 playbacks. It is running on AWS t3.nano instance, and it is 1 Origin. (No edge)

The settings are as follows.

<Application>
    <Name>ovenspace</Name>
    <!-- Application type (live/vod) -->
    <Type>live</Type>
    <OutputProfiles>
        <OutputProfile>
            <Name>bypass_stream</Name>
            <OutputStreamName>${OriginStreamName}</OutputStreamName>
            <Encodes>
                <Video>
                    <Bypass>true</Bypass>
                </Video>
                <Audio>
                    <Bypass>true</Bypass>
                </Audio>
            </Encodes>
        </OutputProfile>
    </OutputProfiles>
    <Providers>
        <WebRTC />
        <RTMP />
    </Providers>
    <Publishers>
        <SessionLoadBalancingThreadCount>2</SessionLoadBalancingThreadCount>
        <StreamLoadBalancingThreadCount>4</StreamLoadBalancingThreadCount>
        <WebRTC>
            <Timeout>30000</Timeout>
            <Rtx>false</Rtx>
            <Ulpfec>false</Ulpfec>
        </WebRTC>
    </Publishers>
</Application>

getroot commented 3 years ago

In the case of Edge, CPU usage probably does not increase significantly even if the number of sessions increases. Please check while increasing the session.

15 clients on the edge cause 65-90% CPU usage :/ 21 clients cause very unstable playback for other users and at least 80% CPU usage. Setup: 1 vCPU DigitalOcean droplet, OVT as input, WebRTC as output. MobaXterm_syeacQtgf4

In my opinion, OME should use as little CPU power as possible to serve as many clients as possible on the weakest hardware server.

I think this is a very big problem. What is the difference between your environment and mine? For now I can only guess. If you have any information, please share.

getroot commented 3 years ago

@basisbit Could you please share your experience with this issue? I have not yet experienced excessive CPU usage. I know you recently streamed for thousands of viewers. Have you ever experienced excessive CPU usage?

getroot commented 3 years ago

@fcqpl I'm going to test it on a 1 vCPU DigitalOcean droplet this week.

fcqpl commented 3 years ago

Thanks for checking. My main origin server is a dedicated server running Proxmox on 2x E5-2620. The VM with OME has 16 cores.

I created a CPU-Optimized instance on DO (2 CPU cores / 4 GB Memory / 25 GB Disk):
17% CPU usage on two cores: 1 SRT stream incoming, 21 outgoing via WebRTC
39% CPU usage on two cores: 1 SRT + 5 RTMP incoming, 21 outgoing via WebRTC

The same docker-compose with only airensoft/ovenmediaengine:0.12.1, moved to a cheaper DO instance with a shared 1 vCPU:
42% CPU usage: 1 SRT stream incoming, 21 outgoing via WebRTC
~90% CPU usage: 1 SRT + 5 RTMP incoming, 21 outgoing via WebRTC

Is it possible that writing the logs to storage is expensive, or causing memory swap issues etc.?

Probably not. This is on the DO 1 GB / 1 CPU / 25 GB SSD Disk instance:

dd if=/dev/zero of=test bs=128M count=1 oflag=direct
1+0 records in
1+0 records out
134217728 bytes (134 MB, 128 MiB) copied, 0.414418 s, 324 MB/s

free -m
              total        used        free      shared  buff/cache   available
Mem:            981         288         204           1         489         538
Swap:             0           0           0

dd on my main origin server:

134217728 bytes (134 MB, 128 MiB) copied, 0.322521 s, 416 MB/s

Config for these tests:

<Applications>
    <Application>
        <Name>app</Name>
        <Type>live</Type>
        <OutputProfiles>
            <HardwareAcceleration>true</HardwareAcceleration>
            <OutputProfile>
                <Name>bypass_stream</Name>
                <OutputStreamName>${OriginStreamName}</OutputStreamName>
                <Encodes>
                    <Video>
                        <Bypass>true</Bypass>
                    </Video>
                    <Audio>
                        <Codec>opus</Codec>
                        <Bitrate>192000</Bitrate>
                        <Samplerate>48000</Samplerate>
                        <Channel>2</Channel>
                    </Audio>
                </Encodes>
            </OutputProfile>
        </OutputProfiles>
        <Providers>
            <OVT />
            <RTMP />
            <SRT />
        </Providers>
        <Publishers>
            <StreamLoadBalancingThreadCount>1</StreamLoadBalancingThreadCount>
            <SessionLoadBalancingThreadCount>8</SessionLoadBalancingThreadCount>
            <OVT />
            <WebRTC>
                <Timeout>15000</Timeout>
                <Rtx>false</Rtx>
                <Ulpfec>false</Ulpfec>
            </WebRTC>
        </Publishers>
    </Application>
</Applications>

fcqpl commented 3 years ago

Here are the logs from the 1vCPU Origin on DO: OME_origindo2_logs.md

getroot commented 3 years ago

@fcqpl

I have a few questions to create the same environment.

  1. In the comment above (15 clients on edge makes 65-90% CPU usage //), you said you used OME as an edge. But looking at the settings and descriptions you commented this time, it seems to work as origin. (1 SRT, 5 RTMP inputs, opus encoding) Which one is correct? I am confused. If edge is used as it is, it receives the already encoded stream from origin through ovt and transmits it as it is. So there is no need to re-encode with opus.

  2. In your config, all incoming streams are re-encoded into opus. Does this mean that 6 incoming streams are re-encoded as OPUS, and when 21 players play, the CPU is about 90% used?

  3. In DO, 1vCPU seems to have 1GB or 2GB of memory. How much memory do you have when testing?

fcqpl commented 3 years ago

  1. Here -> https://github.com/AirenSoft/OvenMediaEngine/issues/468#issuecomment-899931219 OME runs as an edge connected to my main origin via OVT. There is no re-encoding. It uses a simple config like this one - https://github.com/AirenSoft/OvenMediaEngine/blob/master/misc/conf_examples/Edge.xml - with only a TLS config added.

  2. Here -> https://github.com/AirenSoft/OvenMediaEngine/issues/468#issuecomment-900104205 is OME as origin

    Does this mean that 6 incoming streams are re-encoded as OPUS, and when 21 players play, the CPU is about 90% used?

Yes. But OPUS encoding isn't the problem... I have now tested on a DO shared 1vCPU: OME with video and audio bypass / 1 SRT and 5 RTMP inputs / 1x HLS output - still ~45% CPU usage, which is too much. MobaXterm_uuoBvGVghl

  3. 1 vCPU / 1 GB Memory / 25 GB Disk chrome_vB7CbJAvwk

We have a non-profit streaming project that requires a delay of at most 4 seconds. That is why I'm testing it on small, cheap instances :)

getroot commented 3 years ago

I'll see if there's a way to optimize for a limited environment like DO's 1vCPU.

basisbit commented 3 years ago

I regularly use Digital Ocean 2 CPU instances and they work well most of the time for up to 150 concurrent WebRTC users @ 1Mb/s of video stream. These instances have shared CPU resources at low priority - sometimes / rarely they have less than 10% of usable CPU time if there are a couple of "dedicated" CPU heavy users on the same host. If you want to do reliable video streaming, you won't get around instances with dedicated CPU resources. However, on Digital ocean, you also share the network interfaces with others and can't buy dedicated network resources. Sometimes your upload speed is 100Mb/s and sometimes it is 1Gb/s. From my experience, you'll want to use a bunch of smaller Digital Ocean droplets so you are less likely to run into upload speed bottleneck. (noisy neighbor issue)
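To put rough numbers on the upload bottleneck described above, here is a minimal Python sketch. The function names are hypothetical, and the calculation ignores protocol overhead and CPU limits, so the results are an upper bound rather than a measurement.

```python
import math


def required_upload_mbps(viewers: int, stream_mbps: float) -> float:
    """Total egress needed to send every viewer one copy of the stream."""
    return viewers * stream_mbps


def max_viewers(upload_mbps: float, stream_mbps: float) -> int:
    """Viewers that fit into the uplink, ignoring protocol overhead."""
    if stream_mbps <= 0:
        raise ValueError("stream_mbps must be positive")
    return math.floor(upload_mbps / stream_mbps)


# 150 viewers at 1 Mb/s need 150 Mb/s, more than a 100 Mb/s uplink provides.
print(required_upload_mbps(150, 1.0))
print(max_viewers(100, 1.0), max_viewers(1000, 1.0))
```

With these inputs, a droplet that only gets 100 Mb/s cannot carry 150 viewers at 1 Mb/s, which is one way to see why several smaller instances reduce the risk.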

Last weekend I had an event with ~ 32000 unique attendees based on OME master from 13th of July. My fork can be found here: https://github.com/basisbit/OvenMediaEngine - it only has config/deployment changes, all code changes are already merged into official OME master. My origin config: https://github.com/basisbit/OvenMediaEngine/tree/master/local-data/origin_conf My edge config: https://github.com/basisbit/OvenMediaEngine/tree/master/local-data/edge_conf

I only use Digital ocean in Singapore, because for all other locations their traffic prices and performance are not competitive enough for me. For all other locations, I use virtual servers with 4 CPU threads of AMD Ryzen/Epyc 3rd gen CPUs, 2GB RAM and servers which reliably do at least 400Mb/s of upload and with good peerings to all major ISPs in their geographic region. If my use case needs video transcoding, then I order hourly billed dedicated bare metal servers with AMD Ryzen 3700X or better in the region of the streaming customers.

basisbit commented 3 years ago

I'll see if there's a way to optimize for a limited environment like DO's 1vCPU.

Imho there is no need for that, except for maybe improving documentation and logging warnings if OME was started with an output profile that has video transcoding enabled and the machine has less than 4 CPU threads.

WebRTC streaming requires roughly twice as much available CPU resources than HLS streaming and roughly one third more network upload speed, but I guess that is to be expected.
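The startup warning suggested above could look roughly like the following Python sketch. This is not OME code: `transcodes_video` and `startup_warning` are hypothetical helpers, the check only looks for `<Video>` blocks without `<Bypass>true</Bypass>`, and it does not consider whether hardware acceleration is active.

```python
import os
import xml.etree.ElementTree as ET
from typing import Optional

MIN_THREADS = 4  # threshold taken from the suggestion above


def transcodes_video(config_xml: str) -> bool:
    """True if any <Video> element lacks <Bypass>true</Bypass>."""
    root = ET.fromstring(config_xml)
    for video in root.iter("Video"):
        bypass = (video.findtext("Bypass") or "").strip().lower()
        if bypass != "true":
            return True
    return False


def startup_warning(config_xml: str, cpu_threads: Optional[int] = None) -> Optional[str]:
    """Return a warning string, or None when the setup looks fine."""
    threads = cpu_threads if cpu_threads is not None else (os.cpu_count() or 1)
    if transcodes_video(config_xml) and threads < MIN_THREADS:
        return f"video transcoding is enabled but only {threads} CPU thread(s) are available"
    return None


# Two tiny illustrative fragments, not complete OME configurations.
BYPASS = "<Encodes><Video><Bypass>true</Bypass></Video></Encodes>"
TRANSCODE = "<Encodes><Video><Codec>h264</Codec><Height>1080</Height></Video></Encodes>"

print(startup_warning(BYPASS, cpu_threads=1))
print(startup_warning(TRANSCODE, cpu_threads=1))
```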

getroot commented 3 years ago

@basisbit Thank you for sharing your valuable experience and knowledge. Nevertheless, I will test various hypotheses that can be optimized as much as possible. (Of course, this will likely fail, but it will be worthwhile.)

MoZyo commented 3 years ago

I get these CPU usage spikes too, monitoring with htop (as shown above). It's best to use htop with -d 10 arg and press F2 for setup and get CPU Avg. This way I got more accurate information.
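For anyone who wants an averaged figure without htop, here is a small Python sketch of the same idea: compare two samples of the aggregate `cpu` line from `/proc/stat` on Linux. The helper names are hypothetical; the field order (user, nice, system, idle, iowait, irq, softirq, steal) is the one documented for `/proc/stat`.

```python
def cpu_busy_percent(before: str, after: str) -> float:
    """Average busy % between two aggregate 'cpu ...' lines from /proc/stat."""

    def split(line: str):
        # Use the first 8 counters; guest time is already included in user time.
        fields = [int(x) for x in line.split()[1:9]]
        idle = fields[3] + (fields[4] if len(fields) > 4 else 0)  # idle + iowait
        return sum(fields), idle

    total0, idle0 = split(before)
    total1, idle1 = split(after)
    total = total1 - total0
    if total <= 0:
        return 0.0
    return 100.0 * (total - (idle1 - idle0)) / total


def read_cpu_line(path: str = "/proc/stat") -> str:
    """Read the aggregate CPU line (Linux only)."""
    with open(path) as f:
        return f.readline()


# Made-up sample values, only to show the calculation.
print(cpu_busy_percent("cpu 100 0 100 800 0 0 0 0", "cpu 200 0 150 1650 0 0 0 0"))
```

Calling `read_cpu_line()` twice with a pause in between and passing both lines to `cpu_busy_percent` gives the average over that window, which smooths out short spikes.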

image Streaming to 25 viewers with 1 input and 5 encodes.

fcqpl commented 3 years ago

@getroot @basisbit High CPU usage is a huge problem, especially in embedded systems where higher CPU usage causes unnecessary load on the system and higher power consumption. We are currently using a nimble streamer to display realtime cameras - it converts stream from RTSP to SLDP for the browser that runs on the system. The CPU load with 3 cameras is 5%. I changed nimble to OME and the load was much greater.

Bansikov commented 3 years ago

Hello.

Thanks for your work. It's great. I don't want to create a new issue, but I have a question about CPU usage too. I currently use mediasoup for WebRTC; 1 incoming VGA (640x480) connection needs 1% CPU and 1 outgoing connection needs <1% CPU (usage of 1 core of an E5-1620 v2 @ 3.70GHz).

I want to switch to OvenMediaEngine, so I removed all Providers and Publishers from the default configuration and left just WebRTC with bypass encodes. That should work without re-encoding, shouldn't it? It is configured with WebRTC over TCP. But CPU usage is 7-15% for 1 incoming stream and 1-2% for 1 outgoing stream. Why is it so high, and can I optimize it further?

The logs are attached as a file, captured after connecting a stream via https://demo.ovenplayer.com/demo_input.html. OME was installed manually on Debian 11.

logs.txt

@getroot What about this? Is it the same issue? And why, when I stream WebRTC at 640x480 with a bitrate limit of 300 kbps (using the WebRTC demo input):

  1. It uses more than 500 kbps of traffic.
  2. !!! periodically, every second !!! the video has very bad quality for a moment, an obvious problem with frame splicing / joining or similar (with Full HD it is hardly noticeable). What does OME do every second? Unfortunately, in this form, WebRTC Input is not suitable for use. I don't know how others use it here...

fcqpl commented 3 years ago

Hi @2002demon We have some video frame loss and audio issues. Sometimes every 30/60 seconds, sometimes not. I also see these issues when starting/stopping a stream on another path (https://youtu.be/00d3dQ7tIqo - audio here between 00:02-00:03). There is fiber between my PC and the OME server with no packet loss, using WebRTC over TCP.

But I think that's a topic for another issue.

basisbit commented 3 years ago

@all please stop using this issue to mention all of the things which remotely have anything to do with performance. This issue is only about what is mentioned in the issue start post.

@fcqpl the performance issue in your start post is most likely caused by hardware acceleration that is not properly configured, so that CPU-based transcoding is used, and by CPU-heavy transcoding where it is not necessary. The cheapest Digital Ocean 1 CPU instance, with its unreliable available CPU performance, is not usable for comparison tests; see the explanation in the previous posts here. Regarding any temporary CPU spikes when certain events happen, please open new issues instead of reusing this configuration-related support issue.

@2002demon the additional network traffic for WebRTC is to be expected. It is not anything OvenMediaEngine specific, but is normal because WebRTC uses UDP, where packet loss has to be handled by the application layer, and thus ULP FEC, RTX and similar standards are used to make the video stream more resilient against loss of single packets. Also, is 300 kbps your video bitrate or audio + video (and what about the configured Opus transcoded audio bitrate)? In general, video encoding is never truly CBR; it always varies a bit, and the video encoder tries to hit your target video bitrate on average.
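As a rough illustration of per-packet header overhead alone (before any FEC or retransmissions), here is a minimal Python sketch. The header sizes are the standard minimums for IPv4 (20 bytes), UDP (8 bytes) and RTP (12 bytes); the payload sizes and the 64 kbps audio bitrate are assumed example values, not figures from this thread.

```python
IPV4_UDP_RTP_HEADER = 20 + 8 + 12  # bytes: IPv4 + UDP + minimal RTP header


def wire_kbps(media_kbps: float, payload_bytes: int, extra_header_bytes: int = 0) -> float:
    """Bitrate on the wire when each packet carries payload_bytes of media."""
    if payload_bytes <= 0:
        raise ValueError("payload_bytes must be positive")
    per_packet = IPV4_UDP_RTP_HEADER + extra_header_bytes
    return media_kbps * (payload_bytes + per_packet) / payload_bytes


# Assumed example: 300 kbps video in 1000-byte payloads,
# plus 64 kbps audio in 160-byte payloads (20 ms frames).
video = wire_kbps(300, 1000)
audio = wire_kbps(64, 160)
print(video, audio, video + audio)
```

Small audio packets pay a proportionally larger header cost, so the total on the wire is noticeably above the configured video limit even before loss protection is added.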

!!! periodically, every second !!! the video has very bad quality for a moment

That sounds like a problem of your video encoder, check your OBS / ffmpeg settings.

Bansikov commented 3 years ago

@basisbit

  1. About the periodic loss of quality roughly every second: I stream from the browser with https://demo.ovenplayer.com/demo_input.html, and the OME configuration has only WebRTC and only bypass (no encoding). To reproduce the issue, just select 640x480 and the 300 kbps limit on the demo input. I tested on my own server and also with wss://spaceome.airensoft.com:3333; the result is the same.
  2. About CPU usage: as I wrote above, with the same quality and limits, mediasoup needs just 1% CPU for 1 incoming WebRTC stream, while OME needs 10% on average.
  3. The traffic is not so important, but it is strange.

fcqpl commented 3 years ago

@basisbit My origin server (screenshot in the first comment on this issue - https://github.com/AirenSoft/OvenMediaEngine/issues/468#issue-969756789) has 2x NVIDIA GPUs for hardware transcoding, and OME is using them. nvidia-smi dmon shows usage by OME. CPU usage is too high even with Opus encoding disabled.

root@oven:~# docker logs streaming_ovenmediaengine_1 | grep NVIDIA
[08-15 15:08:22.971] I [OvenMediaEngine:1] Transcoder | transcoder_gpu.cpp:47   | Supported NVIDIA CUDA hardware accelerator

My edge server (https://github.com/AirenSoft/OvenMediaEngine/issues/468#issuecomment-898046720) is not using transcoding at all; it only moves packets from origin to client. CPU usage there is too high.

Here is the next edge server - https://github.com/AirenSoft/OvenMediaEngine/issues/468#issuecomment-899931219 - also without any transcoding: 86% CPU just forwarding packets from the origin.

My test origin server on DO (https://github.com/AirenSoft/OvenMediaEngine/issues/468#issuecomment-900104205) was created to check whether the problem is my hardware, but it is not.

There is something wrong with OME's CPU usage... compared to other solutions.

basisbit commented 3 years ago

Please check whether you can reproduce the performance issues with current master instead of a release, in case you are not already using master. Also, please add your full origin and edge configuration files. (Just replace real domain names and passwords / API keys with sample text.)

getroot commented 3 years ago

@basisbit

  1. About the periodic loss of quality roughly every second: I stream from the browser with https://demo.ovenplayer.com/demo_input.html, and the OME configuration has only WebRTC and only bypass (no encoding). To reproduce the issue, just select 640x480 and the 300 kbps limit on the demo input. I tested on my own server and also with wss://spaceome.airensoft.com:3333; the result is the same.
  2. About CPU usage: as I wrote above, with the same quality and limits, mediasoup needs just 1% CPU for 1 incoming WebRTC stream, while OME needs 10% on average.
  3. The traffic is not so important, but it is strange.

@2002demon Please create a new issue about the quality of WebRTC Input.

getroot commented 3 years ago

@fcqpl Today, I provided a broadcast platform for an online conference using OME (the latest master branch), and it has been successfully completed.

I used an AWS instance with 4 core CPU and 32M memory. And for 8 hours there were 2 RTMP inputs and an average of 250 viewers. The input video stream was 2Mbps/1080p, transcoding was bypassed, and the audio was OPUS encoded.

CPU used 80-110% on average, and memory used about 5%.

I haven't observed any excessive CPU usage like in your environment yet. I'll try some more testing. One possibility is that if OME fails to send a packet, it keeps retrying in a separate thread. If this happens too often, I think it can increase CPU usage. Have you tried tuning the network settings in the Linux kernel as shown in the URL below? https://fasterdata.es.net/host-tuning/linux/

fcqpl commented 3 years ago

My main origin server (https://github.com/AirenSoft/OvenMediaEngine/issues/468#issue-969756789):
16 cores from E5-2620, 8GB RAM
Nvidia GPU for encoding
Docker environment, OME compiled with https://github.com/AirenSoft/OvenMediaEngine/blob/master/Dockerfile.nv

OME is using ~800-950% CPU (~58% of all cores). Video encoding is enabled (one 1080p output, "Supported NVIDIA CUDA hardware accelerator"). Audio encoding is enabled (one OPUS 128k output). That is too much for 5 incoming RTMP streams! Disabling WebRTC (and with it Opus encoding) and enabling HLS makes no noticeable difference to CPU usage (~700%).
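Since top and htop report per-process CPU with 100% meaning one full core, the figures above can be converted to a share of the whole machine with a one-line calculation. A minimal Python sketch (`machine_share` is a hypothetical name):

```python
def machine_share(top_percent: float, cores: int) -> float:
    """Convert a top-style per-process % (100 = one core) to % of all cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return top_percent / cores


# The 16-core VM from this comment:
print(machine_share(800, 16))  # 50.0
print(machine_share(950, 16))  # 59.375
```

The same conversion puts the 80-110% reported for the 4-core AWS instance earlier in the thread at roughly 20-27.5% of that machine.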

Here are the config files and logs: OME_origin_main_logs.md OME_origin_main_config.md

htop when there is only one WebRTC connected client: MobaXterm_Chsm0qREZc

htop with disabled WebRTC, enabled HLS and disabled opus transcoding: MobaXterm_vz2IQ4BS99

<Encodes>   
    <Audio>
        <Bypass>true</Bypass>
    </Audio>
    <Video>
        <Codec>h264</Codec>
        <Width>1920</Width>
        <Height>1080</Height>
        <Bitrate>2500000</Bitrate>
        <Framerate>30</Framerate>
    </Video>
</Encodes>

OME is using my GPU:

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     47654      C   ...ngine/bin/OvenMediaEngine     1308MiB |
+-----------------------------------------------------------------------------+

# gpu   pwr gtemp mtemp    sm   mem   enc   dec  mclk  pclk
# Idx     W     C     C     %     %     %     %   MHz   MHz
    0    81    39     -    16     1    13    14  5005  1898

alnux commented 3 years ago

Hi, maybe my problem is the same, because after an hour of usage the client side (OvenPlayer) becomes slow and the video lags or freezes. My server runs Ubuntu 18 with 6 cores, 16GB RAM and 400M bandwidth. What could it be? I'm streaming with vMix, but this also happens with OBS and Streamlabs.

donio4 commented 3 years ago

I also migrated from Nimble Streamer and I have high CPU usage. Nimble Streamer has much lower CPU usage with the same clients connected.

Keukhan commented 3 years ago

@fcqpl

Thanks for reporting.

I want to see which threads are using the most CPU.

Can you capture the per-thread CPU usage with the command below and send it?

top -H -p [PID]

It will be very helpful in analyzing the cause.
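If a text capture is easier to share than a screenshot, the same per-thread information can be read from `/proc/<pid>/task/<tid>/stat` on Linux. A minimal Python sketch with hypothetical helper names; the sample line and its thread name are made up for illustration.

```python
import glob
from typing import Dict, Tuple


def parse_thread_stat(line: str) -> Tuple[str, int]:
    """Return (thread name, utime + stime in clock ticks) from a stat line."""
    # The name (comm) is wrapped in parentheses and may contain spaces.
    name = line[line.index("(") + 1 : line.rindex(")")]
    rest = line[line.rindex(")") + 2 :].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    utime, stime = int(rest[11]), int(rest[12])
    return name, utime + stime


def read_threads(pid: int) -> Dict[str, Tuple[str, int]]:
    """Map thread id -> (name, CPU ticks) for one process (Linux only)."""
    threads = {}
    for path in glob.glob(f"/proc/{pid}/task/*/stat"):
        with open(path) as f:
            threads[path.split("/")[4]] = parse_thread_stat(f.read())
    return threads


SAMPLE = "47654 (demo thread) S 1 47654 47654 0 -1 4194560 10 0 0 0 1200 300 0 0 20 0 1 0 100 0 0"
print(parse_thread_stat(SAMPLE))
```

Taking two `read_threads` snapshots a few seconds apart and sorting by the difference in ticks shows which threads are busiest over that interval.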

fcqpl commented 3 years ago

This is from OME with only HLS output enabled. image

fcqpl commented 3 years ago

Ouch... there is a software frame scaler? nvenc has hardware video scaling: https://docs.nvidia.com/video-technologies/video-codec-sdk/ffmpeg-with-nvidia-gpu/#hwaccel-transcode-with-scaling

Keukhan commented 3 years ago

@fcqpl

Thank you for your quick reply. :)

As expected, the scaler thread uses a lot of CPU. Unfortunately, we are currently using the S/W scaler. CPU utilization is high because of the low CPU clock. The hardware codec support is still under development. Adding a H/W scaler is next on my plan.

Please wait a few days.

I'll notify you when it's updated.

0xARROWK commented 3 years ago

Hello! I have the same issue with high CPU usage. I have followed all the suggestions mentioned in this thread and tried to reduce the configuration as much as possible. My server has 4 vCores and OME uses ~230% CPU.

I have only one input stream and one output stream; the stream is re-encoded twice with this config:

    <OutputProfile>
        <Name>HD</Name>
        <OutputStreamName>${OriginStreamName}_hd</OutputStreamName>
        <Encodes>
            <Video>
                <Codec>h264</Codec>
                <Width>1280</Width>
                <Height>720</Height>
                <Bitrate>2000000</Bitrate>
                <Framerate>30.0</Framerate>
            </Video>
            <Audio>
                <Codec>opus</Codec>
                <Bitrate>128000</Bitrate>
                <Samplerate>48000</Samplerate>
                <Channel>2</Channel>
            </Audio>
        </Encodes>
    </OutputProfile>
    <OutputProfile>
        <Name>SD</Name>
        <OutputStreamName>${OriginStreamName}_sd</OutputStreamName>
        <Encodes>
            <Video>
                <Codec>h264</Codec>
                <Width>720</Width>
                <Height>480</Height>
                <Bitrate>1500000</Bitrate>
                <Framerate>24.0</Framerate>
            </Video>
            <Audio>
                <Codec>opus</Codec>
                <Bitrate>128000</Bitrate>
                <Samplerate>48000</Samplerate>
                <Channel>2</Channel>
            </Audio>
        </Encodes>
    </OutputProfile>
</OutputProfiles>

But in my case, it's not the rescaler that takes a lot of CPU:

image

If you need more information for debugging, or if you have any questions I can help with, don't hesitate to ask.

fcqpl commented 3 years ago

@0xARK Are you using hardware transcoding?

0xARROWK commented 3 years ago

I have uncommented the hardware transcoding option to try it, but I don't know whether my VPS has a GPU or whether OME is using it. I have edited my post; you can also see the monitoring of the OME process there.

basisbit commented 3 years ago

@0xARK if you didn't pay extra for a GPU accelerated virtual machine (they are much more expensive), then you don't have graphic acceleration. That CPU usage is to be expected when CPU based video transcoding is enabled.

0xARROWK commented 3 years ago

Okay, in this case no, I don't have gpu acceleration

Keukhan commented 3 years ago

@0xARK

In which cloud are you creating and using an instance?

In case of using SW decoder/encoder, performance difference occurs depending on the presence or absence of specific instructions of the CPU.

On July 21st, the x264 library was changed to support asm. If the source of the OvenMediaEngine you are using is an outdated version, CPU usage may be high for decoders and encoders.

https://github.com/AirenSoft/OvenMediaEngine/commit/461ddd481a2b0940df5f71aa1c091473bf400c9e

Please test again with the latest version.

Thanks.

0xARROWK commented 3 years ago

Hello @Keukhan,

I have installed OME on the first VPS offer on this page: contabo. It runs on Ubuntu 20.03.

As for the OME version, I think I'm using the latest release (v0.12.1).

But as @basisbit said, maybe this is normal for transcoding and serving only one stream. I just thought it wasn't.

Cordially,

Matéo

Keukhan commented 3 years ago

@0xARK

Unfortunately, the s/w based x264 decoding/encoding performance optimization option is not applied in v0.12.1.

Please check the performance with the latest version of the master branch.

A new version will be released in the near future once important tasks in progress are completed.

Thanks

getroot commented 3 years ago

I've been doing some research on high CPU usage since last week, and I'm sharing my interim results.

  1. Excessive retries when socket send fails

OME has logic to retry when a socket send fails. This logic ran too often and used a lot of CPU. I fixed this part and committed it to the latest master. (We also plan to improve the algorithm in this part a bit more.)

  2. On low core VMs, the default setting <SessionLoadBalancingThreadCount>8</SessionLoadBalancingThreadCount> is CPU intensive.

Lowering <SessionLoadBalancingThreadCount> to 1 can lower CPU usage.
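The change described above can also be scripted. This is a minimal sketch, assuming the configuration is plain XML like the snippets in this thread; `set_session_threads` is a hypothetical helper, and `xml.etree.ElementTree` drops XML comments when it re-serializes, so review the result before replacing a real configuration file.

```python
import xml.etree.ElementTree as ET


def set_session_threads(config_xml: str, count: int) -> str:
    """Return the config with every SessionLoadBalancingThreadCount set to count."""
    if count < 1:
        raise ValueError("count must be at least 1")
    root = ET.fromstring(config_xml)
    for node in root.iter("SessionLoadBalancingThreadCount"):
        node.text = str(count)
    return ET.tostring(root, encoding="unicode")


# Illustrative fragment only, not a complete OME configuration.
SAMPLE = (
    "<Publishers>"
    "<SessionLoadBalancingThreadCount>8</SessionLoadBalancingThreadCount>"
    "<WebRTC />"
    "</Publishers>"
)
print(set_session_threads(SAMPLE, 1))
```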

  3. When there are multiple ICE candidates, multiple threads are created for distributed processing, which uses more CPU. Therefore, you can lower CPU usage by setting only one ICE candidate as follows.

<IceCandidate>${env:OME_ICE_CANDIDATES:*:10006/udp}</IceCandidate>

  4. CPU usage spikes occur only on DO VMs. (This is not observed on my test server.)

I tested 1 RTMP input with Opus encoding and 10 WebRTC playback sessions and saw 8% CPU usage. But sometimes the CPU momentarily jumped to 20% or more. I will look into this further.

  5. My environment

I used DO's 1-core, 1 GB memory VM and tested with airensoft/ovenmediaengine:dev. I tested 1 RTMP input and 10 WebRTC/TCP outputs and saw CPU usage between 8% and 10%. The settings are as follows:

<?xml version="1.0" encoding="UTF-8" ?>

<Server version="8">
        <Name>OvenMediaEngine</Name>
        <Type>origin</Type>
        <IP>*</IP>
        <StunServer>stun.l.google.com:19302</StunServer>

        <Bind>
                <Providers>
                        <RTMP>
                                <Port>${env:OME_RTMP_PROV_PORT:1935}</Port>
                        </RTMP>
                </Providers>

                <Publishers>
                        <WebRTC>
                                <Signalling>
                                        <Port>${env:OME_SIGNALLING_PORT:3333}</Port>
                                </Signalling>
                                <IceCandidates>
                                        <TcpRelay>${env:OME_TCP_RELAY_ADDRESS:*:3478}</TcpRelay>
                                        <IceCandidate>${env:OME_ICE_CANDIDATES:*:10006/udp}</IceCandidate>
                                </IceCandidates>
                        </WebRTC>
                </Publishers>
        </Bind>

        <VirtualHosts>
                <VirtualHost include="VHost*.xml" />
                <VirtualHost>
                        <Name>default</Name>
                        <Host>
                                <Names>
                                        <Name>*</Name>
                                </Names>
                        </Host>
                        <Applications>
                                <Application>
                                        <Name>app</Name>
                                        <Type>live</Type>
                                        <OutputProfiles>
                                                <OutputProfile>
                                                        <Name>bypass_stream</Name>
                                                        <OutputStreamName>${OriginStreamName}</OutputStreamName>
                                                        <Encodes>
                                                                <Audio>
                                                                        <Bypass>true</Bypass>
                                                                </Audio>
                                                                <Video>
                                                                        <Bypass>true</Bypass>
                                                                </Video>
                                                                <Audio>
                                                                        <Codec>opus</Codec>
                                                                        <Bitrate>128000</Bitrate>
                                                                        <Samplerate>48000</Samplerate>
                                                                        <Channel>2</Channel>
                                                                </Audio>
                                                        </Encodes>
                                                </OutputProfile>
                                        </OutputProfiles>
                                        <Providers>
                                                <RTMP />
                                        </Providers>
                                        <Publishers>
                                                <StreamLoadBalancingThreadCount>1</StreamLoadBalancingThreadCount>
                                                <SessionLoadBalancingThreadCount>1</SessionLoadBalancingThreadCount>
                                                <WebRTC>
                                                        <Timeout>30000</Timeout>
                                                        <Rtx>false</Rtx>
                                                        <Ulpfec>false</Ulpfec>
                                                </WebRTC>
                                        </Publishers>
                                </Application>
                        </Applications>
                </VirtualHost>
        </VirtualHosts>
</Server>
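
For anyone who wants to apply the low-core settings from items 2 and 3 to an existing Server.xml, here is a minimal Python sketch. It is not part of OME. The function name `tune_for_low_core` is hypothetical, and the sketch assumes the element layout shown in the configuration above. It sets both load-balancing thread counts under each application's <Publishers> to 1 (mirroring that configuration) and reports how many <IceCandidate> entries are bound, so you can check that only one remains.

```python
import sys
import xml.etree.ElementTree as ET

# Element names taken from the configuration shown above.
THREAD_TAGS = ("StreamLoadBalancingThreadCount", "SessionLoadBalancingThreadCount")


def tune_for_low_core(xml_text, threads=1):
    """Return (patched_xml, ice_candidate_count) for a Server.xml string.

    Hypothetical helper: assumes the layout of the config in this thread.
    """
    root = ET.fromstring(xml_text)

    # Only application-level <Publishers> carry the thread-count settings;
    # the <Publishers> under <Bind> is left untouched.
    for publishers in root.findall(".//Application/Publishers"):
        for index, tag in enumerate(THREAD_TAGS):
            elem = publishers.find(tag)
            if elem is None:
                # Add the element if it is missing, so the default is overridden.
                elem = ET.Element(tag)
                publishers.insert(index, elem)
            elem.text = str(threads)

    ice_count = len(
        root.findall(".//Bind/Publishers/WebRTC/IceCandidates/IceCandidate")
    )
    return ET.tostring(root, encoding="unicode"), ice_count


if __name__ == "__main__" and len(sys.argv) == 2:
    # Usage (the path is whatever your installation uses): python tune.py Server.xml
    with open(sys.argv[1], encoding="utf-8") as fh:
        patched, ice = tune_for_low_core(fh.read())
    print(patched)
    print(f"ICE candidates configured: {ice}", file=sys.stderr)
```

Note that ElementTree drops XML comments and the XML declaration when re-serializing, so review the output before replacing your real config.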

llspalex commented 3 years ago

@getroot Thank you for the information. What bitrate was your RTMP stream and did you use DO's 1 core VM with 1GB or 2GB memory?

getroot commented 3 years ago

@llspalex I used DO's 1-core VM with 1 GB memory.