AirenSoft / OvenMediaEngine

OvenMediaEngine (OME) is a Sub-Second Latency Live Streaming Server with Large-Scale and High-Definition. #WebRTC #LLHLS
https://airensoft.com/ome.html
GNU Affero General Public License v3.0

exceeded the threshold #819

Closed dva-re closed 2 years ago

dva-re commented 2 years ago

Please help. Any possible reasons why?

Ingest takes place over the local network (SRT). Several edge servers on the public network then pull the stream and serve it to OvenPlayer.

ome_1  | [2022-07-14 10:55:47.835] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x55807e03a0a0] #default#app - Mediarouter inbound indicator (5/8) size has exceeded the threshold: queue: 36421, threshold: 500, peak: 36421
ome_1  | [2022-07-14 10:55:48.695] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x7f4fcc014b68] #default#app/main1_stream-MR-Inbound size has exceeded the threshold: queue: 36503, threshold: 100, peak: 36503
ome_1  | [2022-07-14 10:55:52.835] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x55807e03a0a0] #default#app - Mediarouter inbound indicator (5/8) size has exceeded the threshold: queue: 36904, threshold: 500, peak: 36904
ome_1  | [2022-07-14 10:55:53.715] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x7f4fcc014b68] #default#app/main1_stream-MR-Inbound size has exceeded the threshold: queue: 36991, threshold: 100, peak: 36991
ome_1  | [2022-07-14 10:55:57.855] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x55807e03a0a0] #default#app - Mediarouter inbound indicator (5/8) size has exceeded the threshold: queue: 37392, threshold: 500, peak: 37392
ome_1  | [2022-07-14 10:55:58.715] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x7f4fcc014b68] #default#app/main1_stream-MR-Inbound size has exceeded the threshold: queue: 37476, threshold: 100, peak: 37476
ome_1  | [2022-07-14 10:56:02.855] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x55807e03a0a0] #default#app - Mediarouter inbound indicator (5/8) size has exceeded the threshold: queue: 37876, threshold: 500, peak: 37876
ome_1  | [2022-07-14 10:56:03.735] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x7f4fcc014b68] #default#app/main1_stream-MR-Inbound size has exceeded the threshold: queue: 37960, threshold: 100, peak: 37960
ome_1  | [2022-07-14 10:56:07.875] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x55807e03a0a0] #default#app - Mediarouter inbound indicator (5/8) size has exceeded the threshold: queue: 38361, threshold: 500, peak: 38361
ome_1  | [2022-07-14 10:56:08.735] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x7f4fcc014b68] #default#app/main1_stream-MR-Inbound size has exceeded the threshold: queue: 38446, threshold: 100, peak: 38446
ome_1  | [2022-07-14 10:56:12.875] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x55807e03a0a0] #default#app - Mediarouter inbound indicator (5/8) size has exceeded the threshold: queue: 38847, threshold: 500, peak: 38847
ome_1  | [2022-07-14 10:56:13.735] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x7f4fcc014b68] #default#app/main1_stream-MR-Inbound size has exceeded the threshold: queue: 38931, threshold: 100, peak: 38931
ome_1  | [2022-07-14 10:56:17.875] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x55807e03a0a0] #default#app - Mediarouter inbound indicator (5/8) size has exceeded the threshold: queue: 39331, threshold: 500, peak: 39331
ome_1  | [2022-07-14 10:56:18.755] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x7f4fcc014b68] #default#app/main1_stream-MR-Inbound size has exceeded the threshold: queue: 39415, threshold: 100, peak: 39415
ome_1  | [2022-07-14 10:56:22.875] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x55807e03a0a0] #default#app - Mediarouter inbound indicator (5/8) size has exceeded the threshold: queue: 39816, threshold: 500, peak: 39816
ome_1  | [2022-07-14 10:56:23.755] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x7f4fcc014b68] #default#app/main1_stream-MR-Inbound size has exceeded the threshold: queue: 39902, threshold: 100, peak: 39902
ome_1  | [2022-07-14 10:56:27.895] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x55807e03a0a0] #default#app - Mediarouter inbound indicator (5/8) size has exceeded the threshold: queue: 40302, threshold: 500, peak: 40302
ome_1  | [2022-07-14 10:56:28.775] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x7f4fcc014b68] #default#app/main1_stream-MR-Inbound size has exceeded the threshold: queue: 40387, threshold: 100, peak: 40387
ome_1  | [2022-07-14 10:56:32.895] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x55807e03a0a0] #default#app - Mediarouter inbound indicator (5/8) size has exceeded the threshold: queue: 40786, threshold: 500, peak: 40786
ome_1  | [2022-07-14 10:56:33.795] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x7f4fcc014b68] #default#app/main1_stream-MR-Inbound size has exceeded the threshold: queue: 40872, threshold: 100, peak: 40872
ome_1  | [2022-07-14 10:56:37.915] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x55807e03a0a0] #default#app - Mediarouter inbound indicator (5/8) size has exceeded the threshold: queue: 41273, threshold: 500, peak: 41273
ome_1  | [2022-07-14 10:56:38.795] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x7f4fcc014b68] #default#app/main1_stream-MR-Inbound size has exceeded the threshold: queue: 41358, threshold: 100, peak: 41358
ome_1  | [2022-07-14 10:56:42.935] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x55807e03a0a0] #default#app - Mediarouter inbound indicator (5/8) size has exceeded the threshold: queue: 41759, threshold: 500, peak: 41759
ome_1  | [2022-07-14 10:56:43.815] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x7f4fcc014b68] #default#app/main1_stream-MR-Inbound size has exceeded the threshold: queue: 41844, threshold: 100, peak: 41844
ome_1  | [2022-07-14 10:56:47.935] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x55807e03a0a0] #default#app - Mediarouter inbound indicator (5/8) size has exceeded the threshold: queue: 42243, threshold: 500, peak: 42243
ome_1  | [2022-07-14 10:56:48.835] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x7f4fcc014b68] #default#app/main1_stream-MR-Inbound size has exceeded the threshold: queue: 42329, threshold: 100, peak: 42329

Earlier today:

ome_1  | [2022-07-14 10:36:48.943] W [AW-OVT0:58] ov.Queue | queue.h:268  | [0x7f4e6000f028] OVTPublisher Application/#default#app/second_stream StreamWorker Queue size has exceeded the threshold: queue: 157818, threshold: 500, peak: 157818
ome_1  | [2022-07-14 10:36:53.944] W [AW-OVT0:58] ov.Queue | queue.h:268  | [0x7f4e6000f028] OVTPublisher Application/#default#app/second_stream StreamWorker Queue size has exceeded the threshold: queue: 158845, threshold: 500, peak: 158845
ome_1  | [2022-07-14 10:36:58.954] W [AW-OVT0:58] ov.Queue | queue.h:268  | [0x7f4e6000f028] OVTPublisher Application/#default#app/second_stream StreamWorker Queue size has exceeded the threshold: queue: 159872, threshold: 500, peak: 159872
ome_1  | [2022-07-14 10:37:03.958] W [AW-OVT0:58] ov.Queue | queue.h:268  | [0x7f4e6000f028] OVTPublisher Application/#default#app/second_stream StreamWorker Queue size has exceeded the threshold: queue: 160900, threshold: 500, peak: 160900
ome_1  | [2022-07-14 10:37:08.962] W [AW-OVT0:58] ov.Queue | queue.h:268  | [0x7f4e6000f028] OVTPublisher Application/#default#app/second_stream StreamWorker Queue size has exceeded the threshold: queue: 161926, threshold: 500, peak: 161926
ome_1  | [2022-07-14 10:37:13.976] W [AW-OVT0:58] ov.Queue | queue.h:268  | [0x7f4e6000f028] OVTPublisher Application/#default#app/second_stream StreamWorker Queue size has exceeded the threshold: queue: 162948, threshold: 500, peak: 162948
ome_1  | [2022-07-14 10:37:18.982] W [AW-OVT0:58] ov.Queue | queue.h:268  | [0x7f4e6000f028] OVTPublisher Application/#default#app/second_stream StreamWorker Queue size has exceeded the threshold: queue: 163958, threshold: 500, peak: 163958
ome_1  | [2022-07-14 10:37:24.016] W [AW-OVT0:58] ov.Queue | queue.h:268  | [0x7f4e6000f028] OVTPublisher Application/#default#app/second_stream StreamWorker Queue size has exceeded the threshold: queue: 164986, threshold: 500, peak: 164986
ome_1  | [2022-07-14 10:37:29.031] W [AW-OVT0:58] ov.Queue | queue.h:268  | [0x7f4e6000f028] OVTPublisher Application/#default#app/second_stream StreamWorker Queue size has exceeded the threshold: queue: 166022, threshold: 500, peak: 166022
ome_1  | [2022-07-14 10:37:34.037] W [AW-OVT0:58] ov.Queue | queue.h:268  | [0x7f4e6000f028] OVTPublisher Application/#default#app/second_stream StreamWorker Queue size has exceeded the threshold: queue: 167051, threshold: 500, peak: 167051
ome_1  | [2022-07-14 10:37:39.039] W [AW-OVT0:58] ov.Queue | queue.h:268  | [0x7f4e6000f028] OVTPublisher Application/#default#app/second_stream StreamWorker Queue size has exceeded the threshold: queue: 168081, threshold: 500, peak: 168081
ome_1  | [2022-07-14 10:37:44.041] W [AW-OVT0:58] ov.Queue | queue.h:268  | [0x7f4e6000f028] OVTPublisher Application/#default#app/second_stream StreamWorker Queue size has exceeded the threshold: queue: 169109, threshold: 500, peak: 169109
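
To tell a temporary backlog from a queue that is not being drained at all, it can help to measure how fast the reported size grows. The following is a minimal sketch, not part of OvenMediaEngine: the function name `queue_growth` and the regular expression are assumptions based only on the warning format shown above.

```python
import re
from datetime import datetime

# Pulls the timestamp, queue address and current size out of an ov.Queue warning.
WARNING_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\].*?"
    r"\[(?P<addr>0x[0-9a-f]+)\].*?queue: (?P<size>\d+)"
)


def queue_growth(lines):
    """Return {queue address: items gained per second} between first and last sample."""
    samples = {}
    for line in lines:
        m = WARNING_RE.search(line)
        if not m:
            continue  # not a queue warning
        ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S.%f")
        samples.setdefault(m["addr"], []).append((ts, int(m["size"])))

    rates = {}
    for addr, points in samples.items():
        if len(points) < 2:
            continue  # need at least two samples to compute a rate
        (t0, s0), (t1, s1) = points[0], points[-1]
        elapsed = (t1 - t0).total_seconds()
        if elapsed > 0:
            rates[addr] = (s1 - s0) / elapsed
    return rates


# Two warnings for the same queue, copied from the log above.
sample_lines = [
    "ome_1  | [2022-07-14 10:55:47.835] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x55807e03a0a0] #default#app - Mediarouter inbound indicator (5/8) size has exceeded the threshold: queue: 36421, threshold: 500, peak: 36421",
    "ome_1  | [2022-07-14 10:55:52.835] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x55807e03a0a0] #default#app - Mediarouter inbound indicator (5/8) size has exceeded the threshold: queue: 36904, threshold: 500, peak: 36904",
]
rates = queue_growth(sample_lines)
print(rates)
```

For these two lines the queue gains 483 items in 5 seconds, about 96.6 items per second. A steady rate like this, with `peak` always equal to `queue`, suggests the consumer side may not be draining the queue at all rather than merely falling behind.
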
getroot commented 2 years ago

This may be due to poor performance. Check the CPU usage for each thread. https://airensoft.gitbook.io/ovenmediaengine/performance-tuning#performance-tuning
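
On Linux, `top -H -p <pid>` shows per-thread CPU usage interactively. For capturing it from a script while the problem is occurring, a rough sketch that reads `/proc` directly might look like the following. It is Linux-only, the helper names are hypothetical, and it is not an OvenMediaEngine tool.

```python
import os
import time


def parse_thread_stat(text):
    """Return (thread name, utime + stime in clock ticks) from one /proc stat line."""
    # The name sits in parentheses and may itself contain spaces or ')'.
    name = text[text.index("(") + 1 : text.rindex(")")]
    rest = text[text.rindex(")") + 2 :].split()
    # rest[0] is field 3 (state), so utime (field 14) is rest[11], stime is rest[12].
    return name, int(rest[11]) + int(rest[12])


def sample(pid):
    """Map thread id -> (name, cumulative CPU ticks) for every thread of pid."""
    out = {}
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        try:
            with open(f"{task_dir}/{tid}/stat") as f:
                out[tid] = parse_thread_stat(f.read())
        except FileNotFoundError:
            pass  # thread exited between listdir and open
    return out


def busiest_threads(pid, interval=5.0, top=10):
    """Return [(name, tid, percent of one core)] for the busiest threads over the interval."""
    ticks_per_sec = os.sysconf("SC_CLK_TCK")
    before = sample(pid)
    time.sleep(interval)
    after = sample(pid)
    usage = []
    for tid, (name, ticks) in after.items():
        if tid in before:
            pct = 100.0 * (ticks - before[tid][1]) / ticks_per_sec / interval
            usage.append((name, tid, pct))
    return sorted(usage, key=lambda u: u[2], reverse=True)[:top]
```

Calling `busiest_threads(pid)` with the process ID of the OME process returns the threads sorted by CPU usage over the sampling interval, which makes it easy to log the result at the moment the queue warnings appear.
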

dva-re commented 2 years ago

[image: screenshot of per-thread CPU usage]

one stream, few viewers

getroot commented 2 years ago

Dech264 is an H.264 decoder. The decoder seems to be using more CPU than normal. Is the server receiving a very high bitrate stream?

And what version of OME are you using? The performance of MediaRouter has been improved in the latest version.

If you don't have many viewers, setting AppWorkerCount to around 2-4 and lowering StreamWorkerCount to around 8 might help.

dva-re commented 2 years ago

image: airensoft/ovenmediaengine:0.14.3

and I didn't observe this issue on 0.14.2 before

I used to create a separate OutputProfile for each quality, but in the new version I switched to ABR. That is the only change on my side.

The current config was:

                    <AppWorkerCount>8</AppWorkerCount>
                    <StreamWorkerCount>8</StreamWorkerCount>

Now I will try:

                    <AppWorkerCount>4</AppWorkerCount>
                    <StreamWorkerCount>8</StreamWorkerCount>

getroot commented 2 years ago

ABR does not affect encoding performance; it only adds the Playlist. Please upload your entire Server.xml. Is it the same as before, apart from the Playlist?

getroot commented 2 years ago

The log in your first post also shows that your server has multiple streams. Could performance be insufficient when there are multiple stream inputs? The per-thread CPU usage you captured is from when a single stream input was active. Does the same problem occur with only one stream input? The captured CPU usage doesn't look like a problem. It would help to capture per-thread CPU usage while the problem is occurring.

dva-re commented 2 years ago

<?xml version="1.0" encoding="UTF-8"?>

<Server version="8">
    <Name>OvenMediaEngine</Name>
    <Type>origin</Type>
    <IP>*</IP>

    <StunServer>stun.l.google.com:19302</StunServer>

    <Bind>
        <Managers>
            <API>
                <TLSPort>8081</TLSPort>
                <WorkerCount>2</WorkerCount>
            </API>
        </Managers>

        <Providers>
            <RTMP>
                <Port>1935</Port>
                <WorkerCount>5</WorkerCount>
            </RTMP>
            <SRT>
                <Port>9999</Port>
                <WorkerCount>5</WorkerCount>
            </SRT>

            <WebRTC>
                <Signalling>
                    <TLSPort>3333</TLSPort>
                    <WorkerCount>2</WorkerCount>
                </Signalling>

                <IceCandidates>
                    <TcpRelay>*:3480</TcpRelay>
                    <TcpForce>false</TcpForce>
                    <IceCandidate>*:10006-10010/udp</IceCandidate>
                    <TcpRelayWorkerCount>1</TcpRelayWorkerCount>
                </IceCandidates>
            </WebRTC>
        </Providers>

        <Publishers>
            <OVT>
                <Port>9000</Port>
                <WorkerCount>2</WorkerCount>
            </OVT>
            <WebRTC>
                <Signalling>
                    <TLSPort>3333</TLSPort>
                    <WorkerCount>1</WorkerCount>
                </Signalling>
                <IceCandidates>
                    <TcpRelay>*:3480</TcpRelay>
                    <TcpForce>false</TcpForce>
                    <!-- in production in this place real IP -->
                    <IceCandidate>1.2.3.4:10010/udp</IceCandidate>
                    <TcpRelayWorkerCount>1</TcpRelayWorkerCount>
                </IceCandidates>
            </WebRTC>
        </Publishers>
    </Bind>

    <Managers>
        <Host>

            <Names>
                <Name>origin-server.domain.name</Name>
            </Names>
            <TLS>
                <CertPath>/var/certs/domain.name/chain.pem</CertPath>
                <KeyPath>/var/certs/domain.name/privkey.pem</KeyPath>
                <ChainCertPath>/var/certs/domain.name/fullchain.pem</ChainCertPath>
            </TLS>

        </Host>
        <API>
            <AccessToken>abcd-real-token-instead</AccessToken>
        </API>
    </Managers>

    <VirtualHosts>
        <VirtualHost include="VHost*.xml"/>
        <VirtualHost>
            <Name>default</Name>
            <Distribution>domain.name</Distribution>

            <!-- Settings for multi ip/domain and TLS -->
            <Host>
                <Names>
                    <Name>origin-server.domain.name</Name>
                    <Name>192.168.10.2</Name>
                </Names>

                <TLS>
                    <CertPath>/var/certs/domain.name/chain.pem</CertPath>
                    <KeyPath>/var/certs/domain.name/privkey.pem</KeyPath>
                    <ChainCertPath>/var/certs/domain.name/fullchain.pem</ChainCertPath>
                </TLS>
            </Host>

            <!-- Refer https://airensoft.gitbook.io/ovenmediaengine/signedpolicy -->
            <SignedPolicy>
                <PolicyQueryKeyName>pol</PolicyQueryKeyName>
                <SignatureQueryKeyName>sig</SignatureQueryKeyName>
                <SecretKey>SignRealSecretInstead</SecretKey>

                <Enables>
                    <Providers>rtmp,webrtc,srt</Providers>
                    <Publishers>webrtc</Publishers>
                </Enables>
            </SignedPolicy>

            <!-- Settings for applications -->
            <Applications>
                <Application>
                    <Name>app</Name>
                    <!-- Application type (live/vod) -->
                    <Type>live</Type>
                    <OutputProfiles>

                        <OutputProfile>
                            <Name>abr</Name>
                            <OutputStreamName>${OriginStreamName}</OutputStreamName>
                            <Playlist>
                                <Name>for Webrtc</Name>
                                <FileName>abr</FileName>
                                <Options>
                                    <WebRtcAutoAbr>true</WebRtcAutoAbr>
                                </Options>
                                <Rendition>
                                    <Name>SD</Name>
                                    <Video>480p</Video>
                                    <Audio>opus</Audio>
                                </Rendition>
                                <Rendition>
                                    <Name>HD</Name>
                                    <Video>720p</Video>
                                    <Audio>opus</Audio>
                                </Rendition>
                                <Rendition>
                                    <Name>FHD</Name>
                                    <Video>1080p</Video>
                                    <Audio>opus</Audio>
                                </Rendition>
                            </Playlist>
                            <Playlist>
                                <Name>For bypass webrtc</Name>
                                <FileName>bp</FileName>
                                <Options>
                                    <WebRtcAutoAbr>true</WebRtcAutoAbr>
                                </Options>
                                <Rendition>
                                    <Name>FHD</Name>
                                    <Video>original</Video>
                                    <Audio>opus</Audio>
                                </Rendition>
                            </Playlist>

                            <Encodes>
                                <Video>
                                    <Name>480p</Name>
                                    <Codec>h264</Codec>
                                    <Width>854</Width>
                                    <Height>480</Height>
                                    <Bitrate>1200000</Bitrate>
                                    <Framerate>60</Framerate>
                                    <Preset>medium</Preset>
                                </Video>

                                <Video>
                                    <Name>720p</Name>
                                    <Codec>h264</Codec>
                                    <Width>1280</Width>
                                    <Height>720</Height>
                                    <Bitrate>2400000</Bitrate>
                                    <Framerate>60</Framerate>
                                    <Preset>medium</Preset>
                                </Video>

                                <Video>
                                    <Name>1080p</Name>
                                    <Codec>h264</Codec>
                                    <Width>1920</Width>
                                    <Height>1080</Height>
                                    <Bitrate>3000000</Bitrate>
                                    <Framerate>60</Framerate>
                                    <Preset>medium</Preset>
                                </Video>

                                <Video>
                                    <Name>original</Name>
                                    <Bypass>true</Bypass>
                                </Video>

                                <Audio>
                                    <Name>opus</Name>
                                    <Codec>opus</Codec>
                                    <Bitrate>128000</Bitrate>
                                    <Samplerate>48000</Samplerate>
                                    <Channel>2</Channel>
                                </Audio>

                            </Encodes>

                        </OutputProfile>

                        <!--                        <OutputProfile>
                                                    <Name>bypass_stream</Name>
                                                    <OutputStreamName>${OriginStreamName}</OutputStreamName>
                                                    <Encodes>
                                                        <Audio>
                                                            <Bypass>true</Bypass>
                                                        </Audio>
                                                        <Video>
                                                            <Bypass>true</Bypass>
                                                        </Video>
                                                        <Audio>
                                                            <Codec>opus</Codec>
                                                            <Bitrate>128000</Bitrate>
                                                            <Samplerate>48000</Samplerate>
                                                            <Channel>2</Channel>
                                                        </Audio>
                                                    </Encodes>
                                                </OutputProfile>

                                                <OutputProfile>
                                                    <Name>720p</Name>
                                                    <OutputStreamName>${OriginStreamName}_720p</OutputStreamName>
                                                    <Encodes>
                                                        <Audio>
                                                            <Bypass>true</Bypass>
                                                        </Audio>
                                                        <Video>
                                                            <Codec>h264</Codec>
                                                            <Width>1280</Width>
                                                            <Height>720</Height>
                                                            <Bitrate>1800000</Bitrate>
                                                            <Framerate>30.0</Framerate>
                                                        </Video>
                                                        <Audio>
                                                            <Codec>opus</Codec>
                                                            <Bitrate>128000</Bitrate>
                                                            <Samplerate>48000</Samplerate>
                                                            <Channel>2</Channel>
                                                        </Audio>
                                                    </Encodes>
                                                </OutputProfile>

                                                <OutputProfile>
                                                    <Name>480p</Name>
                                                    <OutputStreamName>${OriginStreamName}_480p</OutputStreamName>
                                                    <Encodes>
                                                        <Audio>
                                                            <Bypass>true</Bypass>
                                                        </Audio>
                                                        <Video>
                                                            <Codec>h264</Codec>
                                                            <Width>854</Width>
                                                            <Height>480</Height>
                                                            <Bitrate>1000000</Bitrate>
                                                            <Framerate>30.0</Framerate>
                                                        </Video>
                                                        <Audio>
                                                            <Codec>opus</Codec>
                                                            <Bitrate>128000</Bitrate>
                                                            <Samplerate>48000</Samplerate>
                                                            <Channel>2</Channel>
                                                        </Audio>
                                                    </Encodes>
                                                </OutputProfile>-->

                        <OutputProfile>
                            <Name>record</Name>
                            <OutputStreamName>${OriginStreamName}_record</OutputStreamName>
                            <Encodes>
                                <!--<Audio>
                                    <Bypass>true</Bypass>
                                </Audio>-->
                                <Video>
                                    <Codec>h264</Codec>
                                    <Width>1280</Width>
                                    <Height>720</Height>
                                    <Bitrate>1200000</Bitrate>
                                    <Framerate>14.0</Framerate>
                                </Video>
                            </Encodes>
                        </OutputProfile>

                    </OutputProfiles>

                    <Providers>
                        <OVT/>
                        <WebRTC/>
                        <RTMP/>
                        <SRT/>
                        <WebRTC>
                            <Timeout>30000</Timeout>
                        </WebRTC>
                    </Providers>
                    <Publishers>
                        <AppWorkerCount>8</AppWorkerCount>
                        <StreamWorkerCount>8</StreamWorkerCount>
                        <OVT/>
                        <WebRTC>
                            <Timeout>30000</Timeout>
                            <Rtx>false</Rtx>
                            <Ulpfec>false</Ulpfec>
                            <JitterBuffer>false</JitterBuffer>
                        </WebRTC>
                        <FILE>
                            <RootPath>/records</RootPath>
                            <FilePath>
                                /${VirtualHost}/${Application}/${Stream}/${StartTime:YYYYMMDDhhmmss}_${EndTime:YYYYMMDDhhmmss}.ts
                            </FilePath>
                            <InfoPath>/record.log.xml</InfoPath>
                        </FILE>
                    </Publishers>
                </Application>
            </Applications>
        </VirtualHost>
    </VirtualHosts>
</Server>

dva-re commented 2 years ago

The problem came back after 4 days without it. AppWorkerCount is 4, StreamWorkerCount is 8.

ome_1  | [2022-07-18 11:51:16.214] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x5579be845890] #default#app - Mediarouter inbound indicator (1/4) size has exceeded the threshold: queue: 78170, threshold: 500, peak: 78170
ome_1  | [2022-07-18 11:51:17.314] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x7fd190015f98] #default#app/main1_stream-MR-Inbound size has exceeded the threshold: queue: 78275, threshold: 100, peak: 78275
ome_1  | [2022-07-18 11:51:21.234] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x5579be845890] #default#app - Mediarouter inbound indicator (1/4) size has exceeded the threshold: queue: 78656, threshold: 500, peak: 78656
ome_1  | [2022-07-18 11:51:22.314] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x7fd190015f98] #default#app/main1_stream-MR-Inbound size has exceeded the threshold: queue: 78761, threshold: 100, peak: 78761
ome_1  | [2022-07-18 11:51:26.254] W [SPRTMP-T1935:30] ov.Queue | queue.h:268  | [0x5579be845890] #default#app - Mediarouter inbound indicator (1/4) size has exceeded the threshold: queue: 79141, threshold: 500, peak: 79141

image

image

dva-re commented 2 years ago

Now I cannot start streams at all. Before the failure there were only 2 streams (both over the local network): one from an ATEM Mini Pro via RTMP and a second from a webcam (OvenLiveKit) via ws/WebRTC.

There were very few viewers, all connected to one of 4 edge servers.

@getroot please help

getroot commented 2 years ago

Hmm... this feels like it's holding a thread somewhere. There are still not enough clues to analyze the exact cause.

First of all, what does it mean that the stream cannot be started? Does stream creation fail for all three of RTMP, WebRTC, and SRT? Does that mean OME has stopped completely? Please test this and let me know the results.

And please upload the log files of the last 4 days.

dva-re commented 2 years ago

Over ws, there was no response from the server when trying to start broadcasting (it aborted on timeout). I did not have time to try the other ingest options; I had to fix the situation and restart the server. So far, we work around this by running docker compose down and then up again.

I saved the log. To avoid disclosing sensitive data, can I send you the log files privately (Matrix, email, etc.)?

Thank you.

getroot commented 2 years ago

Please send the log files to support@airensoft.com. Thank you!

dva-re commented 2 years ago

send log files to support@airensoft.com.

Sent. Thank you very much in advance.

getroot commented 2 years ago

I have received them. We will comment on the results after analysis. Thank you.

dva-re commented 2 years ago

@getroot Also, I found a saved log from July 14, when the failures started. It seems to show why it is not possible to create a new stream (Reject stream creation), or something else that may help find the answer. I sent it in a separate email.

getroot commented 2 years ago

There is one more piece of information I need. Is your Edge running the same version as Origin?

dva-re commented 2 years ago

Is your Edge running the same version as Origin?

Yes, they are all on version 0.14.3.

getroot commented 2 years ago

OK, thank you. We are now working out the cause of the problem.

Keukhan commented 2 years ago

@dva-re

Let me ask you a few more questions.

1) What CPU are you using (clock speed, number of cores)?

2) What is the average CPU usage when there are no failures?

It might help me to understand the log.

Thanks

dva-re commented 2 years ago

@Keukhan hi

lscpu

Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   46 bits physical, 48 bits virtual
CPU(s):                          16
On-line CPU(s) list:             0-15
Thread(s) per core:              2
Core(s) per socket:              8
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           85
Model name:                      Intel(R) Xeon(R) Silver 4215 CPU @ 2.50GHz
Stepping:                        7
CPU MHz:                         1900.005
CPU max MHz:                     3500.0000
CPU min MHz:                     1000.0000
BogoMIPS:                        5000.00
Virtualization:                  VT-x
L1d cache:                       256 KiB
L1i cache:                       256 KiB
L2 cache:                        8 MiB
L3 cache:                        11 MiB
NUMA node0 CPU(s):               0-15
Vulnerability Itlb multihit:     KVM: Mitigation: VMX disabled
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Enhanced IBRS, IBPB conditional, RSB filling
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Mitigation; TSX disabled
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single intel_ppin ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts pku ospke avx512_vnni md_clear flush_l1d arch_capabilities

current load (without problems)

image

dimiden commented 2 years ago

@dva-re I just found a way to reproduce the hang problem. Thank you very much for your help, and I will tell you again when the bug is fixed!

dva-re commented 2 years ago

Thank you!! I'll be waiting eagerly.

Until it is fixed, is there anything I can do to minimize the chance of it happening?

getroot commented 2 years ago

This reproduces when the network between Origin and Edge is not fast enough. In other words, it does not reproduce as long as the bandwidth between Origin and Edge exceeds the combined bitrate of all tracks in the stream that Edge pulls from Origin. When the problem does occur, the socket thread blocks and is no longer available.
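As a rough illustration of that condition, here is a minimal Python sketch that compares the Origin-Edge link speed with the combined bitrate of the tracks being pulled. It is not part of OME: the function name, the `headroom` parameter, and the bitrate figures are hypothetical and chosen only for the example.

```python
def link_is_fast_enough(track_bitrates_kbps, link_kbps, headroom=1.0):
    """Return True if the link can carry every track of the stream at once.

    track_bitrates_kbps: bitrate of each track the Edge pulls (hypothetical values).
    link_kbps: measured Origin-to-Edge throughput.
    headroom: optional safety factor (assumption, 1.0 means no margin).
    """
    required_kbps = sum(track_bitrates_kbps) * headroom
    return link_kbps > required_kbps


# Hypothetical stream: two video renditions plus one audio track.
tracks = [5000, 2500, 128]
print(link_is_fast_enough(tracks, link_kbps=10000))  # link faster than the stream
print(link_is_fast_enough(tracks, link_kbps=7000))   # link slower than the stream
```

If the second case describes your Origin-Edge link, reducing the number or bitrate of the tracks the Edge pulls would be one way to stay under the limit until the fix is released.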

This will probably be fixed and committed today or tomorrow, and we will release 0.14.4 soon after.

dva-re commented 2 years ago

Guys, thank you very VERY much. Can you please let me know when this commit happens? I will switch to the dev branch until 0.14.4 is released.

dimiden commented 2 years ago

@dva-re I fixed this problem and will release a new version when the stress test is completed. https://github.com/AirenSoft/OvenMediaEngine/commit/fd74fec49b2e5dd16bb868a131800714b72c9f40

dva-re commented 2 years ago

Thank you!

dva-re commented 2 years ago

Hello.

@getroot Maybe you can help me with this too?

As soon as I started a third broadcast, the warnings below appeared and the streams began to slow down.

ome_1  | [2022-08-04 15:07:15.124] W [Rescaler:882] ov.Queue | queue.h:268  | [0x7f5e4389ce98] Input queue of Encoder. codec(h264/27) size has exceeded the threshold: queue: 1424, threshold: 120, peak: 1425
ome_1  | [2022-08-04 15:07:16.310] W [Rescaler:192] ov.Queue | queue.h:268  | [0x7f5c995d9c38] Input queue of Encoder. codec(h264/27) size has exceeded the threshold: queue: 1945, threshold: 120, peak: 1945
ome_1  | [2022-08-04 15:07:19.503] W [Rescaler:786] ov.Queue | queue.h:268  | [0x7f5d6b6b8598] Input queue of Encoder. codec(h264/27) size has exceeded the threshold: queue: 1780, threshold: 120, peak: 1780
ome_1  | [2022-08-04 15:07:20.154] W [Rescaler:882] ov.Queue | queue.h:268  | [0x7f5e4389ce98] Input queue of Encoder. codec(h264/27) size has exceeded the threshold: queue: 1464, threshold: 120, peak: 1465
ome_1  | [2022-08-04 15:07:21.310] W [Rescaler:192] ov.Queue | queue.h:268  | [0x7f5c995d9c38] Input queue of Encoder. codec(h264/27) size has exceeded the threshold: queue: 1987, threshold: 120, peak: 1987
ome_1  | [2022-08-04 15:07:24.517] W [Rescaler:786] ov.Queue | queue.h:268  | [0x7f5d6b6b8598] Input queue of Encoder. codec(h264/27) size has exceeded the threshold: queue: 1824, threshold: 120, peak: 1824
ome_1  | [2022-08-04 15:07:25.154] W [Rescaler:882] ov.Queue | queue.h:268  | [0x7f5e4389ce98] Input queue of Encoder. codec(h264/27) size has exceeded the threshold: queue: 1505, threshold: 120, peak: 1505
ome_1  | [2022-08-04 15:07:26.329] W [Rescaler:192] ov.Queue | queue.h:268  | [0x7f5c995d9c38] Input queue of Encoder. codec(h264/27) size has exceeded the threshold: queue: 2031, threshold: 120, peak: 2031
ome_1  | [2022-08-04 15:07:29.533] W [Rescaler:786] ov.Queue | queue.h:268  | [0x7f5d6b6b8598] Input queue of Encoder. codec(h264/27) size has exceeded the threshold: queue: 1866, threshold: 120, peak: 1866
ome_1  | [2022-08-04 15:07:30.184] W [Rescaler:882] ov.Queue | queue.h:268  | [0x7f5e4389ce98] Input queue of Encoder. codec(h264/27) size has exceeded the threshold: queue: 1549, threshold: 120, peak: 1550
ome_1  | [2022-08-04 15:07:31.334] W [Rescaler:192] ov.Queue | queue.h:268  | [0x7f5c995d9c38] Input queue of Encoder. codec(h264/27) size has exceeded the threshold: queue: 2077, threshold: 120, peak: 2077
ome_1  | [2022-08-04 15:07:34.540] W [Rescaler:786] ov.Queue | queue.h:268  | [0x7f5d6b6b8598] Input queue of Encoder. codec(h264/27) size has exceeded the threshold: queue: 1920, threshold: 120, peak: 1920
ome_1  | [2022-08-04 15:07:35.227] W [Rescaler:882] ov.Queue | queue.h:268  | [0x7f5e4389ce98] Input queue of Encoder. codec(h264/27) size has exceeded the threshold: queue: 1602, threshold: 120, peak: 1603
ome_1  | [2022-08-04 15:07:36.349] W [Rescaler:192] ov.Queue | queue.h:268  | [0x7f5c995d9c38] Input queue of Encoder. codec(h264/27) size has exceeded the threshold: queue: 2130, threshold: 120, peak: 2130
ome_1  | [2022-08-04 15:07:39.554] W [Rescaler:786] ov.Queue | queue.h:268  | [0x7f5d6b6b8598] Input queue of Encoder. codec(h264/27) size has exceeded the threshold: queue: 1975, threshold: 120, peak: 1975
ome_1  | [2022-08-04 15:07:40.249] W [Rescaler:882] ov.Queue | queue.h:268  | [0x7f5e4389ce98] Input queue of Encoder. codec(h264/27) size has exceeded the threshold: queue: 1653, threshold: 120, peak: 1656
ome_1  | [2022-08-04 15:07:41.373] W [Rescaler:192] ov.Queue | queue.h:268  | [0x7f5c995d9c38] Input queue of Encoder. codec(h264/27) size has exceeded the threshold: queue: 2184, threshold: 120, peak: 2185
ome_1  | [2022-08-04 15:07:44.601] W [Rescaler:786] ov.Queue | queue.h:268  | [0x7f5d6b6b8598] Input queue of Encoder. codec(h264/27) size has exceeded the threshold: queue: 2027, threshold: 120, peak: 2029
ome_1  | [2022-08-04 15:07:45.289] W [Rescaler:882] ov.Queue | queue.h:268  | [0x7f5e4389ce98] Input queue of Encoder. codec(h264/27) size has exceeded the threshold: queue: 1712, threshold: 120, peak: 1714
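To judge how far the encoders are falling behind, the growth rate of each queue can be estimated from warnings like the ones above. The following is a minimal Python sketch that assumes the log format shown here; the regular expression and the function name are my own and are not part of OME.

```python
import re
from datetime import datetime

# Captures the timestamp, the queue address and the current queue size
# from an ov.Queue warning line (format assumed from the log above).
LINE_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\].*?"
    r"\[(?P<addr>0x[0-9a-f]+)\].*?queue: (?P<size>\d+)"
)


def queue_growth(lines):
    """Return {queue address: items per second} between first and last sample."""
    samples = {}
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # not a queue warning
        ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S.%f")
        samples.setdefault(m["addr"], []).append((ts, int(m["size"])))

    rates = {}
    for addr, points in samples.items():
        (t0, q0), (t1, q1) = points[0], points[-1]
        seconds = (t1 - t0).total_seconds()
        if seconds > 0:  # need at least two samples at different times
            rates[addr] = (q1 - q0) / seconds
    return rates
```

For queue `0x7f5e4389ce98` above, the size goes from 1424 at 15:07:15.124 to 1712 at 15:07:45.289, which is 288 items in 30.165 seconds, or roughly 9.5 items per second. A rate that stays positive means the encoder is consuming input more slowly than it arrives.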

The system load is not at its limit.

Would increasing some parameter in the config help, or is adding a GPU the only option?

image

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.