AirenSoft / OvenMediaEngine

OvenMediaEngine (OME) is a Sub-Second Latency Live Streaming Server with Large-Scale and High-Definition. #WebRTC #LLHLS
https://airensoft.com/ome.html
GNU Affero General Public License v3.0

Buffer stalls with SRT + LLHLS #1563

Closed: SceneCityDev closed this issue 7 months ago

SceneCityDev commented 7 months ago

Hi getroot,

I think I need a helping hand here.

I have an HEVC stream coming into the origin via SRT, which then gets published as LLHLS on an Edge. My Chrome browser supports HEVC playback. In theory it all works fine.

But no matter what I try, the LLHLS stream gets buffer stalls all the time. It's not a matter of insufficient bandwidth or CPU on the Source, Origin, Edge or Client PC - it happens on every computer I have tried, even on a 1 GBit/s connection with a 2 MBit/s stream. I tried OBS as a source, and as an alternative had ffmpeg deliver a TS. And even when the client still has a full minute (!) of buffer ahead of it, there is a buffer stall every couple of seconds. I also looked at it in detail in hls.js's debugger. I don't get it.

I tried your recommended settings for Chunk Duration, Segment Duration, Segment Count and the mysterious Part Hold Back, and hundreds of other combinations. But even with a Chunk Duration of 10.00, a Segment Duration of 30 and a Segment Count of 50: buffer stalls.

If I play the stream in VLC, it works perfectly - but of course VLC does not do LLHLS, so I know the HLS fallback works. Telling OvenPlayer or the server to treat this as HLS rather than LLHLS doesn't help, however.

I get the feeling that OME is somehow delivering segments to the client that are there, but in some way not usable. Maybe chunks are marked independent while they aren't, or they are re-ordered.

There is a warning in the log popping up every couple of seconds, and I find it plausible that it might be related:

[2024-03-23 10:22:10.413] W [AW-LLHLS0:1144] FMP4 Packager | fmp4_storage.cpp:341 | LLHLS stream (#default#origin2/hevctest) / track (1) - a longer-than-expected (8383.3 | expected : 6000) segment has created. Long or irregular intervals between keyframes might be the cause.

Of course I do have 1-second keyframes.

So I tried increasing the segment duration. When I increased it to 10000, OME decided to create 12000+ ms segments instead. I went all the way up to...

[2024-03-23 10:13:40.425] W [AW-LLHLS0:225167] FMP4 Packager | fmp4_storage.cpp:341 | LLHLS stream (#default#origin2/hevctest) / track (4) - a longer-than-expected (39808.0 | expected : 30000) segment has created. Long or irregular intervals between keyframes might be the cause.

...when it became very clear that no matter what duration I choose, the generated segments will always be longer.
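For what it's worth, the keyframe spacing that actually arrives at the origin (as opposed to what the encoder is configured for) can be checked with ffprobe - just a sketch, with the input path as a placeholder; it prints one line per video packet and flags keyframes with a K:

# list keyframe timestamps in the source TS; the gaps between the pts_time values are the real keyframe interval
ffprobe -loglevel error -select_streams v:0 -show_entries packet=pts_time,flags -of csv=print_section=0 input.ts | grep K

If those gaps are not a steady 1 second, the packager warning above would be expected.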

The only thing that made things somewhat (but not perfectly) stable was setting ChunkDuration = SegmentDuration, which basically kills any option for the LLHLS client to use range requests.

I have reset the config to your recommended values:

<PartHoldBack>1.5</PartHoldBack>
<ChunkDuration>0.5</ChunkDuration>
<SegmentDuration>6</SegmentDuration>
<SegmentCount>10</SegmentCount>
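Those values live in the LLHLS publisher block of the application in Server.xml - a minimal sketch of the surrounding structure, assuming the standard layout from the docs, with everything else omitted:

<Application>
    <Name>origin2</Name>
    ...
    <Publishers>
        <LLHLS>
            <ChunkDuration>0.5</ChunkDuration>
            <PartHoldBack>1.5</PartHoldBack>
            <SegmentDuration>6</SegmentDuration>
            <SegmentCount>10</SegmentCount>
        </LLHLS>
    </Publishers>
</Application>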

A test stream showing the problem is available here:

Origin:
HEVC + AAC Passthrough: https://origin2.scenecity.net/origin2/hevctest/hevc-bypass_llhls.m3u8
HEVC transcoded (QSV) + AAC Passthrough: https://origin2.scenecity.net/origin2/hevctest/hevc-hq_llhls.m3u8
HEVC rescaled, transcoded (QSV) + AAC Re-encoded: https://origin2.scenecity.net/origin2/hevctest/hevc-lq_llhls.m3u8

And the same on the Edge:
https://fsn1-edge3.scenecity.net/origin2/hevctest/hevc-bypass_llhls.m3u8
https://fsn1-edge3.scenecity.net/origin2/hevctest/hevc-hq_llhls.m3u8
https://fsn1-edge3.scenecity.net/origin2/hevctest/hevc-lq_llhls.m3u8

Again: all of these work fine when played in VLC, but none of them works without those buffer stalls in OvenPlayer or the hls.js debugger.

Any help would be highly appreciated. It would of course also be no problem to send the SRT test stream to an origin of yours.

It's weird that you prefer LLHLS over HLS. Reading through the protocol last night made my brain bleed. What a complex monster of a stream transport protocol...

Thank you!

br, scamp

bchah commented 7 months ago

+1, I've seen this behaviour too. It helped a bit to add the 'secret' <PartHoldBack> tag to the LLHLS config, which tells the player how far behind the live edge to start playback - I set it to 2.0.

bchah commented 7 months ago

A couple of other ideas - make sure hls.js is up to date, and check that your keyframe interval divides evenly into your segment duration. I usually use a 2 s keyframe interval.
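With an ffmpeg source, pinning the keyframe interval would look roughly like this - only a sketch, assuming 30 fps, libx265, an ffmpeg build with SRT support, and placeholder URLs; scene-cut detection is disabled so the encoder can't insert extra keyframes and break the spacing:

# 2 s keyframes at 30 fps (GOP of 60), no scene-cut keyframes, TS over SRT to the origin
ffmpeg -re -i input.mp4 \
  -c:v libx265 -g 60 -x265-params "keyint=60:min-keyint=60:scenecut=0" \
  -c:a aac -b:a 128k \
  -f mpegts "srt://origin.example.com:9999?streamid=..."

The streamid is left as a placeholder - it has to match whatever the OME SRT provider expects for the target application/stream.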