video-dev / hls.js

HLS.js is a JavaScript library that plays HLS in browsers with support for MSE.
https://hlsjs.video-dev.org/demo

Freezing on seeking and black screen on stream Load. #6576

Open brett-bryant opened 2 months ago

brett-bryant commented 2 months ago

What do you want to do with Hls.js?

Hi Everyone, I’ve had an issue I’ve been trying to track down for a few weeks now with not a lot of success.

Background: we have an RTMP stream from an AWS Elemental pushed to AWS MediaLive and repackaged to HLS, then pushed to MediaPackage, where it is consumed by a vanilla HTML player in a .NET Core app running in a Docker container on AWS ECS.

Once the stream has been live for roughly 2 hrs, triggering a seek event will sometimes (roughly a 50/50 chance) freeze the player in a buffering state, and it never recovers unless you trigger another seek event.

After roughly 5 hrs of the stream running, if you enter the page the player loads with a black screen (the player's loadeddata event never fires) until you trigger some other event (play/seek) to give it a nudge.

We are running with a 1 sec fragment time because some analytics run alongside the stream and need to be in sync with it as much as possible. I have removed these analytics while testing, so they have no effect on the issue at all. I mention this because I have noticed through testing that if I change the segment size to 2 seconds or 3 seconds, it multiplies the time before the issue is triggered by the amount added, i.e.:

These numbers are not exact, testing is very time consuming so pinpointing the exact minute the issue appears is difficult.

Through debugging the hls.js library we discovered a hotfix that we suspect is not a good long-term solution, but I mention it here as it may be related and someone might be able to shed some light on it. Basically, in the _trySkipBufferHole method within the hls.js gap controller, the hole is not skipped correctly once the issue has been triggered: the gap is between 0.5 and 1.1 while maxBufferHole is set to a default of 0.1 (it seems the default is 0.1, contrary to the hls.js documentation saying it’s 0.5).
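To check the gap size outside the debugger, the distance from the playhead to the next buffered range can be read from the media element itself. This is a minimal sketch using only the standard TimeRanges interface (`length`, `start(i)`, `end(i)`); the helper name `gapToNextBufferedRange` is hypothetical and not part of hls.js.

```javascript
// Returns the distance in seconds from `currentTime` to the start of the next
// buffered range, 0 if the playhead is already inside a buffered range, or
// null if nothing is buffered ahead of the playhead.
// `buffered` is a TimeRanges-like object, e.g. video.buffered.
function gapToNextBufferedRange(buffered, currentTime) {
  // TimeRanges are ordered by start time, so the first range that starts
  // after the playhead is the next one.
  for (let i = 0; i < buffered.length; i++) {
    const start = buffered.start(i);
    const end = buffered.end(i);
    if (currentTime >= start && currentTime < end) return 0;
    if (start > currentTime) return start - currentTime;
  }
  return null;
}

// Browser usage (assumes `video` is the attached media element):
// console.log(gapToNextBufferedRange(video.buffered, video.currentTime));
```

Logging this value on the `seeking` and `waiting` events would show whether the observed gap really stays above the configured maxBufferHole.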

image

The above affects a loop further down the line:

image

When the “const provisioned = fragmentTracker.getPartialFragment” row is executed, the provisioned variable is null, so the code thinks it has more data to load and enters “if (moreToLoad) { return 0; }”. This is from debugging while the seeking buffer issue has been triggered.
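The behaviour described here can be illustrated with a simplified, self-contained model. This is not the hls.js source: `canSkipHole`, `findPartialAt` and the `partialFragments` list are hypothetical stand-ins for the gap controller and fragment tracker, and the real method checks further conditions that are left out.

```javascript
// Simplified model of the hole-skip decision (NOT the hls.js implementation).
// `partialFragments` is a hypothetical list of { start, duration } entries
// standing in for what the fragment tracker knows about partial fragments.
function findPartialAt(partialFragments, pos) {
  return (
    partialFragments.find((f) => pos >= f.start && pos < f.start + f.duration) || null
  );
}

// Returns true when this model would allow a jump from currentTime to nextStart.
function canSkipHole({ currentTime, nextStart, maxBufferHole, lastAppendedEnd, partialFragments }) {
  const gapLength = nextStart - currentTime;
  if (gapLength <= 0) return false; // no hole ahead of the playhead
  if (gapLength <= maxBufferHole) return true; // small hole: skip immediately

  // Larger hole: walk from the end of the last appended fragment up to the
  // next buffered start. Each step must land on a known partial fragment.
  let pos = lastAppendedEnd;
  while (pos < nextStart) {
    const provisioned = findPartialAt(partialFragments, pos);
    // No partial fragment here: treated as "more to load", so do not skip.
    if (!provisioned || provisioned.duration <= 0) return false;
    pos += provisioned.duration;
  }
  return true;
}
```

In this model a 0.8 s gap with no tracked partial fragments is never skipped at a maxBufferHole of 0.1, but is skipped at once when maxBufferHole is 2, which matches the effect of the hotfix.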

When the video is more than 5 hrs long and the black screen issue has been triggered, this is again caused by the same area of code but the gapLength is even larger:

image

No matter how long the stream runs for, the gapLength never exceeds 2, so the hotfix was to change the value of maxBufferHole to 2.
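For reference, the hotfix amounts to a single config change. A minimal sketch, assuming the same xhrSetup as in the player config; `buildHotfixConfig` is a hypothetical helper and `apiKey` stands in for the redacted header value.

```javascript
// Builds the hls.js config used for the workaround described above.
// maxBufferHole is an hls.js config option; the value 2 comes from the
// observation that gapLength never exceeded 2 during testing.
function buildHotfixConfig(apiKey) {
  return {
    maxBufferHole: 2,
    xhrSetup: (xhr, url) => {
      xhr.open('GET', url, true);
      xhr.setRequestHeader('Stream-ApiKey', apiKey);
    },
  };
}

// Browser usage (assumes hls.js is loaded and `apiKey` is defined):
// video.hls = new Hls(buildHotfixConfig(apiKey));
```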

Unfortunately this is all run in a very secure environment so providing a test stream is out of the question which I understand makes it very difficult to debug, but I have attached some logs which I can’t see anything obvious in and I've included a playlist. If anybody has any insight as to why this could be happening or point me in a direction where I could look next it would be greatly appreciated.

Player Config:

video.hls = new Hls({
    xhrSetup: (xhr, url) => {
        xhr.open('GET', url, true);
        xhr.setRequestHeader("Stream-ApiKey", ******);
    },
});
video.hls.loadSource(url);
video.hls.attachMedia(video);
video.hls.on(Hls.Events.MEDIA_ATTACHED, function () {
    video.muted = true;
    video.play();
});

debug.log playlist.m3u8

What have you tried so far?

No response

robwalch commented 2 months ago

Hi @brett-bryant,

What version of hls.js are you running (in flowplayer)? Have you tried reproducing the issue in latest development (https://hlsjs-dev.video-dev.org/demo/) or hls.js stand-alone (https://hlsjs.video-dev.org/demo/)?

In the logs you provided, there are eleven seek events. Can you point out which ones were followed by stall? (I don't see a single instance of a stall being reported which leads me to believe these are incomplete or not associated with the issue you're describing.)
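One way to make stalls easy to spot in a log is to listen for hls.js error events and record the buffer-related ones alongside the playhead position. A minimal sketch, assuming the `Hls.Events.ERROR` event and the `bufferStalledError`, `bufferSeekOverHole` and `bufferNudgeOnStall` detail strings; `attachStallLogger` and the `log` callback are hypothetical names.

```javascript
// Error detail strings that hls.js reports for buffer stalls and hole handling.
const STALL_DETAILS = new Set([
  'bufferStalledError',
  'bufferSeekOverHole',
  'bufferNudgeOnStall',
]);

// `hls` is an Hls instance, `errorEvent` is Hls.Events.ERROR,
// `video` is the attached media element, `log` is any logging callback.
function attachStallLogger(hls, errorEvent, video, log) {
  hls.on(errorEvent, (event, data) => {
    if (!STALL_DETAILS.has(data.details)) return; // ignore unrelated errors
    log({
      details: data.details,
      fatal: data.fatal,
      currentTime: video.currentTime,
      seeking: video.seeking,
    });
  });
}

// Browser usage (assumes hls.js is loaded):
// attachStallLogger(video.hls, Hls.Events.ERROR, video, console.log);
```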

There are some hls.js configuration options that can help performance in long running live streams. Disabling the worker and setting back and front buffer length ejection thresholds may improve performance significantly.

If in this instance you are seeking back and finding that buffering starts later than the time at which you seeked to, and that interferes with playback, you should set maxFragLookupTolerance to 0. I didn't check all "seeking" and "buffered" log lines in the file provided, but for the ones I did, audio and video were buffered for the time seeked to.
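The options mentioned in the two paragraphs above could be combined into one config object. This is a sketch, not a recommendation: `enableWorker`, `backBufferLength`, `frontBufferFlushThreshold` and `maxFragLookupTolerance` are hls.js config options, but the 30 and 60 second values are assumptions picked for illustration, and `longRunningLiveConfig` is a hypothetical helper.

```javascript
// Returns an hls.js config tuned along the lines suggested above.
// The numeric thresholds are illustrative assumptions, not tested values.
function longRunningLiveConfig(overrides = {}) {
  return {
    enableWorker: false, // run transmuxing on the main thread
    backBufferLength: 30, // seconds of played media to keep before ejecting
    frontBufferFlushThreshold: 60, // seconds ahead of the playhead to keep
    maxFragLookupTolerance: 0, // pick the fragment that contains the seek target
    ...overrides, // caller-supplied options (e.g. xhrSetup) take precedence
  };
}

// Browser usage (assumes hls.js is loaded and `xhrSetup` is defined as in the player config):
// video.hls = new Hls(longRunningLiveConfig({ xhrSetup }));
```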

> while the maxBufferHole is set to a default of 0.1 (it seems the default is 0.1, contrary to the HLS documentation saying it’s 0.5)

Thanks for spotting that. The documentation should be updated to reflect the default of 0.1.

brett-bryant commented 2 months ago

Hi @robwalch thanks for responding so quickly.

Sorry, I initially wrote all this in the bug template, which failed to save due to the comment character limit, and I forgot to rewrite it when posting this issue.

It was initially discovered using flowplayer 3.10.1, then upgraded to 3.12.0, but I’ve ruled out flowplayer by replicating it in a plain html video element with hls.js 1.5.13 attached.

<video controls></video>
<script>
let video = document.querySelector('video');
video.hls = new Hls({
    xhrSetup: (xhr, url) => {
        xhr.open('GET', url, true);
        xhr.setRequestHeader("Stream-ApiKey", ******);
    },
});
video.hls.loadSource(url);
video.hls.attachMedia(video);
video.hls.on(Hls.Events.MEDIA_ATTACHED, function () {
    video.muted = true;
    video.play();
});
</script>

Unfortunately, due to the high-security access to the stream, testing it in the hlsjs demo site is impossible, but I think a plain html video element should cover this anyway.

I’ve attached brand new logs, 1 with the black screen issue on load and 1 with the seek issue. The seek issue logs contain 2 seek events triggered by manually clicking on the timeline; the second is the event that triggers the buffering freeze.

Thanks for your suggestions. I haven’t tested with lowLatencyMode enabled (I just discovered it on the hlsjs demo page); I will test that and your other suggestions as well and report back 😊.

buffer_on_seek.txt black_screen_on_load.txt

robwalch commented 2 months ago

May be related to #6571

brett-bryant commented 2 months ago

Thanks mate, I'll have a read.