androidx / media

Jetpack Media3 support libraries for media use cases, including ExoPlayer, an extensible media player for Android
https://developer.android.com/media/media3
Apache License 2.0

Frequent DashManifestStaleException errors on Live Dash manifests #1734

Open JonWatson opened 1 month ago

JonWatson commented 1 month ago

Version

Media3 1.4.1

More version details

This has been observed on Media3 1.2.1+ (may exist on previous versions as well)

Devices that reproduce the issue

Not device specific, occurs on a variety of devices and versions

Devices that do not reproduce the issue

None known

Reproducible in the demo app?

Not tested

Reproduction steps

Our professional sports and other live content is DRM-protected, so we cannot reproduce this in the demo app. We will send an email with a debuggable APK with EventLogger turned on and instructions on how to log in and find content.

We understand that our Live streams may have an issue we need to try to fix for Media3, but there are other players (such as Roku) which do not have this frequent problem playing the same Dash streams. We are hoping to identify the issue and get a Media3 fix or have our Video provider help us adjust the manifests/SSAI periods.

Note that I have also opened the following issue, which describes an easy-to-recreate "indefinite buffering" problem related to our SSAI ad-break periods, whereas this DashManifestStaleException error is periodic/random. I'm hoping the two are related and we can squash both at the same time, but that one may be worth looking at first.

Live DASH Manifest indefinite buffering occurs between SSAI ad period and content #1636

Expected result

No DashManifestStaleException

Actual result

Frequent DashManifestStaleException errors (maybe every 20-30 minutes)

Media

Sending an email with a debuggable APK, user credentials, recreation details.


JonWatson commented 1 month ago

APK and instructions have been sent. I'm available any time to help, thank you!

jobarros commented 1 month ago

I have the same issue when testing a manifest with key rotation. It happens every time the player needs to read a new period.

Media3 version 1.3.1 (I have also tried version 1.4.1)

The error is either DashManifestStaleException:

Caused by: androidx.media3.exoplayer.dash.DashManifestStaleException
    at androidx.media3.exoplayer.dash.DashMediaSource.onManifestLoadCompleted(DashMediaSource.java:683)

or IndexOutOfBoundsException:

java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
    at java.util.ArrayList.get(ArrayList.java:437)
    at androidx.media3.exoplayer.dash.manifest.DashManifest.getPeriod(DashManifest.java:128)

I already use MediaItem.DrmConfiguration.Builder.setMultiSession(true) as specified by the documentation. https://developer.android.com/media/media3/exoplayer/drm#key-rotation

@tonihei Do you already know whether this is an issue on the player's side or something we can fix ourselves? Is there a workaround?

jobarros commented 1 month ago

I looked into the player code and saw the conditions under which the exception is thrown. https://github.com/androidx/media/blob/c35a9d62baec57118ea898e271ac66819399649b/libraries/exoplayer_dash/src/main/java/androidx/media3/exoplayer/dash/DashMediaSource.java#L655

The issue for us was resolved by adding incremental start times to each new period in the manifest.
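For anyone wondering why period start times would matter here, below is a rough model of an "out of sync" check of this kind. It is only a sketch of my reading of the idea, not a copy of the Media3 code, and the real check may differ. `StaleManifestModel` and `looksOutOfSync` are hypothetical names, and the assumption is that the player counts how many old periods were dropped by comparing their start times with the first period of the new manifest.

```java
// Hypothetical model for illustration only; not Media3 code.
final class StaleManifestModel {

  // Assumption: an old period counts as "removed" if it starts before the
  // first period of the new manifest. If more old periods remain than the
  // new manifest contains, the update is treated as out of sync.
  static boolean looksOutOfSync(long[] oldStartsMs, long[] newStartsMs) {
    if (newStartsMs.length == 0) {
      throw new IllegalArgumentException("new manifest must have at least one period");
    }
    int removed = 0;
    while (removed < oldStartsMs.length && oldStartsMs[removed] < newStartsMs[0]) {
      removed++;
    }
    return oldStartsMs.length - removed > newStartsMs.length;
  }

  public static void main(String[] args) {
    // Every period reports the same start: a dropped period cannot be detected.
    System.out.println(looksOutOfSync(new long[] {0, 0, 0}, new long[] {0, 0}));
    // Incremental starts: the dropped first period is recognised as removed.
    System.out.println(
        looksOutOfSync(new long[] {0, 10_000, 20_000}, new long[] {10_000, 20_000}));
  }
}
```

Under this model, a live manifest that drops its oldest period while every period reports the same start looks like it lost a period, whereas incremental starts let the drop be recognised as a normal sliding window.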

JonWatson commented 1 month ago

I've been looking at a HAR file from an occurrence. I'm hoping a Media3 dev can help us determine what's going on here.

Last period in manifest before the problem is: <Period start="PT1H6M20.9113500S" id="ad-16-x-1-1">

This suggests the ad break has started and is playing successfully in the player.

Last two periods in the next manifest are:
<Period start="PT1H6M20.9113500S" id="ad-16-x-1-1">
<Period start="PT1H8M5.9829833S" id="src-17-src-0-2592">

I believe this is the point at which ExoPlayer starts reporting a "buffering" state, and I can see the ..item_init.m4i URLs being loaded, which suggests the player is trying to recover from some problem.

The period id suggests this next period is the game (no longer playing ads). However, playback stalls here and we only see the last frame of the ad slate.

Last five periods in the next manifest are:
<Period start="PT1H6M20.9113500S" id="ad-16-1-1"> (originally reported Ad period)
<Period start="PT1H6M50.9203555S" id="ad-16-2-1"> (didn't see this Ad period in the previous manifest)
<Period start="PT1H7M20.9293610S" id="ad-16-3-1"> (didn't see this Ad period in the previous manifest)
<Period start="PT1H7M36.0067776S" id="ad-16-4-1"> (didn't see this Ad period in the previous manifest)
<Period start="PT1H8M5.9829833S" id="src-17-src-0-2592">

ExoPlayer is having problems and keeps trying to "recover" from some manifest sync issue. It finally gives up with DashManifestStaleException. The three ad periods (ad-16-2-1, ad-16-3-1, ad-16-4-1) appear in the last manifest, before the source period, even though they were not included in the previous manifest.

tonihei commented 1 month ago

Thanks for the analysis done here already! Having access to your test app is actually not that useful I'm afraid because we can't see the actual MPD updates happening to understand how they are handled. Even if they are DRM-protected, I think it would work to share the live stream itself as we don't actually need to decode the video to see the MPD update logic.

However, your last post may already hold the answer. If I understand the DASH interop guidelines correctly, the update between the second and third manifests is not valid. The guidelines say about live MPD updates that updates in the MPD only extend the timeline. This means that information provided in a previous version of the MPD shall not be invalidated in an updated MPD (section 4.4.3.3 of the IOP guidelines v4.3).

The second manifest has the period "ad-16-x-1-1" with an implicit duration of about 1 minute 45 seconds (until the next declared Period). The third manifest update then inserts new Periods in the middle, changing the duration of the already defined period "ad-16-x-1-1" to about 30 seconds. I think this is the part that is not allowed according to the rule above. The IOP guidelines can be quite complex and I may be wrong here, so if you feel this is a valid update, please let me know and point me to the sections that describe this kind of operation.
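The duration change described above can be checked with a few lines of plain Java. This is a minimal sketch rather than anything from Media3: `PeriodTimelineCheck` and its methods are hypothetical names, and periods are matched by their start time because the ids quoted in this thread are not spelled identically across the two manifests. The start values are the ones posted above.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Media3.
final class PeriodTimelineCheck {

  // Maps each period start to its implicit duration (the gap to the next start).
  // The last period is open-ended in a live MPD, so it gets no entry.
  static Map<Duration, Duration> implicitDurations(String... periodStarts) {
    Map<Duration, Duration> durations = new LinkedHashMap<>();
    for (int i = 0; i + 1 < periodStarts.length; i++) {
      Duration start = Duration.parse(periodStarts[i]);
      Duration nextStart = Duration.parse(periodStarts[i + 1]);
      durations.put(start, nextStart.minus(start));
    }
    return durations;
  }

  // Returns the starts of periods whose implicit duration differs after the update.
  static List<Duration> changedPeriods(
      Map<Duration, Duration> before, Map<Duration, Duration> after) {
    List<Duration> changed = new ArrayList<>();
    for (Map.Entry<Duration, Duration> entry : before.entrySet()) {
      Duration updated = after.get(entry.getKey());
      if (updated != null && !updated.equals(entry.getValue())) {
        changed.add(entry.getKey());
      }
    }
    return changed;
  }

  public static void main(String[] args) {
    Map<Duration, Duration> second =
        implicitDurations("PT1H6M20.9113500S", "PT1H8M5.9829833S");
    Map<Duration, Duration> third =
        implicitDurations(
            "PT1H6M20.9113500S",
            "PT1H6M50.9203555S",
            "PT1H7M20.9293610S",
            "PT1H7M36.0067776S",
            "PT1H8M5.9829833S");
    for (Duration start : changedPeriods(second, third)) {
      System.out.println(
          "Period starting at " + start + " changed from "
              + second.get(start) + " to " + third.get(start));
    }
  }
}
```

With the values from this thread, the period starting at PT1H6M20.9113500S goes from roughly 105 seconds to roughly 30 seconds, which is the kind of change to an already published period that the rule quoted above appears to forbid.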

Coming back to the actually observed issues: