androidx / media

Jetpack Media3 support libraries for media use cases, including ExoPlayer, an extensible media player for Android
https://developer.android.com/media/media3
Apache License 2.0

Repeated characters in CEA 608 captions #887

Open strangesource opened 8 months ago

strangesource commented 8 months ago

Version

Media3 1.2.0

More version details

Happens on earlier versions as well.

Devices that reproduce the issue

Not dependent on any device. Reproducible e.g. on Pixel 6a or Pixel 6 emulator with Android 12/13/14

Devices that do not reproduce the issue

/

Reproducible in the demo app?

Yes

Reproduction steps

  1. Load stream (will share via e-mail)
  2. Enable en subtitle
  3. Observe subtitles

Expected result

Subtitles displayed normally, e.g. Pork chops!

Actual result

This is a potential duplicate of https://github.com/google/ExoPlayer/issues/10209#issue-1210059727. If you think I should rather add these findings to the existing issue in the ExoPlayer repository please let me know.

The asset is a DASH stream with the following Accessibility tag:

<Accessibility schemeIdUri="urn:scte:dash:cc:cea-608:2015" value="CC1=eng;CC3=spa" />

While the tag advertises two CC subtitle tracks, the CC3 (Spanish) track is not actually present. This is obviously not ideal, but the resulting behaviour in ExoPlayer is quite unexpected:

Cues repeat their characters two at a time: PoPorkrk c chohopsps!! instead of Pork chops!
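To make the shape of the advertised value concrete, here is a minimal sketch of how a value such as "CC1=eng;CC3=spa" splits into channel/language pairs. The class and method names are hypothetical and this is not the Media3 parsing code; it only illustrates why the player ends up expecting two embedded tracks.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not Media3 code): parses a CEA-608 Accessibility value. */
final class Cea608AccessibilityValueSketch {
  // Matches one entry such as "CC1=eng"; CEA-608 channel numbers are 1 to 4.
  private static final Pattern ENTRY = Pattern.compile("CC([1-4])=(.+)");

  /** Returns channel number -> language code, in declaration order. */
  static Map<Integer, String> parse(String value) {
    Map<Integer, String> channels = new LinkedHashMap<>();
    for (String entry : value.split(";")) {
      Matcher matcher = ENTRY.matcher(entry.trim());
      if (matcher.matches()) {
        channels.put(Integer.parseInt(matcher.group(1)), matcher.group(2));
      }
    }
    return channels;
  }

  public static void main(String[] args) {
    // Two channels are advertised, even though only CC1 carries data in this stream.
    System.out.println(parse("CC1=eng;CC3=spa"));
  }
}
```

With the value from this manifest the map has two entries (channel 1 and channel 3), so two text tracks are announced regardless of what the media actually contains.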

In addition to this, the following error is reported in logcat without crashing:

  java.lang.IllegalStateException: Different languages combined in one TrackGroup: 'en' (track 0) and 'es' (track 1)
  at androidx.media3.common.TrackGroup.logErrorMessage(TrackGroup.java:234)
  at androidx.media3.common.TrackGroup.verifyCorrectness(TrackGroup.java:201)
  at androidx.media3.common.TrackGroup.<init>(TrackGroup.java:94)
  at androidx.media3.exoplayer.dash.DashMediaPeriod.buildPrimaryAndEmbeddedTrackGroupInfos(DashMediaPeriod.java:716)
  at androidx.media3.exoplayer.dash.DashMediaPeriod.buildTrackGroups(DashMediaPeriod.java:523)
  at androidx.media3.exoplayer.dash.DashMediaPeriod.<init>(DashMediaPeriod.java:158)
  at androidx.media3.exoplayer.dash.DashMediaSource.createPeriod(DashMediaSource.java:544)
  at androidx.media3.exoplayer.source.MaskingMediaPeriod.createPeriod(MaskingMediaPeriod.java:130)
  at androidx.media3.exoplayer.source.MaskingMediaSource.onChildSourceInfoRefreshed(MaskingMediaSource.java:196)
  at androidx.media3.exoplayer.source.WrappingMediaSource.onChildSourceInfoRefreshed(WrappingMediaSource.java:132)
  at androidx.media3.exoplayer.source.WrappingMediaSource.onChildSourceInfoRefreshed(WrappingMediaSource.java:47)
  at androidx.media3.exoplayer.source.CompositeMediaSource.lambda$prepareChildSource$0$androidx-media3-exoplayer-source-CompositeMediaSource(CompositeMediaSource.java:117)
  at androidx.media3.exoplayer.source.CompositeMediaSource$$ExternalSyntheticLambda0.onSourceInfoRefreshed(Unknown Source:4)
  at androidx.media3.exoplayer.source.BaseMediaSource.refreshSourceInfo(BaseMediaSource.java:90)
  at androidx.media3.exoplayer.dash.DashMediaSource.processManifest(DashMediaSource.java:905)
  at androidx.media3.exoplayer.dash.DashMediaSource.onManifestLoadCompleted(DashMediaSource.java:683)
  at androidx.media3.exoplayer.dash.DashMediaSource$ManifestCallback.onLoadCompleted(DashMediaSource.java:1362)
  at androidx.media3.exoplayer.dash.DashMediaSource$ManifestCallback.onLoadCompleted(DashMediaSource.java:1357)
  at androidx.media3.exoplayer.upstream.Loader$LoadTask.handleMessage(Loader.java:480)
  at android.os.Handler.dispatchMessage(Handler.java:106)
  at android.os.Looper.loopOnce(Looper.java:201)
  at android.os.Looper.loop(Looper.java:288)
  at android.os.HandlerThread.run(HandlerThread.java:67)

I don't think this is the root cause of the problem, though. It looks more like a general issue in how multiple embedded CEA 608 tracks are represented. See https://github.com/androidx/media/commit/76700e9d84b58df11b0dd4ff749c6b5597ea34cc for reference.
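For context, the logged message is consistent with a consistency check of roughly the following shape. This is only a sketch with hypothetical names; the real TrackGroup.verifyCorrectness may normalise languages or check other fields, and it logs the error rather than returning it.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration (not the Media3 implementation) of a per-group language check. */
final class TrackGroupLanguageCheckSketch {
  /** Returns a message for the first track whose language differs from track 0, or null. */
  static String findLanguageMismatch(List<String> languages) {
    for (int i = 1; i < languages.size(); i++) {
      if (!languages.get(0).equals(languages.get(i))) {
        return "Different languages combined in one TrackGroup: '"
            + languages.get(0) + "' (track 0) and '"
            + languages.get(i) + "' (track " + i + ")";
      }
    }
    return null; // All tracks share the same language.
  }

  public static void main(String[] args) {
    // Both embedded CEA-608 formats placed in one group, as in the stack trace above.
    System.out.println(findLanguageMismatch(Arrays.asList("en", "es")));
  }
}
```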

When looking into this topic in a bit more detail, it seems like the Cea608Decoder.decode function is called twice with the same inputBuffer. As each call decodes two characters (ccData1 and ccData2), the result is the repeating pattern of two characters. If the Accessibility tag only contains a single CC track, it is only called once.
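The following sketch simulates that hypothesis outside the player. It is not Cea608Decoder code: the class and method names are made up, and it simply assumes that every buffer carries two characters and is decoded a given number of times.

```java
/** Hypothetical simulation (not Media3 code) of decoding each two-character buffer N times. */
final class DoubleDecodeSketch {
  /** Emits the text two characters at a time, repeating each pair decodesPerBuffer times. */
  static String render(String text, int decodesPerBuffer) {
    StringBuilder out = new StringBuilder();
    for (int i = 0; i < text.length(); i += 2) {
      // One "buffer" holds up to two characters (ccData1 and ccData2).
      String pair = text.substring(i, Math.min(i + 2, text.length()));
      for (int n = 0; n < decodesPerBuffer; n++) {
        out.append(pair);
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(render("Pork chops!", 1)); // Pork chops!
    System.out.println(render("Pork chops!", 2)); // PoPorkrk c chohopsps!!
  }
}
```

Decoding each buffer twice reproduces exactly the garbled cue from the report, which supports the idea that the same input reaches the decoder twice rather than the caption data itself being corrupt.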

Another observation is that the TextRenderer that holds the Cea608Decoder has the wrong Format (Spanish when English is selected) in its formatHolder property. IMO this has to do with how ChunkSampleStream.selectEmbeddedTrack selects the track solely based on trackType, which is obviously TRACK_TYPE_TEXT for both embeddedTrackFormats as they are both subtitle tracks.

The first of the embeddedSampleQueues it encounters is, by coincidence, the Spanish one. It is worth mentioning that forcing the correct embeddedSampleQueues entry to be used does not fix the problem, probably because both queues contain the same content: they are filled with the same samples in CeaUtil.consume, called from the FragmentedMp4Extractor.
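The selection behaviour described above can be sketched as follows. Everything here is hypothetical (the EmbeddedTrack class, both methods, and the ordering of the tracks, which is chosen to mirror the observation that the Spanish queue comes first); it is not the ChunkSampleStream implementation.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration (not Media3 code) of selecting an embedded track by type only. */
final class EmbeddedTrackSelectionSketch {
  static final int TRACK_TYPE_TEXT = 3; // Illustrative constant for this sketch.

  static final class EmbeddedTrack {
    final int trackType;
    final String language;

    EmbeddedTrack(int trackType, String language) {
      this.trackType = trackType;
      this.language = language;
    }
  }

  /** Returns the index of the first track with the given type, or -1. */
  static int selectByType(List<EmbeddedTrack> tracks, int trackType) {
    for (int i = 0; i < tracks.size(); i++) {
      if (tracks.get(i).trackType == trackType) {
        return i;
      }
    }
    return -1;
  }

  /** Returns the index of the first track matching both type and language, or -1. */
  static int selectByTypeAndLanguage(List<EmbeddedTrack> tracks, int trackType, String language) {
    for (int i = 0; i < tracks.size(); i++) {
      EmbeddedTrack track = tracks.get(i);
      if (track.trackType == trackType && track.language.equals(language)) {
        return i;
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    List<EmbeddedTrack> tracks =
        Arrays.asList(
            new EmbeddedTrack(TRACK_TYPE_TEXT, "es"), new EmbeddedTrack(TRACK_TYPE_TEXT, "en"));
    System.out.println(selectByType(tracks, TRACK_TYPE_TEXT)); // Index of the Spanish track.
    System.out.println(selectByTypeAndLanguage(tracks, TRACK_TYPE_TEXT, "en"));
  }
}
```

As noted above, picking the matching queue by hand did not remove the duplication in practice, so this mismatch looks like a symptom rather than the cause.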

Media

Stream was shared via e-mail.

Bug Report

icbaker commented 8 months ago

Thanks for the stream - is it possible the content it's currently playing is different? In the DASH manifest being served right now I see this tag 4 times, but no tags with Spanish or Spanish+English:

<Accessibility schemeIdUri="urn:scte:dash:cc:cea-608:2015" value="CC1=eng" />

That said, the subtitles also don't seem to be functioning correctly at the moment - they appear and almost immediately disappear again, making them very hard to read (to the extent that I can't really tell if they also have the duplication problem you describe). I tried the same stream in the dash.js player and it has a different problem: a subtitle appears, and then stays on the screen for ages (1-2 minutes), even while lots of dialogue continues. Eventually it's replaced by another subtitle (which appears at the right time), which then stays for another 1-2 minutes. So I conclude from this that the subtitles in the stream at the moment are likely invalid in some way, and ExoPlayer and dash.js just handle that invalidity in different ways.


Given that, I'm afraid it's difficult to really progress the investigation further - since it looks like the subtitles in this stream are not playable correctly by other players either.

strangesource commented 8 months ago

That's unfortunate. Let me share a stream with you that I captured when I was able to reproduce this. The manifest is a bit odd as I had a go at converting it from a dynamic to a static manifest for testing purposes but it reproduces the issue.

strangesource commented 8 months ago

Btw, I also tried the stream with the duplication issue in dash.js as well as the Bitmovin web player and both play it fine. Unfortunately, they don't play my Frankenstein stream. 😬

strangesource commented 8 months ago

Another observation: this seems to be reproducible with ANY DASH manifest with 608 captions that falsely advertises an additional track. E.g. with https://dash.akamaized.net/dash264/TestCases/4c/1/dash.mpd, if you replace

<Accessibility schemeIdUri="urn:scte:dash:cc:cea-608:2015"/>

with

<Accessibility schemeIdUri="urn:scte:dash:cc:cea-608:2015" value="CC1=eng;CC3=spa" />

a similar behaviour can be observed.
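For anyone who wants to reproduce this with a locally saved copy of a manifest, the substitution can be sketched as a plain string replacement. The class and method names are hypothetical, and the sketch assumes the original tag appears in the manifest text exactly as quoted above.

```java
/** Hypothetical helper: rewrites a manifest so that it advertises an extra CEA-608 track. */
final class ManifestRewriteSketch {
  static final String ORIGINAL_TAG =
      "<Accessibility schemeIdUri=\"urn:scte:dash:cc:cea-608:2015\"/>";
  static final String REWRITTEN_TAG =
      "<Accessibility schemeIdUri=\"urn:scte:dash:cc:cea-608:2015\" value=\"CC1=eng;CC3=spa\" />";

  /** Returns the manifest text with every occurrence of the original tag replaced. */
  static String advertiseExtraTrack(String manifestXml) {
    return manifestXml.replace(ORIGINAL_TAG, REWRITTEN_TAG);
  }

  public static void main(String[] args) {
    String manifest = "<AdaptationSet>" + ORIGINAL_TAG + "</AdaptationSet>";
    System.out.println(advertiseExtraTrack(manifest));
  }
}
```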

icbaker commented 8 months ago

That said, the subtitles also don't seem to be functioning correctly at the moment - they appear and almost immediately disappear again, making them very hard to read (to the extent that I can't really tell if they also have the duplication problem you describe).

Upon further testing, I'm only seeing this on the main branch and not with version 1.2.0, so this looks like a separate regression in our DASH CEA-608 handling. It doesn't occur on our HLS sample with CEA-608 subtitles (Apple 16x9 basic stream (TS)). I'll look into this separately.


I found two places in our code where we seem to handle nodes like <Accessibility schemeIdUri="urn:scte:dash:cc:cea-608:2015" value="CC1=eng;CC3=spa" />.

Here in DashMediaPeriod.getClosedCaptionTrackFormats(...): https://github.com/androidx/media/blob/f6fe90f30ba022c7e04e14a3dc5c28c568a4d1e2/libraries/exoplayer_dash/src/main/java/androidx/media3/exoplayer/dash/DashMediaPeriod.java#L886-L893

Which calls through to parseClosedCaptionDescriptor(...): https://github.com/androidx/media/blob/f6fe90f30ba022c7e04e14a3dc5c28c568a4d1e2/libraries/exoplayer_dash/src/main/java/androidx/media3/exoplayer/dash/DashMediaPeriod.java#L908-L932

And also, separately, in DashManifestParser.parseCea608AccessibilityChannel(...): https://github.com/androidx/media/blob/f6fe90f30ba022c7e04e14a3dc5c28c568a4d1e2/libraries/exoplayer_dash/src/main/java/androidx/media3/exoplayer/dash/manifest/DashManifestParser.java#L1833-L1847

The DashManifestParser method doesn't seem to be used when playing the manifest you provided. I replaced its contents with just throw new RuntimeException() and playback still proceeded.

I was able to reproduce the garbled output you describe by hacking the code in DashMediaPeriod.parseClosedCaptionDescriptor(...) to rewrite "CC1=eng" to "CC1=eng;CC3=spa":

if (value.equals("CC1=eng")) {
  value = "CC1=eng;CC3=spa";
}

I also tried the same hack on 1.1.1 and saw the same duplication problem, i.e. this isn't a recent regression (mainly checking this because we have made some major subtitle infrastructure changes between 1.1.1 and 1.2.0).

I can't immediately see a difference in the way that Cea608Decoder/Parser is being instantiated in the happy vs broken case (accessibilityChannel=1 in both cases). I'll continue investigating, but currently I think the difference is going to be in what data is being fed into the decoder.

strangesource commented 8 months ago

Thanks for looking into this, this aligns with what I observed.

I can't immediately see a difference in the way that Cea608Decoder/Parser is being instantiated in the happy vs broken case (accessibilityChannel=1 in both cases). I'll continue investigating, but currently I think the difference is going to be in what data is being fed into the decoder.

My observation is that the decode function of the Cea608Decoder is called twice with the same data in the broken case. TextRenderer.nextInputBuffer is null in the non-broken case after receiving two characters, but in the broken case it is still non-null, resulting in another decode call. I have not yet been able to work out why this is the case, though.

icbaker commented 8 months ago

I agree that merging the English and Spanish tracks into a single track group seems wrong. I didn't really know where to start, so I thought I'd see if I could resolve that on the off chance it fixed the duplication, and it seems to have done so... I need to spend a bit more time on this, partly to understand why that works - and will also try to add some automated tests.

Thanks for the clue - that was a useful place to start digging :)

icbaker commented 8 months ago

That said, the subtitles also don't seem to be functioning correctly at the moment - they appear and almost immediately disappear again, making them very hard to read (to the extent that I can't really tell if they also have the duplication problem you describe).

Upon further testing, I'm only seeing this on the main branch and not with version 1.2.0, so this looks like a separate regression in our DASH CEA-608 handling. It doesn't occur on our HLS sample with CEA-608 subtitles (Apple 16x9 basic stream (TS)). I'll look into this separately.

Filed https://github.com/androidx/media/issues/904 to track this 'immediately disappearing' issue separately.

jamesdavidholding commented 8 months ago

Hi @icbaker, thanks for digging into this in December. It would be great to get this closed off if we can. Is there anything further you need from us to continue with this? If so, please let us know. Thanks, James

icbaker commented 8 months ago

I've looked back into this again, and splitting the different languages into separate TrackGroups no longer seems to resolve the issue - it's possible that I wasn't testing what I thought I was when I saw it fixing things last time. So I'm afraid I'm no closer to resolving this. I'll leave the issue open.