androidx / media

Jetpack Media3 support libraries for media use cases, including ExoPlayer, an extensible media player for Android
https://developer.android.com/media/media3
Apache License 2.0

Frame accurate metadata rendering #899

Closed cdongieux closed 6 months ago

cdongieux commented 9 months ago

Hi,

This question is related to a previous question I asked here. I want to play a TS stream that carries data announcing when an ad will start and, in place of that ad, play a replacement ad supplied by an ad server using a second ExoPlayer instance.

As input I have a TS stream with a SCTE-35 track (application/x-scte35) that signals SCTE-35 TimeSignalCommands. I added support for the segmentation descriptors (Section 10.3.3 of the spec) embedded in TimeSignalCommand. These segmentation descriptors describe ad events (ad server call, break start/end, ad start/end, etc.). All of this data is output as Metadata by SpliceInfoDecoder. When playing the TS stream, I listen for Metadata with Player.Listener.onMetadata(Metadata) and handle each TimeSignalCommand and its segmentation descriptors with something like this:

class Scte35MetadataListener(
    private val player: ExoPlayer, // the player which plays the TS stream
    private val adEventsListener: AdEventsListener // listener to trigger an ad server call, a playback of a replacement ad, etc.
) : Player.Listener {
    private val handler = Handler(player.applicationLooper)

    override fun onMetadata(
        metadata: Metadata
    ) {
        for (i in 0 until metadata.length()) {
            val entry = metadata[i]
            // Only TimeSignalCommand entries carry segmentation descriptors.
            if (entry !is TimeSignalCommand) continue
            // entry.playbackPositionUs includes INITIAL_RENDERER_POSITION_OFFSET_US,
            // so subtract it, then convert microseconds to the milliseconds
            // expected by PlayerMessage.setPosition().
            val metadataPlaybackPositionMs =
                (entry.playbackPositionUs - INITIAL_RENDERER_POSITION_OFFSET_US) / 1000
            for (descriptor in entry.descriptors) {
                handleSegmentationDescriptor(descriptor, metadataPlaybackPositionMs)
            }
        }
    }

    private fun handleSegmentationDescriptor(
        descriptor: SegmentationDescriptor,
        metadataPlaybackPositionMs: Long
    ) {
        // Read the SegmentationDescriptor to determine the segmentation type and its related properties
        when (descriptor.segmentationTypeId) {
            SegmentationDescriptor.SEGMENTATION_TYPE_AD_SERVER_CALL -> {
                // make an API call to an ad server to know what ads will be replaced, and replacement ads URIs
                ...
            }
            SegmentationDescriptor.SEGMENTATION_TYPE_PROVIDER_AD_START -> {
                schedulePlayerAction(metadataPlaybackPositionMs) {
                    // start the other ExoPlayer instance to play the replacement ad
                    adEventsListener.onAdStart(...)
                }
            }
            ...
        }
    }

    private fun schedulePlayerAction(
        playerPositionMs: Long,
        action: () -> Unit
    ) = player
        .createMessage { messageType, _ ->
            handler.post(action)
        }
        .setType(MSG_TYPE)
        .setPosition(playerPositionMs)
        .send()
}
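
The position handling in the listener above mixes two units: the metadata position is in microseconds and includes the renderer offset, while PlayerMessage.setPosition() expects milliseconds. A minimal sketch of that conversion as a pure function follows. The helper name is hypothetical, and the constant is assumed to mirror ExoPlayer's internal MediaPeriodQueue.INITIAL_RENDERER_POSITION_OFFSET_US, so check it against the Media3 version in use.

```kotlin
// Assumption: mirrors ExoPlayer's internal
// MediaPeriodQueue.INITIAL_RENDERER_POSITION_OFFSET_US; verify for your Media3 version.
const val INITIAL_RENDERER_POSITION_OFFSET_US = 1_000_000_000_000L

/**
 * Hypothetical helper: removes the initial renderer offset from a position
 * expressed in microseconds and converts the result to milliseconds.
 */
fun rendererPositionUsToPlayerPositionMs(rendererPositionUs: Long): Long =
    (rendererPositionUs - INITIAL_RENDERER_POSITION_OFFSET_US) / 1_000
```

For example, a cue reported at INITIAL_RENDERER_POSITION_OFFSET_US + 12_500_000 would map to 12500 ms. This follows the same offset assumption as the listener and may not hold for every stream layout.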

It turns out that using player.createMessage() is not accurate: there can be a significant delay (from 10 milliseconds to more than 1 second) between the player position at which I expect the action to run and the position at which it actually runs. This delay can lead to inaccurate visual synchronisation between the ad start in the TS video stream and the replacement ad start.
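
One way to quantify this delay is to record, each time a scheduled action runs, the difference between the target position and the position read at that moment. The class below is a hypothetical sketch of such a recorder in plain Kotlin; it is not part of Media3, and the caller is assumed to supply both positions in milliseconds.

```kotlin
/** Hypothetical helper that records how late each scheduled action fired. */
class PositionDriftTracker {
    private val driftsMs = mutableListOf<Long>()

    /** Records actual minus target position, both in milliseconds. */
    fun record(targetPositionMs: Long, actualPositionMs: Long) {
        driftsMs += actualPositionMs - targetPositionMs
    }

    /** Smallest recorded drift, or null if nothing was recorded. */
    fun minDriftMs(): Long? = driftsMs.minOrNull()

    /** Largest recorded drift, or null if nothing was recorded. */
    fun maxDriftMs(): Long? = driftsMs.maxOrNull()
}
```

Calling record() both inside the message callback and inside the action posted to the handler could help show which hop contributes most of the drift.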

So my question is: do you have an idea of a better way to achieve frame accuracy, or at least a better accuracy? Would a custom renderer be a solution?

tonihei commented 9 months ago

It turns out that using player.createMessage() is not accurate

I'm sure the message itself is perfectly accurate, but I assume the delay comes from other places in the setup.

My understanding is that you either pause the original playback (or even keep it running?) at the playback position of the ad insertion point and then send a message to another player to start playing the ad.

Potential issues:

To support your use case properly, I'd recommend integrating with AdsLoader:

cdongieux commented 9 months ago

Thank you for your answer.

To be clearer, here are a few answers to your questions:

I'll take a look at AdsLoader and let you know. Thanks.

cdongieux commented 8 months ago

OK, I'm doing some experiments with your advice about using a custom AdsLoader. Is there a way for the original content to not stop playing while an ad is playing? The original content is a live broadcast TS stream with no timeshift buffer, so I'm not able to seek it.

tonihei commented 8 months ago

Is there a way for the original content to not stop playing while an ad is playing?

Not really, unless you do it in the way you are already doing it. ExoPlayer's way of integrating with ad playback is to avoid this problem by not loading media twice or holding onto multiple decoders in parallel.

I'm afraid this means my AdsLoader suggestion won't work the way I thought it would...

tonihei commented 6 months ago

I'm closing this issue because it seems there is no follow-up discussion needed. If this investigation resulted in a new feature request, please file a new one focused on the concrete issue.