SRGSSR / pillarbox-apple

A next-generation reactive media playback ecosystem for Apple platforms.
https://testflight.apple.com/join/TS6ngLqf
MIT License

Extract resource metadata #815

Closed: defagos closed 1 month ago

defagos commented 1 month ago

As a user I want useful information to be displayed on screen when I play content. As a developer integrating Pillarbox I need a way to display metadata to the user, both in custom user interfaces and in the Control Center.

Acceptance criteria

Tasks

defagos commented 1 month ago

I found why metadata is not displayed in the tvOS Control Center. As written in this article, we must not use MPNowPlayingSession with an AVPlayer instance managed by AVPlayerViewController, otherwise behaviors will conflict.

Instantiating our session with an unrelated player is enough to get correct behavior on tvOS (long-pressing the home button makes the Control Center appear with an audio icon to access now playing info):

nowPlayingSession = MPNowPlayingSession(players: [AVPlayer()])
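
For context, a slightly expanded sketch of the line above, assuming an iOS 16 / tvOS 14 or later target. The helper name `makeDetachedNowPlayingSession` is hypothetical; only the `MPNowPlayingSession(players:)` call with an unrelated player comes from the observation above.

```swift
import AVFoundation
import MediaPlayer

/// Creates a now playing session that is deliberately not attached to the
/// player managed by `AVPlayerViewController`. The helper name is hypothetical.
@available(iOS 16.0, tvOS 14.0, *)
func makeDetachedNowPlayingSession() -> MPNowPlayingSession {
    // An unrelated player is used on purpose, so that the session does not
    // conflict with the behavior AVPlayerViewController sets up on its own.
    MPNowPlayingSession(players: [AVPlayer()])
}

/// Asks the system to make the session the active one and logs the outcome.
@available(iOS 16.0, tvOS 14.0, *)
func activate(_ session: MPNowPlayingSession) {
    session.becomeActiveIfPossible { isActive in
        print("Now playing session active: \(isActive)")
    }
}
```

Whether the session needs to be activated explicitly in this setup is an assumption; the observation above only covers how the session is instantiated.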

This is likely a task in itself and should be moved to another issue. The tricky part is that a presentation choice defines how the chromeless Player should behave. Open questions:

A working implementation strategy could be:

defagos commented 1 month ago

The AVMetadataItem.metadataItems(from:filteredAndSortedAccordingToPreferredLanguages:) method has the following behavior:

AVPlayerViewController does not seem to use this API. Instead, it displays the first available entry for a given identifier, regardless of language. I am not sure this is the experience we want, though, since it might lead to an inconsistent metadata display that is confusing for the user.
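
To make the contrast concrete, here is a minimal sketch of the two lookup strategies. The helper names `firstItem` and `preferredItems` are hypothetical, and `firstItem` only approximates what AVPlayerViewController appears to do based on the observation above.

```swift
import AVFoundation

/// Picks the first entry for an identifier, ignoring language entirely.
/// This approximates the behavior observed with AVPlayerViewController.
func firstItem(
    in items: [AVMetadataItem],
    for identifier: AVMetadataIdentifier
) -> AVMetadataItem? {
    AVMetadataItem.metadataItems(from: items, filteredByIdentifier: identifier).first
}

/// Narrows the entries for an identifier down by language, delegating the
/// filtering and ordering to AVFoundation.
func preferredItems(
    in items: [AVMetadataItem],
    for identifier: AVMetadataIdentifier,
    languages: [String] = Locale.preferredLanguages
) -> [AVMetadataItem] {
    let matching = AVMetadataItem.metadataItems(from: items, filteredByIdentifier: identifier)
    return AVMetadataItem.metadataItems(
        from: matching,
        filteredAndSortedAccordingToPreferredLanguages: languages
    )
}
```

Using `Locale.preferredLanguages` as the default language list is an assumption; an app could just as well pass its own list.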

defagos commented 1 month ago

According to the header documentation (not sure it is up to date):

// -----------------------------------------------------------------------------
// MPNowPlayingInfoCenter provides an interface for setting the current now 
// playing information for the application. This information will be displayed 
// wherever now playing information typically appears, such as the lock screen
// and app switcher. The now playing info dictionary contains a group of 
// metadata properties for a now playing item. The list of property constants
// is available in <MediaPlayer/MPMediaItem.h>. The properties which are 
// currently supported include:
// 
// MPMediaItemPropertyAlbumTitle
// MPMediaItemPropertyAlbumTrackCount
// MPMediaItemPropertyAlbumTrackNumber
// MPMediaItemPropertyArtist
// MPMediaItemPropertyArtwork
// MPMediaItemPropertyComposer
// MPMediaItemPropertyDiscCount
// MPMediaItemPropertyDiscNumber
// MPMediaItemPropertyGenre
// MPMediaItemPropertyPersistentID
// MPMediaItemPropertyPlaybackDuration
// MPMediaItemPropertyTitle
//
// In addition, metadata properties specific to the current playback session
// may also be specified -- see "Additional metadata properties" below.

Testing each key individually suggests that only MPMediaItemPropertyTitle and MPMediaItemPropertyArtist are meaningful for display in the Control Center.
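
Given this, a now playing info dictionary restricted to these two keys is probably enough for display purposes. A minimal sketch, where the function names are hypothetical and the choice to publish through the default info center is an assumption:

```swift
import MediaPlayer

/// Builds a now playing info dictionary limited to the two keys that
/// appeared to matter for display in the Control Center.
func nowPlayingInfo(title: String, artist: String?) -> [String: Any] {
    var info: [String: Any] = [MPMediaItemPropertyTitle: title]
    if let artist {
        info[MPMediaItemPropertyArtist] = artist
    }
    return info
}

/// Publishes the dictionary through the default info center.
func publishNowPlayingInfo(title: String, artist: String?) {
    MPNowPlayingInfoCenter.default().nowPlayingInfo = nowPlayingInfo(title: title, artist: artist)
}
```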

defagos commented 1 month ago

Title and subtitle on the tvOS system UI (and correspondingly in the Control Center) will be aligned with best practices described in the official documentation and implemented in tv+, for example.

defagos commented 1 month ago

We went full circle on that one. We iterated a lot, attempted to make our API as powerful and flexible as possible, before realizing that it would only make things more complicated for developers in the end, while not adding much value.

Stream metadata

Following #810 we had several options to extract as much metadata as possible from the stream (e.g. ICY / ID3 or chapters). We extracted:

Combining all these sources we had as much metadata from the stream as possible, but we were lacking clear rules for data extraction. Not only is extraction tricky (because we need to use asynchronous APIs to read values from an AVMetadataItem), it was also extremely difficult to extract data in a meaningful way. Some AVMetadataItems may lack an identifier or a language tag (e.g. the chapters we could read from an MP4). And even if an item has a key that could be turned into an identifier associated with some key space, we cannot just pick a key space at random.
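
To illustrate why reading is awkward, here is a minimal sketch of loading one item, assuming iOS 15 / tvOS 15 / macOS 12 or later. The `MetadataSnapshot` type and the `snapshot(of:)` function are hypothetical; note that every field may legitimately end up `nil`.

```swift
import AVFoundation

/// A plain snapshot of what could be read from one metadata item.
/// The type and its field names are hypothetical.
struct MetadataSnapshot {
    let identifier: AVMetadataIdentifier?
    let languageTag: String?
    let stringValue: String?
}

/// Reads a single item. The value must be loaded asynchronously, while the
/// identifier and language tag are optional and may simply be missing.
@available(iOS 15.0, tvOS 15.0, macOS 12.0, *)
func snapshot(of item: AVMetadataItem) async -> MetadataSnapshot {
    // A failed load is reported as a missing value in this sketch.
    let value = try? await item.load(.stringValue)
    return MetadataSnapshot(
        identifier: item.identifier,
        languageTag: item.extendedLanguageTag,
        stringValue: value
    )
}
```

A caller receiving such snapshots still has to decide what an item without identifier or language tag means, which is precisely the missing rule described above.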

Because we cannot make an opinionated choice about metadata identifiers and languages, we realized we should not attempt to structure and alter data coming from the stream in any way.

Metadata consolidation

To consolidate metadata we turned metadata readily available (from the PlayerItem asset publisher) into AVMetadataItems. For reasons mentioned above this makes things more complicated, as readily available metadata must now be read asynchronously.

Moreover this raised a lot of questions about priority, since metadata items are delivered and naturally combined as arrays. How should metadata from the PlayerItem asset and the stream be combined, and in which order? How could we alter the priority to give more weight to either the asset metadata or the stream metadata? There is simply no obvious solution that we could reasonably document and explain.

Custom metadata mapping

We had introduced a metadata adapter to let developers customize the metadata mapping, but in the end the rich list of possible identifiers made our API difficult to use. In general AVPlayerViewController and MPNowPlayingSession only understand a few identifiers, making flexible mappings less relevant.

Instead we decided that, as was the case a few commits ago, mapping would be owned by the data source, which knows best which title, subtitle or image should be associated with the content. There is little to gain in making the mapping more flexible so that any data source can be mapped in an arbitrary way. There is a lot to lose, though, as explaining all these keys, and why some are meaningful and others aren't, would have been difficult.

Final solution

We therefore:

This way our implementation is simple to understand and use. There is a single metadata delivery channel with a single mapping within it, controlled by the data source, with clearly identified concepts like title, subtitle or image. Mapping to obscure AVMetadataItem identifiers or MPNowPlayingSession info is implemented by us internally, requiring no explanations to developers implementing their own data sources.
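
As an illustration of a mapping owned by the data source, here is a minimal sketch. All type and member names (`DisplayMetadata`, `DisplayMetadataProviding`, `Episode`) are hypothetical and are not the actual Pillarbox API; the image is represented as a URL only to keep the example free of UI framework dependencies.

```swift
import Foundation

/// Display metadata with a few clearly identified concepts.
/// All names in this sketch are hypothetical.
struct DisplayMetadata: Equatable {
    var title: String?
    var subtitle: String?
    var imageURL: URL?
}

/// A data source decides how its own model maps to display metadata.
protocol DisplayMetadataProviding {
    var displayMetadata: DisplayMetadata { get }
}

/// Example model owned by an app-specific data source.
struct Episode {
    let showName: String
    let episodeName: String
    let posterURL: URL?
}

extension Episode: DisplayMetadataProviding {
    // The data source knows that, for an episode, the show name is the
    // title and the episode name is the subtitle.
    var displayMetadata: DisplayMetadata {
        DisplayMetadata(title: showName, subtitle: episodeName, imageURL: posterURL)
    }
}
```

With such a shape, translating `DisplayMetadata` into AVMetadataItems or now playing info can stay an internal detail of the player.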

Further options

We could still have provided the publishers we had for stream metadata extraction. Most of them could be recovered from the archived branch associated with this issue, but in the end the issue of reading AVMetadataItem asynchronously remains, and there is no simple recipe to interpret the delivered data since, as said above, it is quite difficult to decide which identifiers, language tags or keys to pick. For this reason it is likely that no such API would really be helpful.