Define restrictions on parallel media properties

marisademeglio commented 5 years ago

We often show how a playback object may contain text and audio, e.g.:

narration: [
    {
        text: #idfrag,
        audio: file.mp3#t=1,10
    },
    ...
]

But we need to define what is required and optional here, i.e. how many text and audio properties are allowed or required.

Options:

1. require one text + one audio: Simple for user agents, covers text+audio use case, but not audio-only
2. require at least audio: Covers audio-only use case
3. require nothing: flexible but also opens the door for multiple parallel media properties of the same type (e.g. multiple audio, multiple text), which does not have defined playback behavior
4. restrict to at most one of each media property type: would allow a playback object to contain: just text OR just audio OR text + audio together. There is currently no defined playback behavior for a standalone text property with no associated timing information.
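
To make the four options easier to compare, here is a rough Python sketch of how a user agent might check a playback object against each one, numbering the options 1 to 4 in the order listed. Everything in it is an assumption for illustration only: the function name `is_allowed`, the representation of a playback object as a list of `(type, reference)` pairs (a list rather than a dict, so that repeated types can be expressed), and my reading of option 2 as "exactly one audio, optionally one text".

```python
from collections import Counter

def is_allowed(media, option):
    """Check a playback object against one of the four options.

    `media` is a list of (type, reference) pairs, e.g.
    [("text", "#idfrag"), ("audio", "file.mp3#t=1,10")].
    This representation is hypothetical, not taken from any draft.
    """
    counts = Counter(kind for kind, _ in media)
    text, audio = counts["text"], counts["audio"]
    if option == 1:
        # Exactly one text and one audio, nothing else.
        return text == 1 and audio == 1 and len(media) == 2
    if option == 2:
        # Assumed reading: exactly one audio, text optional (at most one).
        return audio == 1 and text <= 1 and len(media) == audio + text
    if option == 3:
        # No restrictions at all, so repeated types are accepted.
        return True
    if option == 4:
        # At most one property per type, and at least one property overall.
        return len(media) >= 1 and all(n <= 1 for n in counts.values())
    raise ValueError(f"unknown option: {option}")
```

With this reading, a text-only object is accepted only under options 3 and 4, and an object with two audio properties is accepted only under option 3.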

iherman commented 5 years ago

My gut feeling is that, at least in V1, we should make it simple to cover 70-80% of the use cases. Which probably rules out (3) for me.

marisademeglio commented 5 years ago

I agree. There's no use case that would require the complexity of (3). Personally, I like (4) because it's generic and also simple. It would allow having a series of text-only playback objects, which is currently not required by any of our use cases; however, I can envision an interesting scenario where there is no audio but you're manually controlling highlight progression (e.g. "next paragraph") - perhaps to help readers focus on a small portion of text at once.

marisademeglio commented 3 years ago

If we see this as an evolution of MO, properly replacing epub:textref would require (3), so that's what I described in the new draft.

marisademeglio commented 3 years ago

There are no restrictions on parallel media objects: https://w3c.github.io/sync-media-pub/sync-media.html#media-objects

Playback behavior is defined as "wait for the longest timed media object to finish", aka SMIL endsync. TBD: do we need to define what the other objects do during this time, e.g. fill=freeze? I believe @larscwallin has some opinions ;)
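
As a rough illustration of the "wait for the longest timed media object" rule, the Python sketch below computes how long a set of parallel media objects would play. It is only a sketch under stated assumptions: the helper names `clip_duration` and `parallel_duration` are hypothetical, clips are assumed to use the `#t=start,end` form in plain seconds as in the example at the top of this thread, and anything without such a clip (for instance a text reference) is treated as untimed and ignored. It says nothing about what the shorter objects do while waiting, which is the open question.

```python
import re

# Matches a trailing "#t=start,end" clip with both values in plain seconds.
_CLIP = re.compile(r"#t=(\d+(?:\.\d+)?),(\d+(?:\.\d+)?)$")

def clip_duration(ref):
    """Return the clip length in seconds, or None if the reference is untimed."""
    match = _CLIP.search(ref)
    if match is None:
        return None
    start, end = float(match.group(1)), float(match.group(2))
    return end - start

def parallel_duration(refs):
    """Duration of a parallel group: the longest timed child wins."""
    durations = [d for d in map(clip_duration, refs) if d is not None]
    # A group with no timed children has no defined duration here; 0.0 is a placeholder.
    return max(durations, default=0.0)
```

For example, `parallel_duration(["#idfrag", "file.mp3#t=1,10"])` gives 9.0, since the text reference carries no timing and the audio clip runs from second 1 to second 10.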

w3c / sync-media-pub

Define restrictions on parallel media properties #9