w3c / audiobooks

Audiobook profile of a Web Publication
https://w3c.github.io/audiobooks/

duration property when using t= fragment selectors #110

Open Addeventure opened 2 years ago

Addeventure commented 2 years ago

Hi!

It is a bit unclear how to treat the "duration" property on readingOrder level in the manifest when media fragment selectors are used.

Let's define the resource audio.mp3 as an audio file that is 20 seconds long. According to the spec, the duration property is the duration of the resource.

But then consider this partial manifest:

{
    "duration": "PT10S", // This is the overall length of the audiobook when played from start to end.
    "readingOrder": [
        {
            "url": "audio.mp3#t=10,20",
            "duration": "PT20S"  // The resource is 20 s long, but we will only play the interval from 10 s to 20 s.
        }
    ]
}

The above JSON fails the spec according to https://www.w3.org/TR/audiobooks/#audio-manifest-processing, since the durations do not match.

So should "duration" always be the duration of the resources regardless of media fragment selectors? If so, it gives a false indication of the actual duration when the audiobook is played. In the example above, the global "duration" in the manifest would be set to 20 seconds, but it would actually be 10 seconds when playing the audiobook.
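The mismatch can be reproduced with a minimal sketch of the check as this comment describes it: the global duration is compared with the sum of the durations in readingOrder. The helper names `duration_seconds` and `durations_match` are hypothetical, and the parser is an assumption that only handles simple `PT#H#M#S` values, not the full ISO 8601 duration syntax.

```python
import re

# Simplified ISO 8601 duration pattern (assumption: only PT#H#M#S forms occur).
_DURATION = re.compile(
    r"^PT(?:(\d+(?:\.\d+)?)H)?(?:(\d+(?:\.\d+)?)M)?(?:(\d+(?:\.\d+)?)S)?$"
)


def duration_seconds(value):
    """Convert a simple 'PT...' duration string to seconds."""
    match = _DURATION.match(value)
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported duration: {value!r}")
    hours, minutes, seconds = (float(g) if g else 0.0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def durations_match(manifest):
    """True if the global duration equals the sum over readingOrder."""
    total = sum(
        duration_seconds(item["duration"]) for item in manifest["readingOrder"]
    )
    return total == duration_seconds(manifest["duration"])


manifest = {
    "duration": "PT10S",
    "readingOrder": [{"url": "audio.mp3#t=10,20", "duration": "PT20S"}],
}
print(durations_match(manifest))  # False: 20 s of resources vs. a declared 10 s
```

Setting the item's duration to "PT10S" makes the check pass, but the value then no longer describes the full resource.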

iherman commented 2 years ago

This issue was discussed in a meeting.

wareid commented 2 years ago

So I've unfortunately found the reason why this was allowed in the first place, and can't make the resolved change without more discussion.

We missed the section in the spec that states:

An audio resource can be referenced in its entirety via a URL, or for content where multiple chapters occupy a single file by using media fragments to locate the exact starting and end points.

I don't see this as a complete blocker to the proposed solution, but there are a few things we need to consider:

  1. If we remove fragment identifiers from url in the reading order, the only alternative that achieves the same goal (identifiable navigation in the audiobook) is to make the TOC mandatory, and urge content creators and implementors to use it as the source of truth for the navigation of the audiobook.
  2. What are the implications of making the TOC mandatory?
  3. Is there a workaround where we keep fragment identifiers in the url, but be clearer about the use cases for it?

llemeurfr commented 2 years ago

where multiple chapters occupy a single file ...

-> this is about ToC, not reading order then.

If the ToC is missing, the reading order is a fallback ToC, with the labels on the items of the reading order being used as ToC labels. In this case, we can add a note stating that the publisher must take care that semantic "chapters" correspond to physical tracks.

iherman commented 2 years ago

Actually, isn't it possible to build an audiobook that is made up of fragments of different files? Something like:

{
    "readingOrder": [
        ...
        {
            "url": "audio_1.mp3#t=10,20",
            "duration": "PT10S"
        },
        {
            "url": "audio_2.mp3",
            "duration": "PT10S"
        },
        {
            "url": "audio_1.mp3#t=40,50",
            "duration": "PT10S"
        }
       ...
    ]
}

An (albeit artificial) example might be a book containing an analysis of a musical piece, where the narrator's story line (audio_2) alternates with extracts of music (in audio_1). It then makes perfect sense to build up the reading order with references using media fragments.

I guess this questions the wisdom of the resolution we took: we should not remove the possibility of using media fragments in the reading order in the first place, should we?

I am afraid this leads us to the following requirement, answering the original issue: the value of duration MUST be equal to the duration of the audio as expressed in the media fragment, and the audiobook player MUST signal an error if that is not the case. Also, as far as I know, duration is not required on a linked resource, so we could add a statement whereby, if the URL expresses the duration, the value derived from the URL takes precedence (and replaces any stated duration). Note that this change would then have to be made in the Publication Manifest spec, as opposed to the Audiobooks one.
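A rough sketch of what that rule could look like in a player follows. The function names `fragment_duration` and `effective_duration` are hypothetical, and the parser assumes the temporal fragment uses plain seconds (`t=10,20`), not the `hh:mm:ss` clock forms that Media Fragments also allows.

```python
from urllib.parse import urlsplit


def fragment_duration(url):
    """Return end - start in seconds for a '#t=start,end' fragment, else None."""
    fragment = urlsplit(url).fragment
    for part in fragment.split("&"):
        name, _, value = part.partition("=")
        if name != "t":
            continue
        if value.startswith("npt:"):
            value = value[len("npt:"):]
        start, sep, end = value.partition(",")
        if not sep or not end:
            return None  # open-ended range: length depends on the file itself
        return float(end) - float(start or 0)
    return None


def effective_duration(url, declared_seconds=None):
    """Prefer the duration derived from the URL; reject a conflicting value."""
    derived = fragment_duration(url)
    if derived is None:
        return declared_seconds
    if declared_seconds is not None and declared_seconds != derived:
        raise ValueError(
            f"{url}: declared {declared_seconds} s, fragment gives {derived} s"
        )
    return derived


print(effective_duration("audio_1.mp3#t=10,20", 10.0))  # 10.0
print(effective_duration("audio_2.mp3", 10.0))          # 10.0
```

With this rule the manifest from the original report (a 10 s fragment with a declared 20 s duration) would raise an error instead of being silently accepted.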

Addeventure commented 2 years ago

Those are valid points. However, it seems to me that the readingOrder has multiple purposes which interfere with each other. It tries to be:

In https://www.w3.org/TR/pub-manifest/#default-reading-order it says: "Resources SHOULD NOT be listed more than once in the reading order, as this can lead to unexpected results in user agents (e.g., links to the resource might not resolve to the right instance in the reading order)."

In https://www.w3.org/TR/audiobooks/#audio-readingorder in the first NOTE section it says: "It is important to note that a resource cannot be referenced more than once in the reading order. In the case where an audio file represents the content of multiple chapters or sections of the book, the table of contents can be used to specify the starting and ending points of those chapters in the larger audio file, as demonstrated in this example."

So for example the use-case with the manifest:

 {
    "readingOrder": [
        ...
        {
            "url": "audio_1.mp3#t=10,20",
            "duration": "PT10S"
        },
        {
            "url": "audio_2.mp3",
            "duration": "PT10S"
        },
        {
            "url": "audio_1.mp3#t=40,50",
            "duration": "PT10S"
        }
       ...
    ]
}

an implementation of the spec will emit errors.
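As an illustration, a uniqueness check along these lines would flag the manifest. This is only a sketch: the helper name `repeated_resources` is hypothetical, and it assumes that two entries count as the same resource when their URLs match after the fragment is stripped.

```python
from collections import Counter
from urllib.parse import urldefrag


def repeated_resources(reading_order):
    """Return resource URLs (fragment removed) listed more than once."""
    counts = Counter(urldefrag(item["url"]).url for item in reading_order)
    return [url for url, count in counts.items() if count > 1]


reading_order = [
    {"url": "audio_1.mp3#t=10,20", "duration": "PT10S"},
    {"url": "audio_2.mp3", "duration": "PT10S"},
    {"url": "audio_1.mp3#t=40,50", "duration": "PT10S"},
]
print(repeated_resources(reading_order))  # ['audio_1.mp3']
```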

If we want to keep the fragment selector in these urls, I think we need to relax the requirement of uniqueness among resources in readingOrder. Otherwise it will quickly become too restrictive for audiobook creators and require them to process/slice the audio files anyway (in which case they don't need fragment selectors at all).

Addeventure commented 2 years ago

Removing uniqueness requirements adds other complexities though. For example, the spec needs to define how an implementation should behave if fragment selectors intersect.

Consider the following manifest:

 {
    "readingOrder": [
        ...
        {
            "url": "audio_1.mp3#t=10,20",
            "duration": "PT10S"
        },
        {
            "url": "audio_2.mp3",
            "duration": "PT10S"
        },
        {
            "url": "audio_1.mp3#t=15,25",
            "duration": "PT10S"
        }
       ...
    ]
}

If a ToC item points to audio_1.mp3#t=15, we don't know whether the intention is to go to the first or the third readingOrder item.
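The ambiguity can be made concrete with a small sketch that lists every readingOrder entry whose interval contains the ToC target's start time. The names `split_interval` and `candidates` are hypothetical; the sketch assumes plain-second `t=` fragments and treats intervals as half-open (start included, end excluded), which the spec does not define.

```python
import math
from urllib.parse import urldefrag


def split_interval(url):
    """Split a URL into (resource, start, end); end is inf when unspecified."""
    resource, fragment = urldefrag(url)
    start, end = 0.0, math.inf
    if fragment.startswith("t="):
        first, _, second = fragment[2:].partition(",")
        start = float(first) if first else 0.0
        end = float(second) if second else math.inf
    return resource, start, end


def candidates(reading_order, toc_href):
    """Indexes of readingOrder entries that could be the target of toc_href."""
    resource, start, _ = split_interval(toc_href)
    hits = []
    for index, item in enumerate(reading_order):
        item_resource, item_start, item_end = split_interval(item["url"])
        if item_resource == resource and item_start <= start < item_end:
            hits.append(index)
    return hits


reading_order = [
    {"url": "audio_1.mp3#t=10,20"},
    {"url": "audio_2.mp3"},
    {"url": "audio_1.mp3#t=15,25"},
]
print(candidates(reading_order, "audio_1.mp3#t=15"))  # [0, 2]
```

More than one candidate means the player has no principled way to choose, which is the problem overlapping fragments introduce.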

iherman commented 2 years ago

Sigh... You are right @Addeventure, I stand corrected…

In a way, what this means is that readingOrder may be a misnomer: it is more of a fetchingOrder, instructing the reading system about the order in which the resources must be fetched/streamed. My original example would then be

{
    "readingOrder": [
        ...
        {
            "url": "audio_1.mp3",
            "duration": "PT10S"
        },
        {
            "url": "audio_2.mp3",
            "duration": "PT10S"
        }
       ...
    ]
}

and the TOC would contain pointers to:

            <li><a href="audio_1.mp3#t=10,20">…</a></li>
            <li><a href="audio_2.mp3">…</a></li>
            <li><a href="audio_1.mp3#t=40,50">…</a></li>

Which indeed leads us back to the resolutions of yesterday.
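Under that arrangement a player could resolve a TOC link to a position in the fragment-free reading order roughly as follows. This is a sketch under assumptions: the helper name `resolve` is hypothetical, TOC hrefs are taken to use plain-second `t=` fragments, and each resource is assumed to appear exactly once in readingOrder.

```python
from urllib.parse import urldefrag


def resolve(reading_order, toc_href):
    """Map a TOC href to (readingOrder index, start offset in seconds)."""
    resource, fragment = urldefrag(toc_href)
    urls = [item["url"] for item in reading_order]
    index = urls.index(resource)  # raises ValueError if the resource is unknown
    start = 0.0
    if fragment.startswith("t="):
        first = fragment[2:].partition(",")[0]
        start = float(first) if first else 0.0
    return index, start


reading_order = [
    {"url": "audio_1.mp3", "duration": "PT10S"},
    {"url": "audio_2.mp3", "duration": "PT10S"},
]
print(resolve(reading_order, "audio_1.mp3#t=10,20"))  # (0, 10.0)
print(resolve(reading_order, "audio_2.mp3"))          # (1, 0.0)
```

Because every resource is listed once, each TOC link resolves to exactly one entry, so the ambiguity raised earlier in the thread does not arise.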

This does not necessarily mean that the TOC is mandatory (although I am personally in favor of doing that); it just says that there are some use cases that cannot be handled without a TOC. In any case, wherever we end up with this, some notes making this clear should also be added to the (Publication Manifest) spec.

larscwallin commented 2 years ago

I really think we need to be very careful about ambiguity at this stage in the specification rollout. It is better to be clear and concise, and to make sure that the spec is easy to use for producers and audiobook app implementors, than to be too general. I believe that it would be beneficial to make the TOC mandatory for clarity. And, more importantly, it is the right thing to do from an accessibility standpoint.