w3c / wpub

W3C Web Publications
https://w3c.github.io/wpub/

Packaging for audiobooks #352

Closed HadrienGardeur closed 5 years ago

HadrienGardeur commented 6 years ago

As a follow-up to our discussions at TPAC, I'd like to submit a first proposal for what could become the packaging format for audiobooks:

HadrienGardeur commented 6 years ago

@danielweck I agree that it can become difficult to follow such discussions and for audiobooks in particular there's IMO a lot of bias due to how things are deployed right now by large companies (each of them inventing their own format when delivering content to users).

IMO we should align this packaged audiobook format with the same use cases as EPUB.

Reading an EPUB in a Web app is IMO a mistake: that's not something that EPUB was designed for and you need to jump through many hoops to achieve something that works decently. That's why I would recommend that we exclude such a use case for packaged audiobooks (at least as long as we don't have Web Packaging ready), since this can be handled in a much better way by WP.

llemeurfr commented 5 years ago

@lrosenthol could you please add details from your comment https://github.com/w3c/wpub/issues/352#issuecomment-439700768 ?

What are the character issues treated in OCF and not in the ISO std? And what is this signature issue?

dauwhe commented 5 years ago

There is an existing audiobook packaging format, M4B. It supports a cover image, some metadata, track names, etc.

lrosenthol commented 5 years ago

Good catch on that, @dauwhe. More info on .m4b on Wikipedia at https://en.wikipedia.org/wiki/MPEG-4_Part_14

lrosenthol commented 5 years ago

@llemeurfr The filename issue can be found in the OCF spec at http://www.idpf.org/epub/31/spec/epub-ocf.html#sec-container-filenames where it goes into detail on a subset of valid names, which must be encoded as UTF-8.

For the DigSig issue, see section 5.2 of ISO 21320, where it discusses differences from the ZIP Appnote, including not supporting ZIP's native DigSig (but as with encryption, yes, you could use your own).
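To make the file-naming concern concrete, here is a minimal sketch of the kind of check a ZIP-based package format might impose. It covers only an illustrative subset of the OCF restrictions (byte length, a few forbidden characters, trailing full stop, case-insensitive uniqueness within one directory); the function names are hypothetical, and the OCF spec linked above remains the authoritative list.

```python
import unicodedata

# Illustrative subset of the characters OCF forbids in file names.
FORBIDDEN_CHARS = set('/"*:<>?\\\x7f')


def filename_problems(name: str) -> list[str]:
    """Return a list of human-readable problems found in one file name."""
    problems = []
    if len(name.encode("utf-8")) > 255:
        problems.append("longer than 255 bytes in UTF-8")
    if name.endswith("."):
        problems.append("ends with a full stop")
    for ch in name:
        if ch in FORBIDDEN_CHARS:
            problems.append(f"forbidden character {ch!r}")
        elif unicodedata.category(ch) in ("Cc", "Co"):
            # Cc = control characters, Co = private-use characters.
            problems.append(f"control or private-use character U+{ord(ch):04X}")
    return problems


def case_collisions(names: list[str]) -> list[tuple[str, str]]:
    """Find names (assumed to share one directory) that differ only by case."""
    seen: dict[str, str] = {}
    collisions = []
    for name in names:
        key = unicodedata.normalize("NFC", name).casefold()
        if key in seen:
            collisions.append((seen[key], name))
        else:
            seen[key] = name
    return collisions
```

For instance, `filename_problems("a:b.mp3")` reports the colon, and `case_collisions(["Cover.jpg", "cover.JPG"])` flags the pair as colliding on case-insensitive file systems.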

iherman commented 5 years ago

> There is an existing audiobook packaging format, M4B. It supports a cover image, some metadata, track names, etc.

Is there a description of that file format? All pages that I stumbled into are very superficial, and I want to know whether it is really "just" a file format that can contain anything that we define, or whether we are forced to abide by some restrictions.

Also, it worries me that the standard itself is, as often with ISO documents, behind a paywall. I do not think it would go down well when all other standards we use and refer to are available for free.

dauwhe commented 5 years ago

> Is there a description of that file format? All pages that I stumbled into are very superficial, and I want to know whether it is really "just" a file format that can contain anything that we define, or whether we are forced to abide by some restrictions.

It is indeed frustrating trying to find out more about this. But I think this is work we should do: the fact that there is an existing audiobook standard demands close examination. I was able to make an audiobook in this format with an existing program (which admittedly cost me US$5.26), and it worked perfectly in iTunes. Attempts to install command-line tools to examine the format directly have so far failed (my operating system is too old for Homebrew).

> Also, it worries me that the standard itself is, as often with ISO documents, behind a paywall. I do not think it would go down well when all other standards we use and refer to are available for free.

That's also maddening. But EPUB normatively references ISO 8601, which costs CHF 138! HTML normatively references ISO 3166, which costs a mere CHF 38.

We should talk to David Singer about this; he mentioned the use of MPEG as a packaging technology at TPAC Lyon.

lrosenthol commented 5 years ago

@dauwhe the stuff that David Singer mentioned is MPEG-4 Part 12, while the audiobook stuff is using MPEG-4 Part 14. Related, yes, but not exactly the same.

iherman commented 5 years ago

> We should talk to David Singer about this; he mentioned the use of MPEG as a packaging technology at TPAC Lyon.

Actually, I saw his name appearing on one of the documents around MPEG as editor (or something similar) so he can certainly be very helpful with this.

@TzviyaSiegman, he is an AB buddy, right?

llemeurfr commented 5 years ago

MPEG-4 Part 14 specs the .mp4 (or .m4a or .m4b for Apple + audio (+ bookmarks)) file format. We are looking for a packaging format which can contain multiple mp4 files, with WP-defined metadata ... not a stream of media objects with a few MPEG-defined metadata (or XMP metadata). -> Not the same logical level.

murata2makoto commented 5 years ago

> I would strongly recommend against using ISO 21320 for your package normative reference for three main reasons.

@lrosenthol

Although I am not sure if we should reference ISO 21320, I would like to make corrections.

21320 is nothing but PKWARE ZIP except those features requiring license fees.

> 1 - It doesn't properly address various well known file naming situations (eg. proper Unicode and platform incompatibilities) which OCF/UCF do.

Surely, it does not. Neither OCF nor UCF does. Unfortunately, it is too late to fix ZIP implementations. For more about this, see the annex of ISO 21320.

> 2 - It disallows encryption, which would not be good for those publishers requiring some form of DRM

ISO/IEC 21320 merely disallows PKWARE encryption, which requires license fees. OCF does the same thing. It is certainly possible to add encryption on top of 21320.

> 3 - It disallows DigSig, which would prevent proper tamper detection.

Again, I do not think that this is true. Digital signatures can be added on top of 21320.

danielweck commented 5 years ago

> MPEG-4 Part 14 specs the .mp4 (or .m4a or .m4b for Apple + audio (+ bookmarks)) file format. We are looking for a packaging format which can contain multiple mp4 files, with WP-defined metadata ... not a stream of media objects with a few MPEG-defined metadata (or XMP metadata). -> Not the same logical level.

Agreed, but shouldn't we also document the rationale for the in/out-of-scope status of the m4b format, based on its merits/drawbacks, as per the use cases defined for (audio) Web Publications? This is important because from a UX perspective, there isn't a significant functional gap (at least on the surface). For example: https://player.cantookaudio.com/aHR0cHM6Ly9yZWFkaXVtLm9yZy93ZWJwdWItbWFuaWZlc3QvZXhhbXBsZXMvRmxhdGxhbmQvbWFuaWZlc3QuanNvbg== => this "looks" exactly like your run-of-the-mill m4b audiobook player, yet it is actually based on ReadiumWebPubManifest (if I am not mistaken).
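As a side note on that example link: the last path segment looks like a Base64-encoded manifest URL. The sketch below decodes it; that the player accepts manifests this way is an inference from this single URL, not documented behavior, and the function name is hypothetical.

```python
import base64

PLAYER_URL = (
    "https://player.cantookaudio.com/"
    "aHR0cHM6Ly9yZWFkaXVtLm9yZy93ZWJwdWItbWFuaWZlc3QvZXhhbXBsZXMv"
    "RmxhdGxhbmQvbWFuaWZlc3QuanNvbg=="
)


def manifest_url_from_player_url(url: str) -> str:
    """Decode the final path segment, assumed to be a Base64-encoded URL."""
    encoded = url.rsplit("/", 1)[-1]
    return base64.urlsafe_b64decode(encoded).decode("utf-8")


if __name__ == "__main__":
    print(manifest_url_from_player_url(PLAYER_URL))
```

Decoding yields a `manifest.json` hosted on readium.org, which is consistent with the player being driven by a web-hosted manifest rather than by a packaged file.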

HadrienGardeur commented 5 years ago

> For example: https://player.cantookaudio.com/aHR0cHM6Ly9yZWFkaXVtLm9yZy93ZWJwdWItbWFuaWZlc3QvZXhhbXBsZXMvRmxhdGxhbmQvbWFuaWZlc3QuanNvbg== => this "looks" exactly like your run-of-the-mill m4b audiobook player, yet it is actually based on ReadiumWebPubManifest (if I am not mistaken).

I'm not entirely sure which point you're trying to make @danielweck but to provide additional context:

This can be a good example to illustrate how an audiobook can be published as a WP, but it doesn't feel relevant to me in a discussion about packages.

danielweck commented 5 years ago

Sorry to have briefly digressed into the UX perspective, but I imagine that stakeholders in the audiobook business will need convincing that "packaged web audio publications" solve a problem they can't already address with the m4b format. I appreciate that this is a "meta" level concern, and of course I am also well aware that m4b cannot be used to container-ize Web Publication resources without significant, lossy transformations. So my point was that we should not dismiss an established technology without explaining why. PS: coincidentally, I recently purchased a 15h audio book (companion to a hardcover book) in m4b format with chaptering, cover image, metadata, etc.

HadrienGardeur commented 5 years ago

Most container formats that I'm aware of for audio/video tend to be specifically tied to a file format and/or codec.

Can you package Opus files in an M4B for example?

llemeurfr commented 5 years ago

@daniel, m4b is an Apple extension of the mp4 format. Not a standard. Only Apple players use its extended features.

Add to that the metadata we aim to provide, and the fact that an m4b is ONE big file (versus multiple audio files for an audio publication, which may be easier to produce), just as a publication is not a single huge HTML file. But this is an advantage for the producer, not the user.

danielweck commented 5 years ago

> Only Apple players use its extended features.

I am not an Apple customer and I read audiobooks in m4b format, so maybe the adoption of this format is not confined to the Apple ecosystem?

As I said, the transformation from Audio-Web-Publications to m4b would not be lossless, and anyway this is certainly not something I am advocating right now. I am merely reporting the fact, as others have done, that this format exists and seems to appeal (perhaps "by default") to some publishers. I am sure that Packaged-Audio-Web-Publications will be better ;)

lrosenthol commented 5 years ago

Makoto-san, OCF and UCF both have requirements and restrictions concerning the naming of files in the ZIP central directory. That is what I am referring to, and it is something we most certainly want in any ZIP-based package format we would use.

HadrienGardeur commented 5 years ago

> OCF and UCF both have requirements and restrictions concerning the naming of files in the ZIP central directory.

Same thing for ISO 21320, check Annex B which even references OCF, OPC and UCF.

dwsinger commented 5 years ago

I will find out what happens in m4b, for sure. The offer I made was of the HEIF format, which is a specialization of the formats developed for MP4 (widely used) and MPEG-21 (unused), with the latter being re-purposed to carry images rather than 'digital items'.

The attached slides are what I hoped to show at TPAC. In essence, HEIF allows the storage of 'items', which have types (4-character codes, or MIME types), simple names (which can be used in relative URLs, 'as if' the item were a separate file coming from the same place as the package), references (typed, directional, so one can see dependency), and identifies a 'primary item' entry point (e.g. the main HTML page, for this case). HEIF is a moderately abstract base-layer on which building a modern image file format was surprisingly easy; and it allows for both timed and untimed material in the same package.
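The concepts in that paragraph (typed items with simple names, typed directional references, a primary item) can be pictured as a small data model. This is only a conceptual sketch: it does not reflect the actual HEIF box layout, and the class names and the "uses" reference type are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    name: str        # simple name, usable in relative URLs
    item_type: str   # 4-character code or MIME type
    # Typed, directional references: reference type -> names of target items.
    references: dict[str, list[str]] = field(default_factory=dict)


@dataclass
class Package:
    items: dict[str, Item]
    primary_item: str  # entry point, e.g. the main HTML page

    def dependencies(self, name: str) -> set[str]:
        """Follow references transitively from one item."""
        seen: set[str] = set()
        stack = [name]
        while stack:
            current = stack.pop()
            for targets in self.items[current].references.values():
                for target in targets:
                    if target not in seen:
                        seen.add(target)
                        stack.append(target)
        return seen


package = Package(
    items={
        "index.html": Item("index.html", "text/html",
                           {"uses": ["style.css", "chapter1.mp3"]}),
        "style.css": Item("style.css", "text/css"),
        "chapter1.mp3": Item("chapter1.mp3", "audio/mpeg"),
    },
    primary_item="index.html",
)
```

Because references are explicit and directional, a reader can compute what the primary item depends on (`package.dependencies(package.primary_item)`) without parsing the content itself.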

If we could combine this with something the visual publishing world needs -- the ability to cause timed update of the HTML etc. -- I think we might have something very powerful.

It's an offer, and something to be aware of, and I'd be happy to entertain questions or get around a whiteboard.

heif preso.pdf

dauwhe commented 5 years ago

@dwsinger That's really interesting. How do you envision this sort of package being consumed? My browsers have no idea what m4b files are, but iTunes is happy to play them. If a packaging format based on HEIF largely contained web content, can you imagine a future where a web browser could display all the content directly?

I met a mountain guide in Canada last winter. He’s created a really complex publication about avalanches. He doesn’t want to distribute it as an ebook, as most reading systems can’t handle the JavaScript, and many end users can’t easily figure out how to obtain a reading system and side-load an EPUB. He just wants something he can email to a person so they can double-click and have it open in a browser. PDF eventually attained that level of ease. Some of us want that for web stuff.

dwsinger commented 5 years ago

I think there may be an opportunity here also for convergence; the media business (video, audio) also wants a packaged interactive format. And there is a product spectrum here -- book, book with embedded audio/video, book with spoken audio, audiobook, TV/video program...

murata2makoto commented 5 years ago

A long time ago, in W3C, there was an attempt to create a ZIP-based package format. In my understanding, it failed because different applications had different priorities.

HadrienGardeur commented 5 years ago

More than two months and 70+ comments later, it doesn't feel like we've made significant progress or diverged from my initial comment.

> all resources (including the manifest) are packaged together in a ZIP (a lighter take on OCF)

This seems to be the preferred option in early 2019 as well.

> the audio resources should not be further compressed in the ZIP

I haven't seen a single mention regarding compression of resources that are already optimized (audio, video, images). Probably worth considering in these discussions.
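As an illustration of that point, here is a minimal sketch using Python's standard `zipfile` module: media that is already compressed is stored as-is, while text resources such as the manifest are deflated. The extension list and the `build_package` name are assumptions made for the example, not something defined in any draft.

```python
import zipfile
from pathlib import Path

# Assumed list of formats that are already compressed; adjust as needed.
ALREADY_COMPRESSED = {".mp3", ".m4a", ".aac", ".opus", ".ogg",
                      ".jpg", ".jpeg", ".png"}


def build_package(source_dir: str, package_path: str) -> None:
    """Zip a directory, storing already-compressed media without recompression.

    package_path should live outside source_dir so the archive does not
    try to include itself.
    """
    root = Path(source_dir)
    with zipfile.ZipFile(package_path, "w") as package:
        for path in sorted(root.rglob("*")):
            if not path.is_file():
                continue
            method = (zipfile.ZIP_STORED
                      if path.suffix.lower() in ALREADY_COMPRESSED
                      else zipfile.ZIP_DEFLATED)
            package.write(path,
                          arcname=path.relative_to(root).as_posix(),
                          compress_type=method)
```

Storing audio uncompressed costs almost nothing in size and lets a reading system seek within an entry without inflating it first.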

> the manifest has a well-known location at the root of our package: manifest.jsonld
>
> the entry page has a well-known location as well: index.html

Both well-known locations are listed in the OCF Lite draft.
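A reading system could rely on those well-known locations roughly as follows. This is a sketch under the assumptions stated in this thread (manifest at `manifest.jsonld`, optional entry page at `index.html`, both at the package root); the `open_package` name is hypothetical.

```python
import json
import zipfile

MANIFEST_PATH = "manifest.jsonld"   # well-known location from the proposal
ENTRY_PAGE_PATH = "index.html"      # well-known location, treated as optional


def open_package(package_path: str) -> tuple[dict, bool]:
    """Return the parsed manifest and whether an entry page is present."""
    with zipfile.ZipFile(package_path) as package:
        names = set(package.namelist())
        if MANIFEST_PATH not in names:
            raise ValueError(f"{MANIFEST_PATH} not found at the package root")
        manifest = json.loads(package.read(MANIFEST_PATH).decode("utf-8"))
        has_entry_page = ENTRY_PAGE_PATH in names
    return manifest, has_entry_page
```

With fixed locations, no container-level indirection file is needed to find the manifest, which is part of what makes this lighter than full OCF.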

> we drop the requirement for an entry page and its reference in the manifest

I think that this is still very relevant. Forcing authors/publishers to create something that they don't need or usually produce is completely counter-productive.

We've been down that road before with EPUB FXL (not allowing images in "spine") and we're still paying the price for this today (with a mix of distributors that pre-process such files and reading apps like Kobo that use a dual-rendering engine approach).

> all resources contained in the package that are not listed under readingOrder in our manifest are considered part of the resource list

This hasn't been discussed again. Probably worth considering a bit more (like compression on optimized resources).
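The rule quoted above could be implemented along these lines. The sketch assumes that `readingOrder` entries are either plain URL strings or objects with a `url` member, and that the manifest itself is excluded from the implicit list; both are assumptions made for the example, and `implicit_resources` is a hypothetical name.

```python
def implicit_resources(package_names: list[str], manifest: dict) -> list[str]:
    """List package entries that are not in readingOrder.

    Under the proposal, these are treated as part of the resource list.
    """
    listed = {
        item["url"] if isinstance(item, dict) else item
        for item in manifest.get("readingOrder", [])
    }
    return sorted(
        name for name in package_names
        if name not in listed
        and name != "manifest.jsonld"   # assumed: the manifest is not a resource
        and not name.endswith("/")      # skip ZIP directory entries
    )
```

For a package holding `manifest.jsonld`, `cover.jpg`, `track1.mp3`, and `track2.mp3`, with both tracks in `readingOrder`, only `cover.jpg` ends up in the implicit resource list.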

> we define a dedicated media type (TBD) and file extension (TBD as well) to identify such packages, both of them would be specific to audiobooks only

This is still under discussion.

BigBlueHat commented 5 years ago

> A long time ago, in W3C, there was an attempt to create a ZIP-based package format. In my understanding, it failed because different applications had different priorities.

There's also the later attempt (2012) at "Packaged Web Apps" aka widgets: https://www.w3.org/TR/widgets/

Interestingly, Google had Gears, Firefox had Firefox OS packaged apps, but ultimately they've deprecated those in favor of Web-distributed installable Web apps.

Consequently, our distribution and consumption models need analysis as we consider the packaging concerns. Ideally, the manifest and contents of the publication would need no changes when packaged or unpackaged such that publishers can create "a publication" and then determine the best distribution models for their business and content types.

wareid commented 5 years ago

This issue was resolved by a discussion in the meeting on January 28.