Closed HadrienGardeur closed 5 years ago
With quick review, this looks like a very good starting point to me.
An intriguing part of the proposal is that the entry page is not required for audiobooks. But the ToC and other navigation lists are currently only defined in this entry page.
Where will they be defined then? In the manifest, as a machine-readable ToC?
But the ToC and other navigation lists are currently only defined in this entry page.
That's not the case. They're both identified in the manifest and can be included in other resources.
They're both identified in the manifest and can be included in other resources.
True. We can add therefore a feature of the packaging format for audiobooks:
@llemeurfr but that's not specific to audiobooks or packaging them, that's why I don't think it's worth listing.
I'd like to upload an example but unfortunately most audiobooks are too large for our repo. Any suggestions how we should deal with that issue?
cc @iherman @GarthConboy @wareid
Back briefly to the TOC question. Yes, the TOC can be in a non-reading-order resource and referenced from the manifest -- this should work fine. However, it seems we may want some way to identify said resource as ONLY the TOC (or allow the TOC to be encoded in the manifest), such that the UA/RS knows that said TOC-resource is not really an ancillary resource to be side-presented with the audio, it's just the machine-processable TOC.
In practice, said side-presented resources (supplemental content) will likely be PDFs, but I'm not sure that type should be the key to identification.
I strongly recommend against raw ZIP. There are a number of well-known problems with it, which is why all standard ZIP-based packages start by addressing them. Since we don't want to reinvent the wheel - I suggest two possible starting points.
Yea, OCF without mimetype (if we're really mad at it) -- to get the charset and file path "fixes" -- would be fine.
How much benefit do we get from using one packaging mechanism for EPUB3, a second packaging mechanism for Audiobooks, and a third packaging mechanism for the packaged version of WP? Can we just use OCF until we figure this out for everything?
Touché -- I have to say I'm less mad at mimetype than others. :-)
@GarthConboy
Yes, the TOC can be in a non-reading-order resource and referenced from the manifest -- this should work fine. However, it seems we may want some way to identify said resource as ONLY the TOC [...]
Well, we already have a rel value to indicate that the resource contains the TOC. If we go down the dual-approach for the TOC that I've suggested in #350, it will be even more clear that this is a document primarily meant to be processed rather than rendered.
[...] (or allow the TOC to be encoded in the manifest)
That's a different story altogether. We could use JSON of course, but I would advise against doing that just for audiobooks.
If you'd like to illustrate the difference:
@lrosenthol @dauwhe aside from the restrictions on file names, could you list the other benefits of using OCF?
We clearly don't need the mimetype file or META-INF/container.xml, yet they're both required in OCF.
You'll find here an ISO standard which specifies a zip profile that could certainly do what we need. Or it could be the base for a profile we define (re. file name constraints) as compatible with OCF, without the XML part.
It explicitly references EPUB OCF in a section about file names and interoperability (annex B).
Note that an alternative is to define an "OCF light", keeping only the OCF Zip Container (section 4), but removing the mediatype file section (4.3), keeping also the File Names section (3.4).
I like the Signature feature, but it may belong to another specification. Or we may decide to keep it also in such "OCF light".
I was not recommending OCF - I recommended OPC.
However, creating an "OCF light" (or simply updating OCF!) would also be fine with me. As you note, the main issue is removing (or more specifically deprecating and/or making optional) the mediatype. You need all the stuff about file naming - UTF-8, restricted chars, etc.
Signatures are good and we should keep them.
This way, such a package (to @dauwhe's concerns) is compatible with EPUB 3.
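To make the "file naming" point above a bit more concrete, here is a minimal sketch of the kind of check a packaging validator could run. The forbidden-character set and the 255-byte limit are an illustrative subset loosely modeled on the OCF File Names restrictions, not the normative list, and the function names `file_name_problems` and `case_insensitive_clashes` are hypothetical.

```python
import unicodedata

# Illustrative subset modeled on the OCF "File Names" restrictions;
# this is an assumption for the sketch, not the normative list.
FORBIDDEN_CHARS = set('/"*:<>?\\\x7f')
MAX_NAME_BYTES = 255  # per-name UTF-8 byte limit assumed for this sketch


def file_name_problems(name: str) -> list:
    """Return human-readable problems for one file name (no path separators)."""
    if not name:
        return ["empty name"]
    problems = []
    if len(name.encode("utf-8")) > MAX_NAME_BYTES:
        problems.append("longer than 255 bytes in UTF-8")
    if name.endswith("."):
        problems.append("ends with a full stop")
    # Flag explicitly forbidden characters plus any control character.
    bad = sorted({c for c in name
                  if c in FORBIDDEN_CHARS or unicodedata.category(c) == "Cc"})
    if bad:
        problems.append("forbidden characters: " + ", ".join(repr(c) for c in bad))
    return problems


def case_insensitive_clashes(names: list) -> list:
    """Find pairs of names that differ only by case or Unicode normalization."""
    seen = {}
    clashes = []
    for name in names:
        key = unicodedata.normalize("NFC", name).casefold()
        if key in seen and seen[key] != name:
            clashes.append((seen[key], name))
        else:
            seen[key] = name
    return clashes
```

The second function targets the platform-incompatibility case: two entries that are distinct inside the ZIP but collide once unpacked on a case-insensitive file system.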
Whatever the choice is for the editing of this spec, we should have a way to validate such packaging. I'm sure there are useful pieces in epubcheck. But are there other pieces of code that would be helpful?
I am uneasy about some aspects of the proposal. In our terminology a Web Audiobook is a special Web Publication, and a packaged version thereof is a special version of EPUB4 or PWP (whatever terminology we use; let us forget about that issue for the moment). Viewed this way, this proposal sets a precedent that may, in the long term, unduly influence what a future packaged version of a WP looks like. What I find questionable are:
- the manifest has a well-known location at the root of our package: manifest.jsonld
- we drop the requirement for an entry page and its reference in the manifest
We essentially throw away what I consider to be an essential element of flexibility we have in a Web Publication, creating a fairly strong bifurcation in our specs. After all, I could (maybe naïvely) imagine an audiobook consisting of an HTML file containing a TOC, whose entries are a series of HTML audio elements...
@HadrienGardeur
I'd like to upload an example but unfortunately most audiobooks are too large for our repo. Any suggestions how we should deal with that issue?
How big? Can't you put it somewhere on the cloud with a stable URL? If necessary, I can push it up on the W3C web site (but if it is big, I would have to do it while I am at the institute with a big enough bandwidth).
We essentially throw away what I consider to be an essential element of flexibility we have in a Web Publication, creating a fairly strong bifurcation in our specs.
@iherman
Sorry Ivan, but I have to strongly disagree with you here. In a package, we always need to have at least one well-known location. How is that throwing away an element of flexibility?
There's a big difference between dropping the requirement for an entry page and saying that it's actually forbidden. If you still want an entry page in your packaged publication, you'd be allowed to do that.
The entry page is primarily meant to:
In the case of packaged publications, we don't need such things IMO.
After all, I could (maybe naïvely) imagine an audiobook consisting of an HTML file containing a TOC, whose entries are a series of HTML audio elements...
Is there anything in the proposal restricting you from doing that? I don't think so.
After discussing briefly with @iherman, it seems that he's more comfortable with having a well-known location for both:
manifest.jsonld
index.html
This would make it easier to create "single resource in the reading order" publications where the manifest is embedded in index.html.
This doesn't really change my mind about making the entry page optional rather than required but I think it's a good compromise overall.
Re. an entry page, optional, as index.html: I join such a compromise.
It is not a compromise, it is a consensus:-)
I just thumbs up-ed the above... with the view that the entry page would be optional at least for audiobooks... just checking, is that the consensus?
I've tweaked the first post and added the well-known location for the entry page as well, this way we have a full list for the proposal which could be discussed in a future WG call.
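For what it's worth, the lookup implied by the two well-known locations could be sketched roughly as follows: try manifest.jsonld at the root of the package first, then fall back to a manifest embedded in index.html. Treating the embedded manifest as the first `<script type="application/ld+json">` element is an assumption of this sketch, and `load_manifest` is a hypothetical helper name, not something defined by the proposal.

```python
import io
import json
import zipfile
from html.parser import HTMLParser


class _JsonLdScript(HTMLParser):
    """Collect the text of the first <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self.text = None

    def handle_starttag(self, tag, attrs):
        is_jsonld = dict(attrs).get("type") == "application/ld+json"
        if tag == "script" and is_jsonld and self.text is None:
            self._inside = True
            self.text = ""

    def handle_data(self, data):
        if self._inside:
            self.text += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False


def load_manifest(package: zipfile.ZipFile) -> dict:
    """Read the manifest from a well-known location at the package root."""
    names = set(package.namelist())
    if "manifest.jsonld" in names:
        return json.loads(package.read("manifest.jsonld"))
    if "index.html" in names:
        parser = _JsonLdScript()
        parser.feed(package.read("index.html").decode("utf-8"))
        if parser.text is not None:
            return json.loads(parser.text)
    raise ValueError("no manifest.jsonld and no manifest embedded in index.html")
```

With this ordering an audiobook without an entry page still works, and a "single resource" publication only needs index.html.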
@HadrienGardeur just to be clearer:
we drop the requirement for an entry page and its reference in the manifest
the entry page, if present, must have the same structure as in the WP, i.e., it must have a reference to the manifest, or may embed it. What is proposed to be dropped is the requirement for the very existence of the entry page, not its structure.
@iherman
I'm certainly not suggesting that the entry page should be structured differently.
What I'm saying is that:
[...] url term in the manifest for a packaged publication (this affects the JSON Schema for the manifest)
How big? Can't you put it somewhere on the cloud with a stable URL? If necessary, I can push it up on the W3C web site (but if it is big, I would have to do it while I am at the institute with a big enough bandwidth).
@iherman
I'd rather upload the example somewhere in the cloud that's not tied to any of my personal accounts, since someone else than me might need to update it.
The packaged version of Flatland should be roughly 240-250 MB.
@HadrienGardeur that is fine, but at least temporarily you will have to put it somewhere on the cloud, because I would expect email clients to have problems with such an attachment. Once I get hold of the file, I can push it up on W3C at some www.w3.org/2018/11/XXX URL, which can then be changed later (by me or some other team member) if necessary.
Besides suitable storage / bandwidth, my minimal requirements for hosting sample Web Publications would be:
Access-Control-Allow-Origin = * (any origin), and if possible Access-Control-Allow-Methods with HEAD (and GET obviously) so that a reading system can get basic info before issuing a GET request to fetch / incrementally stream the response payload, and also Access-Control-Allow-Headers + Access-Control-Expose-Headers with useful HTTP headers such as Content-Type, Content-Length, Accept-Ranges, Content-Range, Range, Link, Transfer-Encoding...
@danielweck you are raising a more general issue. Do we want to establish a storage for sample Web Publications in general? If so, I would have to look for a dedicated URL rather than the catch-all /2018/11/ bin of the W3C web space.
I do have the possibility to set .htaccess files for CORS on w3.org, so that should be o.k., provided somebody provides me with the correct statements. I must admit I do not know whether our server supports partial (range) requests for HTTP 1.1; a question to our system guys...
(B.t.w., github would not give these possibilities, even if the limit was not 100MB as it is now.)
Yes, GitHub's gh-pages (or any branch mapped as "publishing source") only offers basic static hosting, which is why people have been using CDN proxies like https://rawgit.com (now deprecated), https://www.staticaly.com/rawgit , https://raw.githack.com , https://gitcdn.link etc.
Could it be that only the large audio/video files need to be hosted some place else? It would be nice if other resource types in sample Web Publications (e.g. JSON manifest, HTML, CSS, Javascript, etc.) could be tracked in Git, just like regular source code.
PS - just out of interest, I checked the HTTP headers provided by the various aforementioned CDN proxies, when requesting an MP3 file from the IDPF EPUB3 samples:
https://github.com/IDPF/epub3-samples/blob/master/30/cc-shared-culture/EPUB/audio/asharedculture_soundtrack.mp3 =>
curl -I -X GET -L https://raw.githubusercontent.com/IDPF/epub3-samples/master/30/cc-shared-culture/EPUB/audio/asharedculture_soundtrack.mp3 | grep -i ACC
curl -I -X GET -L https://cdn.staticaly.com/gh/IDPF/epub3-samples/master/30/cc-shared-culture/EPUB/audio/asharedculture_soundtrack.mp3 | grep -i ACC
curl -I -X GET -L https://raw.githack.com/IDPF/epub3-samples/master/30/cc-shared-culture/EPUB/audio/asharedculture_soundtrack.mp3 | grep -i ACC
curl -I -X GET -L https://gitcdn.link/repo/IDPF/epub3-samples/master/30/cc-shared-culture/EPUB/audio/asharedculture_soundtrack.mp3 | grep -i ACC
curl -I -X GET -L https://rawgit.com/IDPF/epub3-samples/master/30/cc-shared-culture/EPUB/audio/asharedculture_soundtrack.mp3 | grep -i ACC (deprecated)
...all of them provide accept-ranges: bytes and access-control-allow-origin: *, but no sign of the other nice-to-have CORS headers mentioned in my previous message. So yeah, being able to control this with .htaccess is a bonus :)
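The manual `grep -i ACC` check above could also be scripted. A minimal sketch, assuming the response headers have already been collected into a plain dict (for example from the `curl -I` output); `missing_hosting_headers` is a hypothetical helper and the set of checks is just my reading of the wish list, not an agreed requirement.

```python
def missing_hosting_headers(headers: dict) -> list:
    """Return the wish-list checks that a response's headers fail.

    `headers` maps header names to values; names are compared
    case-insensitively because HTTP/2 servers send them in lowercase.
    """
    h = {k.lower(): v.strip() for k, v in headers.items()}
    missing = []
    if h.get("access-control-allow-origin") != "*":
        missing.append("Access-Control-Allow-Origin: *")
    if h.get("accept-ranges", "").lower() != "bytes":
        missing.append("Accept-Ranges: bytes")
    methods = {m.strip().upper()
               for m in h.get("access-control-allow-methods", "").split(",")}
    if not {"GET", "HEAD"} <= methods:
        missing.append("Access-Control-Allow-Methods: GET, HEAD")
    if "access-control-expose-headers" not in h:
        missing.append("Access-Control-Expose-Headers")
    return missing
```

One caveat: Access-Control-Allow-Methods is normally sent on preflight (OPTIONS) responses, so its absence from a plain GET or HEAD response is not conclusive by itself.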
Could it be that only the large audio/video files need to be hosted some place else? It would be nice if other resource types in sample Web Publications (e.g. JSON manifest, HTML, CSS, Javascript, etc.) could be tracked in Git, just like regular source code.
That should certainly be the case for WP examples. But Hadrien's one is an example for a packaged audiobook…
Ivan
@HadrienGardeur, your example file is publicly available at:
https://www.w3.org/2018/audiobook_examples/flatland.audiopub
Thanks @iherman for the upload!
For the record, here's the content of that file:
unzip -v flatland.audiopub
Archive: flatland.audiopub
Length Method Size Cmpr Date Time CRC-32 Name
-------- ------ ------- ---- ---------- ----- -------- ----
1650 Defl:N 522 68% 11-06-2018 11:43 4c434f37 manifest.jsonld
5011 Defl:N 936 81% 09-20-2018 18:48 6a7ad6da toc.html
96193 Defl:N 79951 17% 11-05-2018 15:22 0a773389 cover.jpg
21948718 Stored 21948718 0% 11-05-2018 15:24 b29de8be flatland_1_abbott.mp3
26706222 Stored 26706222 0% 11-05-2018 15:24 ed3ef3d7 flatland_2_abbott.mp3
24105262 Stored 24105262 0% 11-05-2018 15:24 a9bf2144 flatland_3_abbott.mp3
28776750 Stored 28776750 0% 11-05-2018 15:24 29d90755 flatland_4_abbott.mp3
19605806 Stored 19605806 0% 11-05-2018 15:25 77346375 flatland_5_abbott.mp3
26558766 Stored 26558766 0% 11-05-2018 15:25 fdd624a1 flatland_6_abbott.mp3
34345262 Stored 34345262 0% 11-05-2018 15:25 92c51d09 flatland_7_abbott.mp3
42600750 Stored 42600750 0% 11-05-2018 15:25 99a505ee flatland_8_abbott.mp3
18837806 Stored 18837806 0% 11-05-2018 15:25 5a13a4d9 flatland_9_abbott.mp3
-------- ------- --- -------
243588196 243566751 0% 12 files
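The listing shows the MP3 files as Stored and the small text resources as deflated. A minimal sketch of building a package the same way with Python's zipfile module follows; the suffix list and the `build_package` name are assumptions made for illustration, not part of the proposal.

```python
import io
import zipfile

# Already-compressed media gains nothing from Deflate, so store it as-is.
# The suffix list is an assumption for this sketch.
STORED_SUFFIXES = (".mp3",)


def build_package(files: dict) -> bytes:
    """Zip `files` (name -> bytes), storing audio and deflating the rest."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as package:
        for name, data in files.items():
            if name.lower().endswith(STORED_SUFFIXES):
                method = zipfile.ZIP_STORED
            else:
                method = zipfile.ZIP_DEFLATED
            package.writestr(name, data, compress_type=method)
    return buf.getvalue()
```

Keeping the audio uncompressed also means a reader can, in principle, play a track straight from its byte offset inside the archive without inflating it first.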
Typo in .htaccess? accept-language: bytes should be accept-ranges: bytes, I think.
Also, there is no access-control-allow-origin: * header.
curl -I -X GET https://www.w3.org/2018/audiobook_examples/flatland.audiopub
HTTP/2 200
date: Tue, 06 Nov 2018 10:35:35 GMT
last-modified: Tue, 06 Nov 2018 09:19:33 GMT
etag: "e8490c3-579fb7fe4b340"
accept-language: bytes
content-length: 243568835
cache-control: max-age=21600
expires: Tue, 06 Nov 2018 16:35:35 GMT
strict-transport-security: max-age=15552000; includeSubdomains; preload
content-security-policy: upgrade-insecure-requests
@danielweck I did not set the .htaccess at all, so this is whatever the directory inherits from the default setup.
If you can give me a .htaccess content that we would like to have, I would appreciate it... (and use it:-)
@danielweck I think you're raising an important point about WP that is getting lost in this discussion.
I don't know if the specifics about CORS and range requests should show up in our spec or in a best practice document, but we definitely need them somewhere.
Would you mind opening a new issue specifically about that?
ISO has already standardized what we need for "OCF light" which means that we can simply leverage that: http://standards.iso.org/ittf/PubliclyAvailableStandards/c060101_ISO_IEC_21320-1_2015.zip
Credits to @llemeurfr for identifying that document.
Hmmm... interesting re "OCF light" and ISO. We'd need to define a known name for the manifest file, then that may be all we need (well plus likely a file extension and MIME type).
Hmmm... interesting re "OCF light" and ISO. We'd need to define a known name for the manifest file, then that may be all we need (well plus likely a file extension and MIME type).
Sounds like a two-page spec to me (the whole ISO thing being a 10-page document in the first place).
As discussed in Audio TF on Nov 16th, I've added this to the queue to be discussed in the main PWG call in the coming weeks RE: the implications of potentially introducing a new packaging format to WP.
I would strongly recommend against using ISO 21320 for your package normative reference for three main reasons.
1 - It doesn't properly address various well-known file naming situations (e.g. proper Unicode and platform incompatibilities) which OCF/UCF do.
2 - It disallows encryption, which would not be good for those publishers requiring some form of DRM.
3 - It disallows DigSig, which would prevent proper tamper detection.
Instead, I would recommend making the necessary changes to OCF - or Adobe would be happy to return our (licensed from IDPF) OCF-variant (called UCF) which already has the few changes you'd probably want to make to OCF anyway (eg. removing the mimetype file restrictions)
Hi Léonard,
Your second point is void: the ISO standard only disallows the encryption mechanism embedded in the Zip format, it does NOT disallow other encryption mechanisms, therefore does not disallow DRM. Same for the third point IMO.
Re. the first point, this is interesting, can you detail the issue? Also, the Adobe OCF-variant may be interesting for completing OCF lite. Where can we find the spec?
You are correct - you could certainly encrypt and/or sign using alternative mechanisms inside the ZIP that don't use the native mechanism. However, doing so would introduce security holes in both (but that's another thread).
Right now, it's internal to Adobe - but I'll get clearance to distribute to this WG.
Leonard
@danielweck I have added the .htaccess file to the audiobook example directory:
<Files ~ "\.audiobook$">
Header set Access-Control-Allow-Origin "*"
</Files>
But I am not sure about the Accept-Ranges thing. My understanding of the relevant HTTP section is that it expresses a specific capability of the server, but how do I know whether the server running at W3C has it? Or is it a default behaviour for all Apache servers?
Thanks Ivan.
If I understand correctly, the primary / expected use-case for packaged (i.e. zipped) audio books is for a "reading system app" to fetch the HTTP URL (i.e. download the entire payload), and to store the file locally in some app-managed space (at which point the publication can be unzipped on a filesystem, or accessed directly in its deflated form). Unless the intention is also to allow this scenario "on the web" / in vanilla web browsers (for example: an offliner Service Worker caches the entire ; potentially-large ; *.audiopub asset, or a Javascript program unzips publication resources on-the-fly directly from the URL that references the packaged / zipped audio book) ... then the "CORS" and "range" HTTP headers are not necessary.
However, if the intention is to serve "exploded" audio book web publications from the https://www.w3.org/2018/audiobook_examples/ URL, then both "CORS" and "range" HTTP headers are required.
Let's check:
curl -I -X GET https://www.w3.org/2018/audiobook_examples/flatland.audiopub
==>
HTTP/2 200
date: Mon, 19 Nov 2018 09:08:37 GMT
last-modified: Tue, 06 Nov 2018 10:55:18 GMT
etag: "e84906f-579fcd6527180"
accept-language: bytes
content-length: 243568751
cache-control: max-age=21600
expires: Mon, 19 Nov 2018 15:08:37 GMT
strict-transport-security: max-age=15552000; includeSubdomains; preload
content-security-policy: upgrade-insecure-requests
...no sign of Access-Control-Allow-Origin, and accept-language: bytes still doesn't make sense to me :)
Note that this seems to be an HTTP/2 server.
Conversely, see this other media.w3.org video URL which seems to respond from an HTTP/1.1 server (also note the appropriate Accept-Ranges: bytes header):
curl -I -X GET https://media.w3.org/2010/05/sintel/trailer.mp4
==>
HTTP/1.1 200 OK
Date: Mon, 19 Nov 2018 09:29:20 GMT
Server: Apache/2.4.25 (Debian)
Last-Modified: Thu, 13 May 2010 17:49:03 GMT
ETag: "42b795-4867d5fcac1c0"
Accept-Ranges: bytes
Content-Length: 4372373
Cache-Control: max-age=21600
Expires: Mon, 19 Nov 2018 15:29:20 GMT
P3P: policyref="http://www.w3.org/2001/05/P3P/p3p.xml"
Content-Type: video/mp4
@danielweck the first curl result was my mistake; the extension I used in the .htaccess was wrong. It should be o.k. now for the audiobook with respect to CORS.
As I said, I have no idea what this accept-language: bytes is; I suspect it is a central Apache setup problem. I have not touched that one.
That being said, I believe that the current directory was set up for packaged audio book examples only, at least for now. So I would say let us leave it as is for now, and we can come back to this if we get to other types of examples.
If I understand correctly, the primary / expected use-case for packaged (i.e. zipped) audio books is for a "reading system app" to fetch the HTTP URL [...]
@danielweck I don't think that's necessarily the "primary" use case. It seems that for some members of this WG (including @GarthConboy), the primary use case is to standardize an ingestion format rather than an end-user format.
IMO a packaged audiobook should handle both use cases.
Unless the intention is also to allow this scenario "on the web" / in vanilla web browsers [...]
That's not a use case; WP serves that purpose, not the packaged version.
However, if the intention is to serve "exploded" audio book web publications from the https://www.w3.org/2018/audiobook_examples/ URL, then both "CORS" and "range" HTTP headers are required.
While the manifest itself would require specific headers for CORS, that's not the case for the audio resources as long as you use <audio>. But support for range requests is indeed a must-have for audio resources.
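To illustrate what "support for range requests" amounts to for a single audio resource, here is a minimal sketch that resolves a `Range: bytes=...` request header against a resource of known size and formats the matching Content-Range value. It only handles single ranges, and both function names are hypothetical.

```python
from typing import Optional, Tuple


def resolve_byte_range(range_header: str, total: int) -> Optional[Tuple[int, int]]:
    """Resolve a single-range 'bytes=' header to inclusive (first, last) offsets.

    Returns None when the header is not a single satisfiable byte range.
    Multi-range requests are out of scope for this sketch.
    """
    unit, _, spec = range_header.partition("=")
    if unit.strip().lower() != "bytes" or "," in spec or total <= 0:
        return None
    first, sep, last = spec.strip().partition("-")
    if not sep:
        return None
    try:
        if first == "":
            # Suffix form "bytes=-N": the last N bytes of the resource.
            length = int(last)
            if length <= 0:
                return None
            return max(total - length, 0), total - 1
        start = int(first)
        end = int(last) if last else total - 1
    except ValueError:
        return None
    if start >= total or end < start:
        return None
    return start, min(end, total - 1)


def content_range(first: int, last: int, total: int) -> str:
    """Format the Content-Range value for a 206 Partial Content response."""
    return f"bytes {first}-{last}/{total}"
```

For the first Flatland track (21948718 bytes), a player seeking into the file would send something like `Range: bytes=1048576-` and expect a 206 response whose Content-Range ends in `/21948718`.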
Just a note about "ingestion format" vs. "end-user format": on multiple occasions I heard the term "distribution format" used to describe what I personally interpret as a B2B "interchange format". This notion of "distribution" really depends on "who distributes to whom", it's a question of perspective :) (same with "delivery format")
So, this kind of terminology can easily be misconstrued if we don't define the context carefully, and some of us might get lost in translation during our conversations. There are quite a few intermediaries along the digital supply chain (content creation / authoring, publishers, libraries, accessibility remediation, reading systems, etc.). I'm no expert, but I imagine that audio books production + distribution (that word again!) involve a very different workflow than ; say ; trade e-books, comic books, scholarly publications, etc. (which is why we're discussing TOC and packaging issues, notably)
So, as we aim to clarify use-cases specifically for packaged audio books (e.g. "ingestion" / "interchange") vs. generic packaged web publications (e.g. "delivery" / "distribution"), let's also try to disambiguate the terminology :)
As a follow-up to our discussions at TPAC, I'd like to submit a first proposal for what could become the packaging format for audiobooks:
manifest.jsonld
index.html
[...] readingOrder in our manifest are considered part of the resource list