Minimum Viable Manifest

TzviyaSiegman commented 7 years ago

Ignoring issues such as location, serialization, etc. What is the minimum viable manifest?

I have extracted requirements from https://github.com/w3c/wpub/issues/6

identifier (tbd what id is) (required)
Identification as WP (required)
list of resources/at least one resource (required)
required order (required)
metadata (there are some discussions about what should be included/required in metadata, but that is a separate issue)
navigation/toc (optional? required?)
language (is this metadata?)
title (is this metadata?)

For more detail see https://github.com/w3c/wpub/issues/6#issue-240715512 https://github.com/w3c/wpub/issues/6#issuecomment-313164848 https://github.com/w3c/wpub/issues/6#issuecomment-313497461 https://github.com/w3c/wpub/issues/6#issuecomment-314486416

HadrienGardeur commented 7 years ago

In Readium Web Publication Manifest, we consider that a minimum viable manifest has:

a title (metadata)
a canonical locator to the manifest (links, using self as a rel)
at least one resource in the reading order (spine in Readium, probably called primary for WP)

Its identification as a WP manifest is handled by a dedicated media type (application/webpub+json), not an element in the document itself.
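
A minimal sketch of those three items, written in Python purely for illustration. The key names (`metadata`, `links`, `spine`), the example URLs, and the helper `missing_minimums` are assumptions made for this example, not a quotation of the Readium specification.

```python
# Hypothetical minimal manifest; key names and URLs are illustrative assumptions.
MINIMAL_MANIFEST = {
    "metadata": {"title": "Moby-Dick"},
    "links": [
        {
            "rel": "self",
            "href": "https://example.org/moby-dick/manifest.json",
            "type": "application/webpub+json",
        },
    ],
    "spine": [
        {"href": "https://example.org/moby-dick/chapter-1.html", "type": "text/html"},
    ],
}


def missing_minimums(manifest):
    """Return the names of the three minimum items that are absent."""
    missing = []
    title = manifest.get("metadata", {}).get("title")
    if not isinstance(title, str) or not title.strip():
        missing.append("title")
    has_self = any(
        link.get("rel") == "self" and link.get("href")
        for link in manifest.get("links", [])
    )
    if not has_self:
        missing.append("self link")
    if not manifest.get("spine"):
        missing.append("reading order")
    return missing
```

Note that nothing inside the object says "this is a WP manifest"; in this model that information would travel with the media type.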

I know that @lrosenthol has challenged the requirements for both a title and a reading order, but I still feel that this is a good starting point.

For the other things that you've mentioned, I'd say:

identifier (metadata, optional but recommended)
list of secondary resources (probably using secondary, optional but recommended)
navigation (optional but not clear if this means embedding navigation in manifest or identifying which resource contains the navigation)
language (metadata, optional)

IMO none of these other elements are in a minimum viable manifest.

rdeltour commented 7 years ago

From @TzviyaSiegman

identifier (tbd what id is) (required)

replace identifier by URL and I agree. A manifest, de facto, has a URL. Some people will use that as an identifier, some people won't. Some will use something else (an ISBN), but the common ground is that this URL exists and unambiguously locates the manifest.

From @HadrienGardeur / Readium2

a canonical locator to the manifest (links, using self as a rel)

at some point I suppose we'll need to discuss self vs. canonical. The former is used in Atom (and Readium, obviously ;-), the latter I think is more common on the Web. Not sure about the semantic difference.

Also, the manifest's URL could be interpreted as the canonical locator by default if self/canonical is absent, right?

navigation (optional but not clear if this means embedding navigation in manifest or identifying which resource contains the navigation)

Right. This needs to be discussed, probably in a separate thread.

rdeltour commented 7 years ago

replace identifier by URL and I agree.

As a comparison, note that the manifest object in Web App Manifest does not have an identifier member. But it does have a URL of course:

Every manifest has an associated manifest URL, which is the WHATWG-URL from which the manifest was fetched.

murata2makoto commented 7 years ago

Is this issue intended to provide fine details of manifest? Or is it rather intended to provide a high-level overview, which is needed for the discussion of desiderata? I do not want to nail down the design of manifest without understanding packaging and unpackaging more.

avneeshsingh commented 7 years ago

"Default" reading order should be essential. Regarding Navigation, as per my understanding, it includes hierarchical structure, and convenient access defined by author for pages, notes etc. The hierarchical map is essential, while convenient access defined by author may be optional.

pkra commented 7 years ago

@avneeshsingh wrote

"Default" reading order should be essential.

Even though I'd consider reading order information important to a publication, I don't think it should be required in a manifest; e.g., it seems sufficient if user agents render the first (primary) resource when they can't find a reading order (whatever it may be).

HadrienGardeur commented 7 years ago

To answer your questions/points @rdeltour

at some point I suppose we'll need to discuss self vs. canonical. The former is used in Atom (and Readium, obviously ;-), the latter I think is more common on the Web. Not sure about the semantic difference.

self was initially introduced in the Atom spec (RFC 4287):

The value "self" signifies that the IRI in the value of the href attribute identifies a resource equivalent to the containing element.

But the same author (Mark Nottingham) provides an updated definition in Web Linking (RFC 5988, the spec that introduces the Link header to HTTP):

Conveys an identifier for the link's context.

On the other hand, canonical was introduced in RFC 6596, mostly to deal with duplication:

The target (canonical) IRI MUST identify content that is either duplicative or a superset of the content at the context (referring) IRI.

In our case I would argue that from the manifest itself, self is much better suited.

Also, the manifest's URL could be interpreted as the canonical locator by default if self/canonical is absent, right?

It could, but since the manifest will definitely be distributed in various ways where you won't be dealing with HTTP (for example when the manifest is in a package), you must have another way to provide that URL. For this reason, I'd rather have it as a requirement.
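
The trade-off can be sketched as follows, in Python for illustration. The function name, the `links` key, and the `fetched_from` parameter are hypothetical; the point is only that the fallback has nothing to return when the manifest did not arrive over HTTP.

```python
def manifest_locator(manifest, fetched_from=None):
    """Prefer an explicit rel="self" link; fall back to the fetch URL, if any."""
    for link in manifest.get("links", []):
        if link.get("rel") == "self" and link.get("href"):
            return link["href"]
    # No explicit locator: only an HTTP fetch can supply one.
    # For a manifest read out of a package, fetched_from is None.
    return fetched_from
```

With an explicit self link, the same locator is returned however the manifest was obtained; without one, the packaged case yields `None`.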

HadrienGardeur commented 7 years ago

@pkra wrote:

Even though I'd consider reading order information important to a publication, I don't think it should be required in a manifest; e.g., it seems sufficient if user agents render the first (primary) resource when they can't find a reading order (whatever it may be).

IMO the list of primary resources and the reading order should be the same thing.

avneeshsingh commented 7 years ago

“list of primary resources and the reading order should be the same thing.”

So, the order of listing the primary resources becomes the reading order. This is worth considering. However, there are complexities. E.g., if 10 HTML pages are primary resources and 5 of these pages have an audio player pointing to an MP3 file, does that mean we will not list the MP3 file as a primary resource?

pkra commented 7 years ago

@HadrienGardeur wrote

IMO the list of primary resources and the reading order should be the same thing.

Interesting thought. I guess I'm not clear on the distinction of primary and secondary. But that's a separate issue.

lrosenthol commented 7 years ago

I agree with @HadrienGardeur that the primary resources and the reading order are one and the same. I've been trying to come up with an example of a primary resource (as we've been thinking about it) that wouldn't be in the default reading order, and I haven't found one as yet.

I am not clear what the difference is between "navigation" and the "default reading order" - I see them as exactly the same thing. So given the DRO and primary resource alignment above, then we can also merge nav into that as well and kill three birds.

I would group all metadata together (including title and lang) and make it optional. So far, no metadata has been identified as being required for a WP. However, I do believe that as we move to PWP and EPUB4, there will be fields that we will want/need required.

Secondary resource listing gets into some significant complexities (esp. with relative URL resolution) when you consider the publication editing/updating. I would like to avoid this at all costs.

rdeltour commented 7 years ago

I am not clear what the difference is between "navigation" and the "default reading order" - I see them as exactly the same thing.

There's more to navigation than just the reading order: navigation is also about ToC (deep links to the content), or page lists, or landmarks, etc. But the details s/b discussed in a separate issue, as it may not belong to the minimum viable manifest.

mattgarrish commented 7 years ago

IMO the list of primary resources and the reading order should be the same thing.

In general I agree, but where does that place "non-linear" resources?

If a resource opens in a new window, is it primary or ... ?

I'm not fully comfortable with the definitions we have, even if they'll do for now. But this is another issue.

lrosenthol commented 7 years ago

On Fri, Aug 4, 2017 at 8:56 AM, Matt Garrish notifications@github.com wrote:

IMO the list of primary resources and the reading order should be the same thing.

In general I agree, but where does that place "non-linear" resources?

What's a non-linear, primary, resource?

If a resource opens in a new window, is it primary or ... ?

For the case of WP, I don't think it matters or that we care. That's up to the author and out of scope.

For PWP, there are significant security concerns with this which we'll address when we get there.

lrosenthol commented 7 years ago

On Fri, Aug 4, 2017 at 8:19 AM, Romain Deltour notifications@github.com wrote:

I am not clear what the difference is between "navigation" and the "default reading order" - I see them as exactly the same thing.

There's more to navigation than just the reading order: navigation is also about ToC (deep links to the content), or page lists, or landmarks, etc.

Thanks. Those are definitely non MVP then, since many publications won't have them.

But the details s/b discussed in a separate issue, as it may not belong to the minimum viable manifest.

Yes, sounds like we need to have discussions about this navigation stuff...and also how much is at the WP level...

avneeshsingh commented 7 years ago

Navigation and default reading order are surely different. One part of Navigation is hierarchy, which is essential. The other part includes navigation for convenience for example page numbers, links to notes etc. This can be optional, and is based on preference of author.

And default reading order cannot even satisfy the essential part, hierarchy, unless there are strict conditions applied on content documents.

TzviyaSiegman commented 7 years ago

@murata0204

Is this issue intended to provide fine details of manifest?

This issue is intended to create a working version of the manifest for FPWD. We will (necessarily) refine it as the details of WP emerge.

dauwhe commented 7 years ago

I think the minimum viable manifest would include:

1. A list of the primary document resources, in default order
2. a title
3. some way of declaring that this is a web publication

Lots of other things are nice to have, but it's not impossible to imagine life without them. A web manifest would have a URL, which could serve to identify the web publication. For some cases, perhaps nothing more is needed.

lrosenthol commented 7 years ago

I still continue to object to 2 - titles are optional.

Otherwise, I agree that 1 & 3 are MVP.

HadrienGardeur commented 7 years ago

I still feel that 3 (some way of declaring that this is a web publication) is implicit as long as we have a dedicated media type.
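
If WP-ness is carried by the media type alone, a consumer could detect it from the Content-Type header without inspecting the document. A minimal sketch in Python, assuming the `application/webpub+json` type mentioned earlier in this thread; the function name is hypothetical.

```python
WEBPUB_MEDIA_TYPE = "application/webpub+json"


def is_webpub_manifest(content_type):
    """Return True if a Content-Type header value names the WP manifest type."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == WEBPUB_MEDIA_TYPE
```

This only works where a media type is available, which is why a manifest that travels without HTTP headers would need some other declaration.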

dauwhe commented 7 years ago

I still continue to object to 2 - titles are optional

The minimal file that passes HTML5 validation is <!doctype html><title>Hello world</title>. EPUB requires a title. WCAG single A requires titles for web pages. Docbook requires a title. Saving something to the homescreen (a la web app manifest) requires a name.

For something so fundamental, and so intrinsic to how humans talk about the world, I would want a fairly compelling reason why a web publication shouldn't need a title.

Or, to put it another way, it's easy to imagine untitled documents, but an untitled publication is not yet ready to publish.

deborahgu commented 7 years ago

Accessibility also requires a title. (WCAG 2.4.2)

HadrienGardeur commented 7 years ago

I'm wondering about our use cases here:

do we expect to distribute the manifest without using HTTP (other than PWP of course)?
if so, could we for example email a manifest to someone? Discover a manifest using some sort of local discovery (on a WiFi, using NFC, pretty much anything)?

Since having a locator is a requirement, if we consider that these are valid use cases, then we also need a locator for the manifest in the minimum viable manifest.

lrosenthol commented 7 years ago

On Mon, Aug 7, 2017 at 12:21 PM, Dave Cramer notifications@github.com wrote:

I still continue to object to 2 - titles are optional

The minimal file that passes HTML5 validation is <!doctype html><title>Hello world</title>. EPUB requires a title. WCAG single A requires titles for web pages. Docbook requires a title. Saving something to the homescreen (a la web app manifest) requires a name.

Yes, but the title could be empty and still be valid. So what's the point of having it?

I say this from experience with PDF/X, where that was exactly what folks did: put in blank titles. So we removed that requirement in later versions of the standard.

Or, to put it another way, it's easy to imagine untitled documents, but an untitled publication is not yet ready to publish.

I disagree. Most documents are actually untitled.

BigBlueHat commented 7 years ago

@lrosenthol actually, HTML5 requires a title not be blank. Try the validator with <!doctype html><title> </title> (or any other variation of "empty"). Without a meaningful title, it's not an HTML document.

Also, @dauwhe's point was that documents are often untitled, but publications only lack titles when unpublished--which I think is a safe assertion.

Just clarifying. 👓

deborahgu commented 7 years ago

Or, to put it another way, it's easy to imagine untitled documents, but an untitled publication is not yet ready to publish. I disagree. Most documents are actually untitled.

That's actually agreeing with @dauwhe, who distinguished between "document" and "publication" as a prescriptive definition.

bduga commented 7 years ago

Regarding the list of primary resources (thank you, @murata0204 for noting that, I had missed it), I think we need to list all the resources that could be used by the publication IF we want to allow packaging of arbitrary WPs. Otherwise it is not just difficult to package, it appears to be impossible in the presence of scripts.

lrosenthol commented 7 years ago

On Mon, Aug 7, 2017 at 12:48 PM, BigBlueHat notifications@github.com wrote:

@lrosenthol actually, HTML5 requires a title not be blank. Try the validator https://validator.w3.org/nu/#textarea with <!doctype html><title> </title> (or any other variation of "empty"). Without a meaningful title, it's not an HTML document.

That's fine - but that title is on a single content element. It has nothing to do with the "publication" itself.

Also, as I noted, validation isn't useful for ad-hoc publications since users won't validate things...

Also, @dauwhe's point was that documents are often untitled, but publications only lack titles when unpublished--which I think is a safe assertion.

For formal publications - I agree. For ad-hoc publications, no it is not.

Which is why I think that something like title can be a requirement for something like EPUB4 (which is more targeted to formal publications - at least if it keeps as EPUB3 is) but not for the generic WP.

HadrienGardeur commented 7 years ago

Regarding the list of primary resources (thank you, @murata0204 for noting that, I had missed it), I think we need to list all the resources that could be used by the publication IF we want to allow packaging of arbitrary WPs. Otherwise it is not just difficult to package, it appears to be impossible in the presence of scripts.

I highly doubt that's doable. IMO we should recommend that the manifest includes an extensive list of secondary resources, but we can't have it as a requirement (this is not EPUB).

lrosenthol commented 7 years ago

I should point out that when you merge/combine multiple (P)WP's together - the title also becomes problematic as well (though I guess a processor could fake something - but that's also to my point :))

deborahgu commented 7 years ago

I'm inclined to be pedantic about Leonard's distinction of "ad hoc publications"; if it is so ad hoc that it does not get a title, then it sounds like it is more of what Dave would refer to as a "document" and not a "publication."

I should point out that when you merge/combine multiple (P)WP's together - the title also becomes problematic as well (though I guess a processor could fake something - but that's also to my point

I rather think it supports the "title can be required" point, actually. Nobody said that the title had to be intellectually creative content generated by a thoughtful human. If the processor creates the title Compilation of TITLE 1, TITLE 2, and TITLE 73, that's a useful and meaningful element in the manifest.
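
A processor-generated title of that kind could be as simple as the following Python sketch. The function name and the "Untitled compilation" fallback are assumptions made for illustration; nothing in this thread specifies them.

```python
def compilation_title(titles):
    """Build a fallback title for a merged publication from its parts' titles."""
    # Ignore missing or blank titles from the source publications.
    titles = [t.strip() for t in titles if t and t.strip()]
    if not titles:
        return "Untitled compilation"  # hypothetical last-resort value
    if len(titles) == 1:
        return titles[0]
    if len(titles) == 2:
        return f"Compilation of {titles[0]} and {titles[1]}"
    return "Compilation of " + ", ".join(titles[:-1]) + f", and {titles[-1]}"
```

For example, `compilation_title(["TITLE 1", "TITLE 2", "TITLE 73"])` returns "Compilation of TITLE 1, TITLE 2, and TITLE 73".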

iherman commented 7 years ago

I begin to wonder what we really mean by minimal. If I stay on an abstract level, I would think that issues originally added by @TzviyaSiegman, like the language of the publication, or maybe even the reading order or the TOC are necessary. It is true, however, that in practice, some of these are implicit or default. For example, if I do not define the language on the Web, then it is considered to be set to "en" (or "en-us"?). I.e., I may not have to add it to a minimal manifest in practice but, on the abstract level, it is there. The TOC may be unnecessary if the WP only has one primary resource (a single HTML file) but only because it is implicit and does not have to be made explicit.

I would prefer to go back to the original list of @TzviyaSiegman to look at the abstract level, the "information item" level to see what we need. We can then see what can be considered, in some cases, as easily deducible or defined as default, but these two things are different...

deborahgu commented 7 years ago

I am strongly opposed to removing items from the spec on the grounds that you can't trust users to create documents that are conformant. Either authoring tools enforce conformance, or users create a lot of non-conformant documents. That is the state of HTML, CSS, WCAG, and every spec ever written.

(I am not opposed to removing items from the specification on the grounds that they are too onerous for users to implement. I just don't think "a publication must have a title" is too onerous.)

HadrienGardeur commented 7 years ago

As I said on the call @iherman, I think that you're making a good point about things that are implicit or explicit but I'm not sure this affects the minimum requirements that much.

Do we absolutely want to require a language or a ToC if there's no implicit way to guess them? I don't think so, they're still important to have but not a requirement IMO.

That said, the difference between implicit and explicit info means that the Readium Web Publication minimum requirements are pretty much aligned with what's proposed so far:

primary resources (explicit for both manifest formats)
WP-ness (implicit in Readium through the media type, undefined yet for WP)
title (explicit for both manifest formats)
locator for the manifest (explicit in Readium using a link, implicit so far in WP?)

Moving forward, I believe that we should make this distinction more often.

lrosenthol commented 7 years ago

for those of you who believe a title is mandatory - why?

What purpose will it serve? Is it for the user or for some processor (eg. UA)? How do you see it being used?

bduga commented 7 years ago

Without a manifest that lists all the resources, we won't be able to take a WP offline which is a requirement of WPs in the charter. Should we remove the offline capability from the charter?

TzviyaSiegman commented 7 years ago

@lrosenthol see https://github.com/w3c/wpub/issues/20 for title discussion

TzviyaSiegman commented 7 years ago

For further discussion of whether the manifest should include metadata, see https://github.com/w3c/wpub/issues/21

baldurbjarnason commented 7 years ago

I agree with @lrosenthol that titles should be optional for two reasons, both of which touch on general design principles for the manifest format (i.e. this observation is about the general structure and design of the manifest and 'title' is primarily an example and the principle applies to much more):

First. As a principle, if you are going to make a hard 'MUST' requirement then you need to either explicitly mandate UA failure when the requirement isn't fulfilled or you need to specify some form of error recovery.

If you look at the W3C specs that have seen wide adoption (HTML, CSS, JSON-LD), you'll find that they document how the UA is supposed to recover when parsing files that contain invalid data or don't contain required data. Or they specify exactly what the failure scenarios are. Sometimes it's just a short piece of text ("ignore bits you don't recognise or can't map onto bits you do recognise"). Sometimes it's a complex and detailed algorithm (HTML).

So if a title is essential to a functioning publication you want to either mandate UA failure in its absence (because it's essential) or specify an error-recovery title discovery algorithm that should be used when there's no title. Error-recovery does not have to be an uncommon scenario. E.g. when it comes to HTML parsing, invalid HTML is expected to be common.

With this design a UA will always respond to the error scenario in a predictable way. If a title isn't essential and the UA can design an acceptable user experience where the property may or may not be present, then it really isn't a 'MUST'.

And yes, this same argument can and should be applied to pretty much anything else people want to make a MUST for manifests, from navigation to secondary resources. This is a general design principle.

MUST, but with error recovery, and SHOULD are also not the same thing. With 'must' the feature in question will always be present in the format or structure once it has been parsed: error recovery during parsing means that either parsing fails or the feature is present. With 'should' the feature may or may not be present in a parsed document. In HTML, for example, the DOM will always have a certain required structure even when parsing invalid HTML and the spec defines in rigorous detail how a UA should attempt to parse invalid documents.
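
The three options can be contrasted in a short Python sketch. The function names, the `metadata`/`title` keys, and the recovery rule (falling back to the manifest URL) are all hypothetical; they only illustrate the shape of the guarantee each option gives to code that consumes the parsed manifest.

```python
from typing import Optional


def title_as_should(manifest: dict) -> Optional[str]:
    """SHOULD: the parsed result may simply lack a title."""
    return manifest.get("metadata", {}).get("title")


def title_as_must_with_recovery(manifest: dict, manifest_url: str) -> str:
    """MUST with error recovery: the parsed result always has a title."""
    title = manifest.get("metadata", {}).get("title")
    if isinstance(title, str) and title.strip():
        return title.strip()
    # Hypothetical recovery rule: fall back to the manifest URL itself.
    return manifest_url


def title_as_must_with_failure(manifest: dict) -> str:
    """MUST with mandated failure: no title, no publication."""
    title = manifest.get("metadata", {}).get("title")
    if not (isinstance(title, str) and title.strip()):
        raise ValueError("manifest has no title")
    return title.strip()
```

Only the first variant forces every caller to handle a missing title; the other two settle the question once, during parsing.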

Once you go down the road of making hard requirements in the manifest that aren't directly related to basic functionality (e.g. without a primary resource there's no 'there' there and the UA needs a WP-ness indicator to trigger WP-behaviour) you end up having to define custom parsing—standard JSON/HTML/whatever parsing won't do. If you don't do this, you are guaranteeing inconsistent behaviour across UAs (like you got with ePub). The alternative is, as stated above, to require explicit failure as XML did and the entire web community has manifestly decided that this is not an acceptable user experience. If we don't want to recreate ePub's dysfunctions, we also have to avoid ePub's design mistakes.

Inconsistent title behaviour may sound like an innocuous problem but this design mistake (i.e. making hard requirements that don't have a basis in either core functionality or in the pre-existing parsing models of the formats you're building on) multiplies overall inconsistent behaviour when repeated throughout a format. It's a recipe for flawed and buggy user agents and negative user experiences.

You can't just say 'this property has to be there because is essential to how publications should/will work' without also defining clear consequences of it not being there.

You can't rely on validators, btw. That has been demonstrated to not work at all for ensuring the validity of documents in use. Most people ignore them completely. Validation needs to be built into the UA's basic handling of the format if it is to be meaningful.

Second. I'm worried about statements such as @dauwhe's about making a hard distinction between publications and documents. All HTML files are documents; some of which may also, in the future, be publications. The two conditions are not mutually exclusive. The conditions are dependent on their context, usage and user perspective and not on inherent characteristics of the requirements of web publications as defined. Once we become too prescriptive about how the format should be used we risk needlessly locking ourselves out from a number of unknown future use cases. There is nothing in the definition or requirements of a web publication that makes it explicitly incompatible with the traditional 'document' use case. I don't see a reason to lock out that potential scenario just because some of its characteristics are seen as culturally inappropriate in publishing.

TzviyaSiegman commented 7 years ago

@baldurbjarnason

As a principle, if you are going to make a hard 'MUST' requirement then you need to either explicitly mandate UA failure when the requirement isn't fulfilled or you need to specify some form of error recovery.

I don't have a problem with that. We are in the early stages of drafting this spec. Would you like to write a proposal for error messaging? See also https://github.com/w3c/wpub/issues/20

I'm worried about statements such as @dauwhe's about making a hard distinction between publications and documents. All HTML files are documents...

I do not wish to speak for @dauwhe, but I think that's precisely what he meant. We are not trying to replace HTML nor dictate use. However, we must keep our primary audience in mind when writing. If we are choosing between two scenarios when making a decision for this, I think we must target those creating publications, not those creating documents.

baldurbjarnason commented 7 years ago

@TzviyaSiegman

I do not wish to speak for @dauwhe, but I think that's precisely what he meant. We are not trying to replace HTML nor dictate use. However, we must keep our primary audience in mind when writing. If we are choosing between two scenarios when making a decision for this, I think we must target those creating publications, not those creating documents.

Every time we make that sort of choice we are distancing web publications from the web platform proper. Every publication will start its life as a document somewhere. It doesn't look productive to me to create arbitrary distinctions that have the potential to disrupt the lifecycle of a publication by breaking its flow from document stage to publication stage. I think we should at least try to avoid creating such roadblocks unless there is a clear and pressing functional argument for doing so.

BillKasdorf commented 7 years ago

On the subject of whether the Minimum Viable Manifest MUST identify secondary resources, I just wanted to point out for the record that I think that's a slippery slope because secondary resources are more likely to be things that can change over time. Since we defined secondary resources as referenced from primary resources, you should always be able to get to the current set of secondary resources without having them listed in the manifest.

bduga commented 7 years ago

@BillKasdorf You cannot. There is no way to crawl a script and find all the resources it might use or cause to be used. If all secondary resources are not listed, it is not possible to cache a WP offline. I think we either list all resources, or abandon deterministic offline caching.

baldurbjarnason commented 7 years ago

@bduga

I interpreted the offline requirement as being a requirement that authors/publishers have to be able to make offline-able web publications, not that every web publication must be offline-able. It sounds like the offline requirement certainly needs to be clarified if understandings of it vary so much.

I don't think requiring all web publications be deterministically cacheable offline (beyond what can be done already with websites) is feasible in general. Making sure it is a capability available to all publishers/authors is reasonable, as that lets publishers make their publication offline-able by listing all of its secondary resources in the manifest, which lets the UA do its magic.

baldurbjarnason commented 7 years ago

Moreover, I think that offline caching, by necessity, has to be granular and not publication-wide. The UA will only ever be able to cache resources that are listed in the manifest.

HadrienGardeur commented 7 years ago

@bduga

This is not an "all or nothing" scenario, a UA will be able to cache all the resources listed as primary or secondary, this doesn't mean that the publication will be unusable.

I see it as a feature, not a bug: you don't necessarily want to cache for offline access every resource used by your publication.

Let's say that you're using an analytic service, but it's not designed to work in an offline use case at all. Do you really want to list such a script in your secondary resources? Probably not.

bduga commented 7 years ago

@baldurbjarnason You have to be careful about what you mean by offline. It is the case that not every WP must be packagable (can convert to a PWP), but the charter seems pretty clear about being able to use it offline: "A Web Publication must be available and functional while the user is offline." I don't see a way to interpret that which would allow a WP to be unusable while offline. I do agree with your point, however; for a cache to work publication wide we need to list all resources in the manifest.

bduga commented 7 years ago

@HadrienGardeur In that case, the analytics package is not part of the publication. A manifest is the thing that lets an author specify what they mean by a publication and what should be made available if a user says to make a publication available offline. Not everything linked from a publication is a part of that publication. However, some scripts and their resources are part of the publication, and if not available the publication may be considered unusable.

baldurbjarnason commented 7 years ago

@bduga

@baldurbjarnason You have to be careful about what you mean by offline. It is the case that not every WP must be packagable (can convert to a PWP), but the charter seems pretty clear about being able to use it offline: "A Web Publication must be available and functional while the user is offline." I don't see a way to interpret that which would allow a WP to be unusable while offline. I do agree with your point, however; for a cache to work publication wide we need to list all resources in the manifest.

Yeah, I don't think this is a requirement that can be fulfilled for the publication as a whole—only for the primary resources. No matter what we say in the spec, authors will make publications that are only fully usable when online or whose experience degrades substantially when there is no internet connection. Authors may not even be able to list all of the secondary resources because a script is handling that for them dynamically. Making a requirement people can't follow turns the spec into a fiction and undermines its credibility overall.

What we can do is make the primary resources available offline (as those are listed) and whichever secondary resources are listed in the manifest, if any. That's the core of the publication. The rest will just have to break. We can't force authors to list every resource for every publication. And like I said, in many cases they won't even know what resources the publications will be using as a script might be pulling it in dynamically.
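
That behaviour can be sketched in a few lines of Python. The `primary` and `secondary` keys follow the tentative names used in this thread, and the function name and file names are hypothetical.

```python
def offline_plan(manifest, resources_used):
    """Return (cacheable, unavailable_offline) as two sorted lists of hrefs.

    Everything listed in the manifest is cacheable; anything the publication
    uses at runtime that is not listed will be missing offline.
    """
    listed = {r["href"] for r in manifest.get("primary", [])}
    listed |= {r["href"] for r in manifest.get("secondary", [])}
    unavailable = set(resources_used) - listed
    return sorted(listed), sorted(unavailable)
```

For a manifest listing `ch1.html` as primary and `style.css` as secondary, a publication that also loads `analytics.js` through a script would get `(["ch1.html", "style.css"], ["analytics.js"])`: the core is cached and the unlisted script breaks offline.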

HadrienGardeur commented 7 years ago

@bduga

So we agree that not all resources referenced by a content document are part of the publication, which is the reason why the manifest defines that boundary.

This also means that it's the author's responsibility to list resources that are deemed as important enough to be listed as secondary resources. Then why do you object to listing secondary resources being a SHOULD?

w3c / wpub

Minimum Viable Manifest #15