w3c / wpub

W3C Web Publications
https://w3c.github.io/wpub/

Clarify the primary entry page #389

Closed mattgarrish closed 5 years ago

mattgarrish commented 5 years ago

This PR addresses two of the concerns raised in issue #386:



HadrienGardeur commented 5 years ago

It's a fallback, so we shouldn't be talking about a "default entry" IMO.

It's meant to make things easier for "single resource in the reading order" publications.

mattgarrish commented 5 years ago

It's a fallback, so we shouldn't be talking about a "default entry" IMO.

Right, it's just trying to capture that it gets inserted when creating the canonical manifest if no reading order is specified. I've removed the default entry aspect and linked it to the canonical manifest step.

iherman commented 5 years ago

The latest commit from @mattgarrish is in line with the text in 4.5.1:

The default reading order is specified directly in the manifest, but MAY be omitted when it only consists of the primary entry page. When the default reading order is absent, user agents MUST include an entry for the primary entry page when compiling the canonical manifest.

+1 thus
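
For illustration, the fallback quoted from 4.5.1 can be sketched roughly as follows. This is a minimal sketch, not the specification's algorithm: the function name `canonical_reading_order`, the `{"url": ...}` entry shape, and the choice to treat an empty list like an absent one are all assumptions made here.

```python
from typing import Any, Dict, List, Optional


def canonical_reading_order(
    manifest: Dict[str, Any], primary_entry_page_url: str
) -> List[Any]:
    """Return the reading order a user agent might put in the canonical manifest.

    Hypothetical helper: the name and the {"url": ...} entry shape are
    assumptions of this sketch, not text from the specification.
    """
    reading_order: Optional[List[Any]] = manifest.get("readingOrder")
    if reading_order:
        # The author supplied a default reading order; keep it unchanged.
        return list(reading_order)
    # Reading order absent (this sketch also treats an empty list as absent):
    # insert a single entry for the primary entry page.
    return [{"url": primary_entry_page_url}]


if __name__ == "__main__":
    pep = "https://publisher.example.org/mobydick/index.html"
    # A manifest without "readingOrder" gets the primary entry page inserted.
    print(canonical_reading_order({}, pep))
```

The point of the sketch is that the primary entry page only appears as a fallback at canonicalization time; an authored reading order is never modified.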

iherman commented 5 years ago

Let us note, however, that this PR does not solve issue #386. It does not say that the PEP should be within the bounds of the publication. (Which is all right for this PR, it is up to us to drive #386 to a conclusion...)

mattgarrish commented 5 years ago

It does not say that the PEP should be within the bounds of the publication.

Isn't that required now? If the WP MUST have a PEP, then it has to be specified either in the resources or the default reading order, which puts it within the bounds.

Would it help to add a parenthetical to this effect after the first MUST, like:

Every Web Publication MUST include a primary entry page (i.e., this resource has to be listed in the default reading order or resources),

iherman commented 5 years ago

It does not say that the PEP should be within the bounds of the publication.

Isn't that required now? If the WP MUST have a PEP, then it has to be specified either in the resources or the default reading order, which puts it within the bounds.

Where does it say that? I was looking at 3.4, and I did not see this stated. It just says it is a "resource" (in its general form), but that may mean it is part of the "links" array...

So yes, I believe this should be explicitly stated.

mattgarrish commented 5 years ago

I modified the first paragraph of 3.4 to use MUSTs, so it now reads:

Every Web Publication MUST include a primary entry page, which is the resource that MUST be returned when accessing the Web Publication's address. The primary entry page represents the preferred starting resource for the Web Publication, and enables discovery of the manifest.

Or that's what you should be seeing if the preview isn't doing something weird.

If we want it clearer, I can add a parenthetical like I mentioned above.

iherman commented 5 years ago

@mattgarrish that is what I see indeed. And it does not say that, formally, it must be part of either the reading order or the resources array in the manifest, and that is what defines the boundaries...

I think adding the parenthetical remark is actually necessary.

mattgarrish commented 5 years ago

I've rewritten the paragraph, as we're starting to repeat statements elsewhere in the specification. In the immediately preceding Resources section, for example, we say:

A Web Publication MUST include at least one HTML document [html]—the primary entry page.

Instead of repeating that again, I've moved the definition of what the PEP is to the start, and the only MUST is now that it has to be in one of those lists. The address section also states that the address is equivalent to the PEP, so I don't think we need to state that expectation normatively again. For quick reference, it's now:

The primary entry page represents the preferred starting resource for a Web Publication and enables discovery of its manifest. It is the resource that is returned when accessing the Web Publication's address, and MUST be included in either the default reading order or the resource list.
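
A conformance check for the MUST in that paragraph might look something like the sketch below. It assumes that entries in "readingOrder" and "resources" are either plain URL strings or objects with a "url" member, and that URLs have already been resolved so they can be compared as strings; the helper names `listed_urls` and `pep_is_listed` are hypothetical.

```python
from typing import Any, Dict, Iterable, Optional, Set


def listed_urls(entries: Optional[Iterable[Any]]) -> Set[str]:
    """Collect URLs from a list of strings or {"url": ...} objects (assumed shapes)."""
    urls: Set[str] = set()
    for entry in entries or []:
        if isinstance(entry, str):
            urls.add(entry)
        elif isinstance(entry, dict) and entry.get("url"):
            urls.add(entry["url"])
    return urls


def pep_is_listed(manifest: Dict[str, Any], pep_url: str) -> bool:
    """True if the primary entry page is in the reading order or the resource list."""
    bounds = listed_urls(manifest.get("readingOrder")) | listed_urls(
        manifest.get("resources")
    )
    return pep_url in bounds


if __name__ == "__main__":
    manifest = {"readingOrder": ["index.html"], "resources": [{"url": "cover.jpg"}]}
    print(pep_is_listed(manifest, "index.html"))
```

Note that a page referenced only from the "links" array would fail this check, which is the case raised earlier in the thread.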

iherman commented 5 years ago

As far as I am concerned it is a go!

toshiakikoike commented 5 years ago

Is it not necessary to define the primary entry page in the manifest?

In C.1. Simple Book (https://w3c.github.io/wpub/#simple-book), there is no definition of the primary entry page in either "readingOrder" or "resources".

Or is the HTML pointed to by "url" (by default "./index.html") treated as a primary entry page? In other words, in the "Simple Book" sample, is "https://publisher.example.org/mobydick/index.html" treated as a primary entry page?

mattgarrish commented 5 years ago

Or is the HTML pointed to by "url" ... treated as a primary entry page?

Yes.
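
As a rough illustration of that answer, a user agent could resolve the manifest's "url" value against a base URL to find the primary entry page. The base URL used below and the helper name `primary_entry_page` are assumptions for this sketch; how the base is actually determined is not covered in this thread.

```python
from typing import Any, Dict
from urllib.parse import urljoin


def primary_entry_page(manifest: Dict[str, Any], base_url: str) -> str:
    """Resolve the manifest's "url" against base_url (hypothetical helper)."""
    # urljoin handles relative references such as "./index.html".
    return urljoin(base_url, manifest["url"])


if __name__ == "__main__":
    # Assumed base URL, chosen to match the "Simple Book" example above.
    base = "https://publisher.example.org/mobydick/"
    print(primary_entry_page({"url": "./index.html"}, base))
    # prints "https://publisher.example.org/mobydick/index.html"
```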

mattgarrish commented 5 years ago

To respond a little more completely (I was travelling earlier): the url property has to reference the primary entry page. As noted in the recent TAG review, there are a couple of paragraphs that describe what the address should reference, but those are outdated and should have been removed in this pull request. They will be removed in a pending update.