Open mattgarrish opened 5 years ago
Good catch!
my current take on it, but I am open to alternatives, is
I perceive that there is a move from what was agreed before = the possibility to link WP resources to the manifest, to a new type of linkage = the possibility to link WP resources to the PEP.
I agree with this move, as the agreement is that the PEP is the entry point of the WP, so it makes sense to use it as a "boot record".
But I don't remember that this was explicitly agreed by the WG, and some wording in the spec would have to be modified, e.g.
3.3.3 "Although any resource can link to the manifest, ..." 3.4.2.2 "With the exception of the primary entry page, linking a resource to its manifest is OPTIONAL. ..."
- 'home', that appears in the wiki page, referred to from the HTML spec, seems to be the most appropriate one...
I'm just worried about using a generic term because: 1) it won't ever be able to convey that the resource belongs to a publication (e.g., it will be ambiguous when a page is referring to its actual site home page or a publication); and 2) I'm not sure whether the idea that a resource can belong to multiple publications meshes with the idea that a page has multiple home pages or start pages (i.e., will multiple links of a generic type be ignored).
I think we should have a clear idea why we need those back links in the first place? What is the use case from a WP point of view to have such link elements?
I think we should have a clear idea why we need those back links in the first place
Given that user agents don't generally expose links, I wouldn't put much faith in that happening. But, where it seems like it might be useful would be for SEO (e.g., so that a search result could list what publications a resource belongs to).
Maybe that could just be done with isPartOf, which also seems to be in schema.org/CreativeWork? The author could wire the semantic up onto a hyperlink for the user.
@mattgarrish just to understand...
Maybe that could just be done with isPartOf, which also seems to be in schema.org/CreativeWork? The author could wire the semantic up onto a hyperlink for the user.
Meaning that the author of, say, a chapter could put standard schema.org data, in JSON-LD, RDFa, or microdata, using isPartOf? That sounds like a perfectly fine approach to me. Browsers (as far as I know) do not do anything with a <link rel='home'...> anyway...
Meaning that the author of, say, a chapter could put standard schema.org data, in JSON-LD, RDFa, or microdata, using isPartOf?
Exactly. It could just be placed on an explicit link back to the PEP, like so (hoping I have this right):
<div vocab="https://schema.org/" typeof="CreativeWork">
<a href="index.html" typeof="CreativeWork" property="isPartOf">Moby Dick</a>
</div>
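For comparison, the same isPartOf statement could be expressed in JSON-LD rather than RDFa. The following is only a sketch of how an authoring tool might generate it; the function names and the "Chapter 1" title are hypothetical, while index.html and "Moby Dick" are taken from the RDFa example above.

```python
import json


def is_part_of_jsonld(chapter_url, chapter_name, publication_url, publication_name):
    """Build a schema.org description stating that a chapter is part of a publication."""
    return {
        "@context": "https://schema.org",
        "@type": "CreativeWork",
        "@id": chapter_url,
        "name": chapter_name,
        "isPartOf": {
            "@type": "CreativeWork",
            "@id": publication_url,  # the primary entry page, as in the RDFa version
            "name": publication_name,
        },
    }


def as_script_element(data):
    """Wrap the JSON-LD in the script element an author would place in the chapter's HTML."""
    return (
        '<script type="application/ld+json">\n'
        + json.dumps(data, indent=2)
        + "\n</script>"
    )


if __name__ == "__main__":
    # "Chapter 1" and chapter1.html are illustrative values only.
    data = is_part_of_jsonld("chapter1.html", "Chapter 1", "index.html", "Moby Dick")
    print(as_script_element(data))
```

Unlike the RDFa version, this is not attached to a visible hyperlink, so the author would still need a separate link if the relationship should be exposed to the reader.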
Which just shows that schema.org does have a bunch of things to offer...
Should be part of some best practices doc, I guess. Or do you think it should be part of the main spec?
Should be part of some best practices doc, I guess. Or do you think it should be part of the main spec?
Ya, I think somewhere in between -- maybe a note that resources that need to be linked back to the PEP should use available web mechanisms, or something vague along those lines, with the actual practice in a BP doc. Would be useful in the section where we limit linking to the manifest to the PEP.
If we're not expecting any behaviour from it, we shouldn't formally introduce anything in the specification. The world changes quickly...
If we're not expecting any behaviour from it, we shouldn't formally introduce anything in the specification. The world changes quickly...
The point of the rel="publication" pointing to the publication address was for discoverability of the publication itself (i.e., its canonical address, which loads an entry page when dereferenced, which itself would contain, or point to, a manifest).
Imagine the following:
GET /moby-dick/chapter1.html
<html>
<link rel="publication" href="/moby-dick/">
</html>
GET /moby-dick/
<html>
<script type="application/ld+json" id="wpub">
{"...publication...": "...manifest..."}
</script>
</html>
Discovering the publication from chapter1.html provides a UA with the opportunity to "hoist" a reading experience and "re-navigate" to chapter1.html. The entry page (loaded from the publication's address) becomes the "brains" of the publication and contains its reading order and acts as the "runtime"/state-machine.
The manifest being external from the publication's "brain" is what introduces the weird indirection that seems to keep everyone confused. Because you then move from discovery (from chapter1) => found (publication address) => discovery (manifest) => found (publication address) => ...rinse and repeat.
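The first discovery step (chapter => publication address) might look roughly like the following on the UA side. This is a sketch only: rel="publication" is the relation proposed in this thread rather than something user agents are known to act on, and the class and function names and the example.org host are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkRelCollector(HTMLParser):
    """Collect the href of every <link> element, keyed by each of its rel tokens."""

    def __init__(self):
        super().__init__()
        self.links = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if href is None:
            return
        # rel is a space-separated, case-insensitive token list.
        for token in (attr.get("rel") or "").lower().split():
            self.links.setdefault(token, []).append(href)


def find_publication_address(chapter_url, chapter_html):
    """Return the absolute publication address a chapter points to, or None."""
    parser = LinkRelCollector()
    parser.feed(chapter_html)
    parser.close()
    hrefs = parser.links.get("publication", [])
    return urljoin(chapter_url, hrefs[0]) if hrefs else None


if __name__ == "__main__":
    chapter = '<html><link rel="publication" href="/moby-dick/"></html>'
    # example.org is a placeholder host.
    print(find_publication_address("https://example.org/moby-dick/chapter1.html", chapter))
```

The sketch takes the first rel="publication" link it finds; what a UA should do when a resource belongs to several publications is exactly the open question raised earlier in the thread.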
To accommodate that weird indirection, the manifest could get its own rel, as @iherman suggested, something like rel="publication-manifest".
So, the embedded manifest would change to:
GET /moby-dick/
<html>
<link rel="publication-manifest" href="wpub.json">
</html>
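To make the two variants concrete, here is a rough sketch of how a UA could locate the manifest once it has loaded the entry page, covering both the embedded form (script with id="wpub") and the linked form. Both rel="publication-manifest" and the wpub id are taken from the examples in this comment and are proposals, not established conventions; the class and function names and the example.org host are hypothetical.

```python
import json
from html.parser import HTMLParser
from urllib.parse import urljoin


class ManifestLocator(HTMLParser):
    """Look for an embedded manifest (script#wpub) or a publication-manifest link."""

    def __init__(self):
        super().__init__()
        self.linked_href = None
        self.embedded_text = None
        self._in_manifest_script = False

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rel_tokens = (attr.get("rel") or "").lower().split()
        if tag == "link" and "publication-manifest" in rel_tokens:
            if self.linked_href is None:
                self.linked_href = attr.get("href")
        elif (
            tag == "script"
            and attr.get("id") == "wpub"
            and (attr.get("type") or "").lower() == "application/ld+json"
        ):
            # Only this one identified script is read, not every JSON-LD block.
            self._in_manifest_script = True
            self.embedded_text = ""

    def handle_data(self, data):
        if self._in_manifest_script:
            self.embedded_text += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_manifest_script = False


def locate_manifest(entry_url, entry_html):
    """Return ("embedded", parsed_json), ("linked", absolute_url), or None."""
    locator = ManifestLocator()
    locator.feed(entry_html)
    locator.close()
    if locator.embedded_text is not None:
        return ("embedded", json.loads(locator.embedded_text))
    if locator.linked_href:
        return ("linked", urljoin(entry_url, locator.linked_href))
    return None


if __name__ == "__main__":
    linked = '<html><link rel="publication-manifest" href="wpub.json"></html>'
    # example.org is a placeholder host.
    print(locate_manifest("https://example.org/moby-dick/", linked))
```

In the linked case the UA still has to fetch wpub.json as a further step, which is the extra indirection described above.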
rels are relationships between resources, so in prose the above proposal reads:
- chapter1.html (is in the) publication /moby-dick/
- /moby-dick/ (is defined by) publication-manifest wpub.json (this bit is implicit in the embedded version)
Discovering the publication from chapter1.html provides a UA with the opportunity to "hoist" a reading experience and "re-navigate" to chapter1.html. The entry page (loaded from the publication's address) becomes the "brains" of the publication and contains its reading order and acts as the "runtime"/state-machine.
That's a theory for it, sure, but the way I read Ivan's question is whether there's any reality for it at this time. The specification doesn't detail anything more than how to harvest information from the manifest, and we have a semantic to locate it.
Where we agree is that if we end up with two semantics, we're going down entirely the wrong road. I can readily imagine the confusion we'll cause by only allowing one page to reference the manifest while every other has to reference the page that references the manifest.
But embedding doesn't remove the need for the entry page to have to identify itself as having a manifest. I don't see the day coming when user agents will parse the data of any script tag containing json-ld data anywhere on the web just to see if there might be a manifest inside. There still needs to be a trigger, even if it's a self-reference.
I'm sure this was discussed before, but I can't find the right combination of keywords to locate the discussion.
At any rate, our algorithm requires that processing of the manifest begin with the primary entry page, so we have a significant distinction between:
I clarified in one of the last PRs that only the primary entry page can link to the manifest, and added a placeholder that other resources have to link to the primary entry page. But that still leaves the question, what is the expected relation for linking to the PEP: