We need a section of the document that explicitly defines the bounds of a publication

lrosenthol commented 6 years ago

We talk about the bounds of the publication but we never explicitly define what it means, where it comes from and what a UA is to do with it (and in what specific use cases).

wareid commented 6 years ago

Speaking as a reading system, we would expect the bounds of a document to be everything an author/publisher considers the document. All of the required content (text, images, video, etc.). This can include external links to other resources (e.g., a textbook referring to a website), but those links are not essential to the document and therefore would not be listed in a file list of any kind. Should the external links fail in any way (offline, out of date links, armageddon, etc.), the core document is not diminished. I should add that I think publishers will really self-define the bounds of their documents (they're used to it); we should just specify that if they do not include something in the infoset, they should know that they are defining it as unnecessary to the core document.

iherman commented 6 years ago

I think the tension is between:

(1) having enough information for the reading systems if they do offlining/caching/packaging, and
(2) reducing the amount of redundancy for the authors, i.e., not forcing the author to list references that they already list as part of the content itself (e.g., CSS files)

Note that, elsewhere in the current draft, we've already made some steps along (2), namely for the title of the document (that can be extracted from the <title> of the primary entry page) or the Table of Contents.

(2) may suggest to define some sort of an automatism based on, say, the content files themselves; we have discussed (in Toronto) that CSS, javascript, and image files might be good candidates for such automatism: ie, the User Agent, would automatically include these resources in, essentially, the 'resources' infoset item without the necessity to list those in the corresponding manifest item. However, one of the main problems with this is the indiscriminate nature of all this: ie, including, eg, CSS files or JS libraries that are not even under the author's control.

I can see three different approaches that we could follow, in an increasing level of complexity.

All CSS, Font, Javascript, JPG, PNG, GIF, and SVG(Z) additional resources, that are referred to from a resource listed in the Default Reading Order, are automatically considered to be part of the "list of resources" infoset item if:

(1) (simplest version) the URL of the additional resource is a relative URL with the URL of the primary entry page as a 'base'
(2) (moderate complexity) the URL of the additional resource is relative to a URL listed in a (new) scope manifest item, which may itself be a relative URL with the URL of the primary entry page as a 'base'. A missing scope falls back to version (1) above.
(3) (full complexity) the URL of the additional resource matches a URL template (RFC 6570) defined in the manifest via a separate template manifest item. A missing template falls back to version (1) above. (See #67, a very old issue about templates.)

(Ie, (1) is always available, (2) or (3) means some additional possibilities for the authors.)
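
To make approaches (1) and (2) concrete, here is a minimal sketch. It is an illustration only, not proposed spec text. The function names and the `scopes` parameter are hypothetical, and it assumes that "relative to a base" can be read as "the resolved URL lies under the directory of that base". References are resolved against the referring resource, as ordinary URL resolution would do.

```typescript
/** Directory prefix of a URL, e.g. https://example.org/book/index.html -> https://example.org/book/ */
function directoryOf(url: URL): string {
  const path = url.pathname.slice(0, url.pathname.lastIndexOf("/") + 1);
  return url.origin + path;
}

/**
 * Hypothetical check: is a referenced resource implicitly part of the
 * publication under approach (1), optionally widened by approach (2)?
 */
function isImplicitlyInBounds(
  reference: string,         // URL as written in the content, possibly relative
  referringResource: string, // absolute URL of the resource containing the reference
  primaryEntryPage: string,  // absolute URL of the primary entry page
  scopes: string[] = []      // assumed "scope" manifest item, possibly relative URLs
): boolean {
  const resolved = new URL(reference, referringResource);
  const entry = new URL(primaryEntryPage);
  // Approach (1) is always available: the directory of the primary entry page.
  // Approach (2) adds every scope, resolved against the primary entry page.
  const prefixes = [directoryOf(entry)].concat(
    scopes.map((scope) => new URL(scope, entry).href)
  );
  // Compare without query string or fragment.
  const target = resolved.origin + resolved.pathname;
  return prefixes.some((prefix) => target.startsWith(prefix));
}
```

With `https://example.org/book/index.html` as the primary entry page, `css/style.css` would be picked up automatically, while `../common/base.css` would only be picked up if the author adds a scope such as `../common/`.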

For example, in the "Single Document Example" there is no need to list the local CSS and JS files in the list of resources, but some standard reference (in that case) to "central" CSS file or logos should be listed explicitly for the purpose of, say, packaging.

Personally, I'm mildly in favour of (2) above, but not sure it is worth the trouble. As I mentioned in #67, (3) may be for a future release...

bduga commented 6 years ago

@iherman For case 1, does that mean an author would have to provide base in every html file other than the entry page? What about references from non-html files, like css?

iherman commented 6 years ago

@bduga

does that mean an author would have to provide base in every html file other than the entry page?

Not sure, to be honest. That would be a solution, but it is obviously a drag for the author. That may be a good argument for the second approach, actually: by adding a list of URLs as scope(s) the author of the metadata can have a finer control and avoid the side-effects of having a separate base statement in each content file.

What about references from non-html files, like css?

You mean, for example, a css file imported by another css file? Apart from being a pain for the User Agent, I guess that can be covered, too: the imported css file has, eventually, its own URL that can be compared against the manifest for scope or the URL of the primary entry page... But you are right that this should have been mentioned in the proposal.
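
A rough sketch of what following such indirect references could look like. The `references` map is a stand-in (an assumption for illustration) for whatever a user agent learns by parsing HTML and CSS. Each reference is resolved against the URL of the file that contains it, so an imported stylesheet ends up with its own absolute URL that can then be compared against a scope or the primary entry page.

```typescript
/**
 * Hypothetical model: `references` maps an absolute resource URL to the URLs
 * it refers to directly (an HTML file to its stylesheets, a stylesheet to its
 * @import targets, and so on).
 * Returns every resource reachable from `start`, as absolute URLs.
 */
function collectTransitive(
  start: string,
  references: Map<string, string[]>
): Set<string> {
  const seen = new Set<string>();
  const queue: string[] = [start];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const ref of references.get(current) ?? []) {
      // Resolve against the referrer, not against the entry page.
      const absolute = new URL(ref, current).href;
      if (absolute !== start && !seen.has(absolute)) {
        seen.add(absolute); // also guards against import cycles
        queue.push(absolute);
      }
    }
  }
  return seen;
}
```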

BigBlueHat commented 6 years ago

We talk about the bounds of the publication but we never explicitly define what it means, where it comes from and what a UA is to do with it (and in what specific use cases).

Still not sure anything above this comment answers @lrosenthol's questions.

"bounds of the publication"

what does it mean?
where does it come from?
what is a UA supposed to do with it (and in what specific use cases)?

We keep getting caught up on the "where does it come from?" rather than the more foundational "meaning" and "usage" questions. Let's zoom back out.

TzviyaSiegman commented 6 years ago

I think it would benefit us to look at prior art for the way offlining etc is done on the Web today. Even if we do not use tools like Service Workers, we should look at how SW approaches this. Service Workers defines a scope to focus what should be offlined, which is similar to defining the bounds of the WP.

https://w3c.github.io/ServiceWorker/#dfn-service-worker-registration

A service worker registration of an identical scope url when one already exists in the user agent causes the existing service worker registration to be replaced.

iherman commented 6 years ago

@BigBlueHat, also related to what @TzviyaSiegman just said: the question, for me, is very pragmatic. When I create, say, the manifest for the W3C document that is in the example, should I add to the list of resources all the CSS files that are used by the document or not? And this is obviously related to packaging and/or offlining/caching/whatever. (For EPUB 3, the answer is 'yes', I should do it.)

What is the UA supposed to do with it (and in what specific use cases)?

I do not really see what the issue with this is: the offlining/caching/packaging is a pretty clear example of what the UA is supposed to do with it. Of course, if a particular UA does not do any of these, it can probably ignore the "bounds". Because the author would not have to do too much, that is not really an issue for her. But if the UA does those things, e.g., if the WP is packaged into an EPUB4, then we have to define very clearly what is within the bounds, which is the way I interpret the original question of @lrosenthol.

TzviyaSiegman commented 6 years ago

@iherman In the world of SW, no I do not add all CSS files, etc. I list the URL of the document to be cached. If this works for Service Workers, it should work for whatever offlining we are using.

iherman commented 6 years ago

@TzviyaSiegman I am not sure I understand... a script using service workers should know which CSS files to offline, doesn't it? I don't think that is done automatically by service workers...

iherman commented 6 years ago

@TzviyaSiegman look at the service worker script of https://hpbn.co. It lists all the assets, including svg images or css files, explicitly...

TzviyaSiegman commented 6 years ago

thanks @iherman - someone gave me incorrect information.

dauwhe commented 6 years ago

I am not sure I understand... a script using service workers should know which CSS files to offline, doesn't it? I don't think that is done automatically by service workers...

Much depends on how the service worker script was written. Most examples I've seen explicitly list the resources to be cached, including CSS files and fonts. It's also possible to write a service worker that would automatically cache resources associated with a particular HTML resource, as @BigBlueHat has done. Service workers do nothing automatically—they are just a low-level tool that developers can use to create a caching strategy.

iherman commented 6 years ago

@dauwhe

It's also possible to write a service worker that would automatically cache resources associated with a particular HTML resource

Indeed, that is what a User Agent would do. However, unless we precisely define what resources should be associated with a particular resource for the purpose of the UA, we may have problems with interoperability. Hence my proposal to define more precisely what should be part of the association (clever UA-s may decide to go beyond that, but that is a different issue).

css-meeting-bot commented 6 years ago

The Working Group just discussed Github issue 205.

The full IRC log of that discussion

<wolfgang> Topic: Github issue 205
<tzviya> github topic https://github.com/w3c/wpub/issues/205
<wolfgang> tzviya: how to offline a publication
<ivan> github topic: https://github.com/w3c/wpub/issues/205
<wolfgang> tzviya: how to define the bounds of a publication?
<wolfgang> ... how would offlining work?
<Hadrien> q+
<timCole> q+
<tzviya> ack garth
<tzviya> ack Hadrien
<duga> q+
<dkaplan3> q+
<tzviya> ack timCole
<wolfgang> hadrien: packaging or caching? 10 different ways of caching? we will never be able to say how it works
<wolfgang> tim: one of the challenges raised - publishers should be able to caching a whole wp - what do you want to cache (ref to parts)
<tzviya> ack duga
<Hadrien> +1 to what Tim said
<tzviya> +1 to brady
<ivan> +1 to brady
<ivan> q+
<wolfgang> brady: taking a complete wp offline - not equal to caching
<tzviya> ack dkaplan
<wolfgang> dkaplan: caching or packaging is part of implementation while offlining is an affordance
<tzviya> ack ivan
<wolfgang> ... premature technical issue (caching or packaging)
<wolfgang> ivan: whatever we do, we need to consider the bounds if we want to offline/cache/package - at the moment it's very vague - we have readingOrder and resources, but what about images, etc.
<dkaplan3> q+
<wolfgang> ... how can author get a level of control what is inside the wp when offlined/cached, etc.
<tzviya> ack dkaplan
<George> 6q+
<tzviya> q+ George
<tzviya> q+
<tzviya> ack George
<wolfgang> dkaplan: my concern is that every time this topic came up, the difference between caching vs. packing came up - we are talking about the affordance of offlining
<wolfgang> george: wp should have a mechanism to create an epub 4 for this wp - to offline a pub when a student goes home at night, not the same as a product for sale
<tzviya> ack tzv
<ivan> regrets+ matt
<wolfgang> tzviya: we need to know the bounds of the publication - hard to make this assessment - focus on the technical issues

BigBlueHat commented 6 years ago

There's two things going on in most ServiceWorkers--catching fetches (dictated by origin and scope) and populating a cache/storage from which to (potentially) return values.

Populating the cache/storage is what we've primarily been discussing.

It can be done exhaustively by populating the cache from a predefined list.

Or it can be done progressively by catching the fetches as they happen and populating the cache from there: https://github.com/dauwhe/html-first/blob/gh-pages/sw.js#L21-L23

In either scenario, the question is about cache/storage populating and how much the UA needs to know when in order to properly populate that storage for the right scenarios.

Consequently, we'll benefit most from defining explicit scenarios (i.e. "reader wants the whole publication" or "reader wants chapter 4" or "reader wants video 1 on page 3") and then building what's needed for each/all of those.
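
The two population strategies can be contrasted in a small model. This is a simplified in-memory sketch, not the real Service Worker or Cache API: the `Fetcher` type stands in for a network fetch and the `Map` stands in for cache storage, and both are assumptions made purely for illustration.

```typescript
// Stand-ins for the network and for cache storage (illustrative assumptions).
type Fetcher = (url: string) => string;
type Store = Map<string, string>;

// Exhaustive: populate the store up front from a predefined list.
function populateExhaustively(store: Store, urls: string[], fetcher: Fetcher): void {
  for (const url of urls) {
    store.set(url, fetcher(url));
  }
}

// Progressive: answer from the store when possible; otherwise fetch the
// resource and remember the response for next time.
function requestProgressively(store: Store, url: string, fetcher: Fetcher): string {
  const cached = store.get(url);
  if (cached !== undefined) {
    return cached;
  }
  const body = fetcher(url);
  store.set(url, body);
  return body;
}
```

In the exhaustive case the store is only as complete as the list it was given; in the progressive case it is only as complete as what the reader has already visited. Either way, the scenario decides how much has to be known in advance.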

iherman commented 6 years ago

@BigBlueHat what you describe about how user agents can do what they do in terms of caching or anything similar is perfectly fine. Describing the various scenarios is important (and is a partial answer to the original question of @lrosenthol) and should be done alongside the affordances' section.

However, at this moment we simply do not say which resources we are talking about. The only thing I am interested in here is to define, in an interoperable way, which resources come into the picture in the first place.

More exactly: we do have the list of resources and the resources in the default reading order. The question that we MUST answer is: are these to be considered as an exhaustive list for caching (or whatever similar operations, I do not care of the details right now) or, for example, search (eg, search into SVG files)?

Answering 'yes' is a consistent answer, and this is the equivalent of EPUB. It is a pain for authors but one might say that tools can generate those lists, so it is not such a big deal. On the other hand, it makes life of UA-s very easy.
Answering 'no' (which is, essentially, the case today because this is left open) leads to the question of 'What else then?'.

If the author wants to provide a WP that is prepared for various scenarios in an interoperable manner, then she must know the answer to these questions. This is not the case today.

All I did in https://github.com/w3c/wpub/issues/205#issuecomment-401766036 was to propose some possible answers to these questions: the WP would consist, in terms of offlining/caching/packaging/whatever, but also in terms of search and other possible features, of the resources in the reading order, the extra resource list, plus whatever is in https://github.com/w3c/wpub/issues/205#issuecomment-401766036 (modulo some comments of @bduga in https://github.com/w3c/wpub/issues/205#issuecomment-401806313). It strikes me as providing a balance between the author's ease of producing a WP and providing an exhaustive set of information.

(An oft quoted fact: what WP brings to the table, as a concept, is the fact of talking about a collection of resources as one conceptual unit. As a minimum we should be clear what this collection consists of...)

HadrienGardeur commented 6 years ago

More exactly: we do have the list of resources and the resources in the default reading order. The question that we MUST answer is: are these to be considered as an exhaustive list for caching (or whatever similar operations, I do not care of the details right now) or, for example, search (eg, search into SVG files)?

This shouldn't be affordance specific (caching), but yes I believe that these two lists taken together are the only real bounds for the publication.

We can discover additional resources through other means, but we can't know the intent of the author for them.

For caching specifically:

the UA SHOULD implement its own Service Worker
it SHOULD implement a network then cache policy
it SHOULD prefetch and cache all resources listed in the reading order and list of resources
it MAY prerender in the background all resources from the reading order to make sure that additional resources are prefetched and cached as well

I think that this is the most that we can do. Anything more than a "network then cache" policy could interfere with the expected behaviour of the publication and we can't require all UAs to prerender all resources in the background (this is very CPU/RAM intensive and should be decided based on the device being used).
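
One way to read the "network then cache" policy is: always try the network first, keep a copy of what comes back, and fall back to the stored copy only when the network fails. The sketch below models just that decision. It is synchronous and in-memory for clarity; `NetworkFetch` and the `Map` are assumed stand-ins, not the actual `fetch` or Cache API.

```typescript
// Assumed stand-in for a network fetch: throws when the network is unavailable.
type NetworkFetch = (url: string) => string;

function networkThenCache(
  url: string,
  network: NetworkFetch,
  cache: Map<string, string>
): string | undefined {
  try {
    const fresh = network(url);
    cache.set(url, fresh); // keep the latest copy for offline use
    return fresh;
  } catch {
    // Offline or failed request: return the cached copy, if there is one.
    return cache.get(url);
  }
}
```

Under this reading, a resource that was never requested while online is simply unavailable offline, which is why prefetching the reading order and resource list matters.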

Packaging is a separate issue, but if we adopt ZIP for EPUB4 this will require quite a lot of processing on the UA's part in order to rewrite references to various resources in the reading order and list of resources. We might want to keep packaging on the side until Web Packaging is ready for primetime.

iherman commented 6 years ago

@HadrienGardeur just to be very specific and see if I understand your intention.

If I create a WP out of (say) our own W3C WP draft (something that a recently added javascript extension to respec already does), and my intention is that I should be able to read the draft on the plane (via some suitable WP extension in my browser or some other additional service), am I then supposed to list all the CSS and image files that are in the "common" subdirectory of our specification in the "resources" array in the manifest?

I realize this is doable, but it is nevertheless a drag for the author. I also realize this is how EPUB3 has been defined. My intention is, however, to make life easier for those who author such a WP for a, I believe, fairly frequent usage case.

HadrienGardeur commented 6 years ago

@iherman not necessarily.

If the UA implements all the things that I've listed above for caching, it'll work offline entirely even if you don't include all CSS/JS/images/fonts in the manifest. Search and other affordances though might be limited strictly to what's in the manifest.

iherman commented 6 years ago

@HadrienGardeur

If the UA implements all the things that I've listed above for caching, it'll work offline entirely even if you don't include all CSS/JS/images/fonts in the manifest.

You mean "it will work" because it can be displayed without any styling, or it will work with all the styling because the UA gathers all the CSS and image files? I presume, if the latter, this is not something done by some magic (ie, by some system code) but because the UA has this all encoded. Am I missing something?

If the UA does the gathering, it must have some of its own, ad-hoc policies, which may become an issue in interoperability. How does it know which files, referred to from a top level content, should be cached and which one should not?

HadrienGardeur commented 6 years ago

@iherman it will work with styling as well because the UA:

cached these resources the first time that you displayed a given resource from the reading order
or because it prerendered all resources from the reading order in the background and cached associated resources accordingly

For the policy, that's why I recommended using a simple "network then cache" policy for the Service Worker, to avoid as much as possible interfering with the HTTP headers in each resource's response.

iherman commented 6 years ago

@HadrienGardeur if this indeed works that smoothly, I am fine with this and we can drop (at least this part of) the issue, though some form of an (informal) description of this may be useful in the draft.

However, looking ahead to EPUB4 which is, in my view, simply a packaged version of WP, this means that the tool for the packaging itself will have to do/simulate those policies (or the author has to be much more explicit if packaging is also a goal). Again, this can obviously be done, but it may make the tool more complex. (I am thinking in terms of something like ZIP, not in terms of Web Packaging, which may still be way down the line...)

HadrienGardeur commented 6 years ago

@iherman well it's not always simple...

We can't expect the UA to always prerender everything in the background by default for instance, this is too CPU/RAM expensive.

Based on the device and the browser, we'll also have various limitations for the size of our cache. It's very likely that large resources (audio/video) won't be cached for that reason. There's also a risk that a browser could purge its cache after a period of time.

On the packaging side of things, it would be a much better option for "long term" storage of publications that you'd like to read offline. That said, I don't think it's doable with ZIP, or at least it will be very complicated.

Packaging is IMO only viable once Web Packaging is available.

There's a good reason why I'm always dividing things into caching/packaging, they impact the user experience considerably, it's not just a technical issue.

iherman commented 6 years ago

@HadrienGardeur, o.k. But what should then be, in your view, the answer to the original question (in this respect) of the issue? And what should be a reasonable strategy for an author when creating a manifest with the list of resources? This is still blurry to me...

As for ZIP: I may be wrong, but I think that the community will vote for ZIP for EPUB4. I am not sure Web Packaging will be mature enough with enough tools around to base PWP on it (at least in the lifetime of this Working Group). Alas!, I would add.

HadrienGardeur commented 6 years ago

@iherman if you want to be sure that your resources will be cached, you must include them in the list of resources.

If not, there's always a risk that they won't be available offline.

It's also important to point out that large resources may not be cached anyway, no matter if you include them in the list of resources or not.

For packaging in a ZIP, we would either need to:

rewrite URLs in all HTML, CSS, SVG and JS resources
map URLs to a local path in a ZIP

Both of these solutions will require a lot of work and I'm not sure they would be possible on all platforms.
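
For illustration, the second option (mapping URLs to local paths) might start from something like the sketch below. The layout is entirely hypothetical: it assumes the host becomes a top-level directory in the ZIP and that a URL ending in `/` maps to an `index.html` file. It ignores query strings, so two URLs differing only in their query would collide, which is one of the reasons this is harder than it looks.

```typescript
// Hypothetical mapping from an absolute resource URL to a path inside a ZIP.
function zipPathFor(resourceUrl: string): string {
  const url = new URL(resourceUrl);
  // Assumption: directory-style URLs are stored as index.html.
  const path = url.pathname.endsWith("/") ? url.pathname + "index.html" : url.pathname;
  return url.host + path;
}

// Build a lookup table that a packaged reader could consult when a URL is requested.
function buildZipIndex(urls: string[]): Map<string, string> {
  const index = new Map<string, string>();
  for (const url of urls) {
    index.set(url, zipPathFor(url));
  }
  return index;
}
```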

I really think that we can't reasonably expect to be able to package a WP without Web Packaging being widely available in browsers.

GarthConboy commented 6 years ago

I have been leaning more toward Zip for EPUB4. I don't find the URL issues above to be insurmountable. But, that's a bridge we can burn somewhat later.

HadrienGardeur commented 6 years ago

@GarthConboy I'm leaning towards ZIP for EPUB4 as well, but I still think that this impacts our ability to create a PWP from a WP.

Web Packaging is really designed for this specific use case, since it truly extends how HTTP normally works in this context.

For the two options with ZIP that I listed above:

rewriting URLs is prone to error and raises some serious security issues as well
serving local packaged resources when a specific URL is requested could be handled using a Service Worker, but I'm still concerned about our inability to serve the proper HTTP headers in these responses

dauwhe commented 6 years ago

We talk about the bounds of the publication but we never explicitly define what it means, where it comes from and what a UA is to do with it

Emphasis mine.

What happens if a user clicks a link in a web publication that points outside the web publication? Presumably that means leaving the publication mode (see also #276). Is "publication mode" a top-level browsing context? Do we need to define things the way that WAM defines navigating beyond the scope of a web app?

If the URL of the resource being loaded in the navigation is not within scope of the navigation scope of the application context's manifest, then the user agent MUST behave as if the application context is not allowed to navigate. This provides the ability for the user agent to perform the navigation in a different browsing context, or in a different user agent entirely. If during the handle redirects step of HTML's navigate algorithm the redirect URL is not within scope of the navigation scope of the application context's manifest, abort HTML's navigation algorithm with a SecurityError.
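
As I read the Web App Manifest text, "within scope" boils down to a same-origin check plus a path-prefix check. A minimal sketch of that test follows; the function name is mine, and applying it to a Web Publication is only an analogy, not something the draft defines.

```typescript
// Sketch of a "within scope" test: same origin, and the target path starts
// with the scope path. Both arguments are expected to be absolute URLs.
function isWithinScope(target: string, scope: string): boolean {
  const targetUrl = new URL(target);
  const scopeUrl = new URL(scope);
  return (
    targetUrl.origin === scopeUrl.origin &&
    targetUrl.pathname.startsWith(scopeUrl.pathname)
  );
}
```

A user agent following this pattern would stay in "publication mode" for `https://example.org/book/ch2.html` given the scope `https://example.org/book/`, and hand anything else to a different browsing context.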

mattgarrish commented 6 years ago

We started to consider this when we were looking at the reading order and resource list and had this prose for external resources:

If a user agent encounters a resource that it cannot locate in the resource list, it MUST treat the resource as external to the Web Publication (e.g., it might alert the user before loading, open the resource in a new window, or unload the current Web Publication and resume normal Web browsing).

But according to a note in the document, the text was pulled during the last f2f.
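
A sketch of what that check could look like if, as suggested earlier in this thread, the reading order and the resource list taken together are treated as the bounds. The `PublicationManifest` shape (plain arrays of URL strings) and the choice of base URL are assumptions for illustration; fragments are dropped so that a link to a section of a listed resource still counts as internal.

```typescript
// Assumed, simplified manifest shape: two lists of (possibly relative) URLs.
interface PublicationManifest {
  readingOrder: string[];
  resources: string[];
}

// Resolve against a base and drop the fragment identifier.
function normalizeUrl(url: string, base: string): string {
  const resolved = new URL(url, base);
  resolved.hash = "";
  return resolved.href;
}

// The bounds are the union of the reading order and the resource list.
function publicationBounds(manifest: PublicationManifest, base: string): Set<string> {
  const bounds = new Set<string>();
  for (const entry of manifest.readingOrder.concat(manifest.resources)) {
    bounds.add(normalizeUrl(entry, base));
  }
  return bounds;
}

// Anything not in the bounds is treated as external to the publication.
function isExternal(url: string, bounds: Set<string>, base: string): boolean {
  return !bounds.has(normalizeUrl(url, base));
}
```

What the user agent does once `isExternal` returns true (alert the user, open a new window, leave the publication) is a separate question from how the set itself is computed.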

iherman commented 6 years ago

This issue will be discussed at the F2F; I thought I would share a specific example that may help our discussion. This example is not a traditional book, it is a scholarly article. (@atyposh and @TzviyaSiegman will appreciate...).

Look at https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.2006229. It is a bona-fide scientific article. Note that the article (what we would call a WP) is not the Web site you see; the site contains other items, including (at least on my screen) an advertisement for Leica and a CFP for another journal. In our terminology it is a reading system that displays the publication, which also has a number of other user interface goodies (e.g., download the paper in PDF...).

The publication contains the HTML text, but also links to a number of other resources: if you scroll down there are figures with links to larger versions, to a PP slide and, actually, when clicking on a figure it shows a separate panel that allows zooming into an image (ie, I suspect those panels are separate resources with some Javascript). There are also data referred to from the paper, eg., at https://doi.org/10.1371/journal.pbio.2006229.s017 (this is an excel sheet). In my book, all of these are part of the publication.

Obviously, I want to read this paper and look at the data and zoomed images offline. (Note that the reader mode, at least in Mozilla, is very crude for anything beyond the pure text, it is not appropriate for a thorough read. Nor is the PDF version, for that matter.) But I do not necessarily want the Leica advertisement, the CFP, or even the download link to PDF. So the 'boundaries' must be clearly established. Some questions/comments:

What are the boundaries of this paper? I think the HTML+CSS content, plus all the images, PP and Excel files, and the JS necessary to control that zooming panel. (There may be more.)
It seems to be unavoidable that the 'resources' should list at least the PP and Excel file, which are simply listed as DOI-s without any further hint (in the URL) as for their type. They are in our 'resources' list and not part of the ReadingOrder.
What about CSS? It is difficult to judge because the display of the article does not 'separate' the content of the article from, e.g., the commercial. In its present form, the CSS relevant to the paper and the paper only cannot be scraped, nor can the references to images be separated from the images that are used for other purposes on the screen. What this tells me is that this is one of the cases where all resources may have to be explicitly listed to establish the boundaries.

There may be other good questions... I found the example interesting.

JayPanoz commented 6 years ago

@iherman At the core of your wonderings, that’s probably where web origin policies (same-origin, CORS, etc.) are very likely to come into play, at least in browsers – if that can answer some questions.

I’d say taking a look at reading modes is not necessarily the best idea there, because they have other goals e.g. stripping ads, JS, CSS etc. and put a lot of heuristics in place to achieve those goals.

We’re going back to the opaque origin issue. To put it simply, if it’s opaque it’s probably out of bounds. But once again that’s for browser vendors/user agents to confirm – you can’t really tell what they will do under the existing policies.

lrosenthol commented 6 years ago

I think we need to start by understanding who determines the boundaries of the WP. IMO, they are defined by the author of the WP! With that being the case, then what you as the reader want (no ads, etc.) doesn't matter and we don't need to make any decisions. We simply need to allow the author a way to define them.

It might be an extra feature of a UA to offer you a choice of which things to take offline (instead of all things) - but the starting point isn't the reader.

iherman commented 6 years ago

This issue was discussed in a meeting.

RESOLVED: The union of the resource list and default reading order represents the definitive list of resources that belong to the Web Publication. All other resources are external to the Web Publication & close #205
View the transcript
Tzviya Siegman: Reading from the agenda
Dave Cramer: Need to put some effort into describing this in an operational way
Dave Cramer: https://github.com/w3c/wpub/issues/194#issuecomment-428662128
Dave Cramer: Say I am in a WP context and click a link to another WP?
… what happens then?
… How do we discard a manifest?
… Easy to talk about what boundaries are, but what do they mean?
Leonard Rosenthol: The concept I agree with
… Thinking about boundaries from UX perspective
… eg my goal is to search this publication - what is a WP to accomplish that
… Look at use cases as they relate to boundaries
Benjamin Young: Dave mentioned UX. We talked about constraints on UAs
… Biggest one we have to consider is what happens when you cross the boundary
… There is some precedent in the web, eg web manifest
… inverse is iFrames, you pull things into your scope
… third is target:blank which insists you leave the publication
… Anything out of scope takes you to a browser context
Hadrien Gardeur: Glad to hear this example, we had inter document linking discussions at epub
… Nice to finally be able to link between pubs
… Earlier we didn’t have the right terminology to discuss the boundaries
… No longer true. The scope is now expressed.
… Agree we have established patterns for what happens
… no need to reinvent the wheel
… When you are no longer in the bounds of the pub, the affordances are no longer available
Tzviya Siegman: We seem to be agreeing
… Goal is to address issue 205
… Maybe we are done with this?
… maybe we just need to be more specific about what happens and the UX for when we leave the pub
Garth Conboy: Searching - maybe that is a should, clearly you need bounds for that
… Are we comfortable saying we are now done?
Leonard Rosenthol: Concerned we are talking about 2 different things regarding bounds
… 1 is what the UA understands are the bounds (eg for search)
… Seems clear why we need that
… Issues around exiting and entering is a completely different issue
… Nothing to do with the actual bounds
… Just a UX issue, which is still important
… Look at both, but do not combine
Romain Deltour: Security issues - what happens when you move between origins
… Origin historically undefined in epub world
… Pubs can share local storage, etc
… Bounds is an opportunity for us to tackle this issue
… Do you have to examine every resource to determine origins?
Ralph Swick: -> https://github.com/w3c/wpub/issues/205 205 - We need a section of the document that explicitly defines the bounds of a publication
Ivan Herman: Issue of bounds depends on what we discussed before the break
… May be ok to say search is only for things in the resource lists
… but may not be true for offlining
Hadrien Gardeur: But those are the same bounds?
Ivan Herman: But do we need to list eg CSS?
Dave Cramer: case 194 talks about links to items in multiple pubs?
… Do you need to define the various combinations of navigation actions?
Benjamin Young: How ready are we to decide things like experiential actions like
… clicking on something outside of bounds is different than inside?
… Are we at a place to deal with that now?
Liisa McCloy-Kelley: Yes, we are!
… If you are in the bounds you should know that
… There should be some experiential way of knowing I am navigating from something I “own” to something I don’t
Dave Cramer: Important web principle - how can a user trust their content (or not)?
Benjamin Young: +1 to experience mapping to user trust
Dave Cramer: web app manifest has a lot on this, about indicating to user they are in some special mode
Benjamin Young: There was a mention of web apps being similar but non-standard
… There is no consistency promise
… Web pubs should have more trust - clear you are in the pub
… Adding that expression of trust is valuable to publishing
Tzviya Siegman: We are revisiting why we need boundaries
… But we have already discussed that
… Need for security, offlining, wayfinding, etc
… Need to focus on how not why, people!
Luc Audrain: User trust is fine, but also need to consider author trust
… Something has been “published”
… Bounds are important to verify that
Hadrien Gardeur: In the case of web apps, it is similar to how epub RS often work
… You have a context, when you go out of it, may open a browser or web view
… so you are now in a different UX context
… Compared to web once I have switched, I don’t have the same expectation of how I get back
… Web apps often don’t really support back
… From a UX standpoint fairly common way of handling bounds
Leonard Rosenthol: Take a use case, say offlining
… Use as an example to figure out what we need
… Have default reading, resource list, etc
… [reading from spec]
… “The bounds are defined as the union of resource list and default reading order”
Benjamin Young: Before we go there
… We have avoided UA requirement so far
… Should we define those now?
… Things like leaving the pub, etc
… Should we just file issues and make Josh do them?
Tzviya Siegman: Yes
Dave Cramer: Can we define expectations?
… Eg if you do search, these are the ones you should search
… Those can be tested
… Have an operational definition instead of saying “this is a boundary”
… Breaking the back button is really bad
Hadrien Gardeur: I was just pointing out how web apps work
Dave Cramer: I would be unhappy with pubs that did that
Joshua Pyle: Is it possible you have something in bounds that is not in the reading order?
chorus of voices: yes
Tzviya Siegman: https://w3c.github.io/wpub/#resource-list
Joshua Pyle: Does that mean what Leonard says was wrong?
Ivan Herman: No, you had it wrong, it is the union
Ivan Herman: I am fine putting this into a resolution
… and then we can close an issue
… it puts responsibility on the authors that if they expect eg offline to work
… then they better put the CSS in the resources
… Which is fine, but we need to decide
… I propose we do it now!
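To illustrate the author responsibility described here, the following sketch flags stylesheets, scripts and images that a content page references but that are not listed in the bounds, and which an offlining user agent could therefore skip. It assumes the bounds are a set of absolute URLs; `unlisted_dependencies` is a hypothetical name, and only a few element types are inspected.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class _RefCollector(HTMLParser):
    """Collect stylesheet, script and image references from one HTML page."""

    def __init__(self):
        super().__init__()
        self.refs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link":
            rel = (attrs.get("rel") or "").lower().split()
            if "stylesheet" in rel and attrs.get("href"):
                self.refs.append(attrs["href"])
        elif tag in ("script", "img") and attrs.get("src"):
            self.refs.append(attrs["src"])


def unlisted_dependencies(page_url, html, bounds):
    """Return referenced URLs that are not part of the bounds (hypothetical helper)."""
    collector = _RefCollector()
    collector.feed(html)
    # Resolve relative references against the page URL and drop fragments.
    absolute = {urldefrag(urljoin(page_url, ref)).url for ref in collector.refs}
    return sorted(absolute - set(bounds))
```

An authoring tool could run a check like this and warn when, for example, a CSS file is used by a page in the reading order but missing from the resource list.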
Garth Conboy: Agree
… Better flesh out your resources
Leonard Rosenthol: I support that
… The things we need to iterate on are various things we have discussed
Proposed resolution: leave draft language as is, change from a note to text, close the issue (Brady Duga)
Leonard Rosenthol: Change the language to “this defines the bounds of the publication”
Dave Cramer: What happens if you have a resource that links to CSS outside the bounds
Ivan Herman: What happens today if you have something in the cache that refers to an external file?
… That is totes the same
Dave Cramer: I am ok with not required
Tzviya Siegman: Objections?
Romain Deltour: Need to understand what happens when there are multiple origins
Ivan Herman: Need extra constraints on resources
Leonard Rosenthol: You are viewing the bounds wrong
… The fact that you can reference external CSS is irrelevant
… Bounds are what the list says, not where they are
… How we deal with bounds is another question
Benjamin Young: There is some prior art that is painful
… eg app manifest
… which is being replaced by service workers
… No master list, it just puts referenced things in the cache
… No need for boundaries
… Further constrains a service worker
Brady Duga: these lists of resources are all great
… once upon a time there was a thing called epub
… we had a manifest
… then we had a package
… which also had a list of files within the zip format
… and the only thing that we got from the manifest
… was errors
… when you’re not considering packaging, manifests sound great
… but when you get to packaging, you probably will hate the manifest
Tzviya Siegman: do you want these to be the same in WPUB and EPUB4?
Brady Duga: probably not
Garth Conboy: A wp doesn’t need to be packaged
… but we could make a rule that the list is expunged by the time we package
… I thought we were sort of close to agreeing that the bounds was the union
Leonard Rosenthol: Didn’t we agree?
Liisa McCloy-Kelley: We did have call for objections
… I disagree with duga, I think it was very useful to have a list of resources
… and am not opposed to it in WP
Dave Cramer: Does the current web packaging spec have features that support WP?
… Is there something there that we should pay attention to?
Tzviya Siegman: dauwhe++
Dave Cramer: Having some alignment with web packaging would be lovely
… Hope we can coordinate with them
Romain Deltour: +1
Tzviya Siegman: The union of the resource list and default reading order represents the definitive list of resources that belong to the Web Publication. All other resources are external to the Web Publication.
Ivan Herman: +1
Tzviya Siegman: Can we agree with the statement in the spec now [reading from spec]?
… “The union of the resource list and default reading order represents the definitive list of resources that belong to the Web Publication. All other resources are external to the Web Publication.”
… Do we agree?
Avneesh Singh: +1
Luc Audrain: +1
Gregorio Pellegrino: +1
Wolfgang Schindler: +1
Ivan Herman: Do we close 205 with this?
Tzviya Siegman: Yes?
Dave Cramer: +1 in that this statement is necessary, but not sufficient. There is more work to be done with boundaries and user experiences
Wendy Reid: +1
Resolution #2: The union of the resource list and default reading order represents the definitive list of resources that belong to the Web Publication. All other resources are external to the Web Publication & close #205
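The resolution can be read operationally. As a minimal sketch, assuming the default reading order and the resource list are both given as lists of absolute URLs, the bounds are a plain set union and anything else is external. The function names are hypothetical and not taken from the draft.

```python
from urllib.parse import urldefrag


def publication_bounds(reading_order, resources):
    """Bounds = union of the default reading order and the resource list."""
    return set(reading_order) | set(resources)


def is_in_bounds(url, bounds):
    """A link into a listed resource stays in bounds even if it carries a fragment."""
    return urldefrag(url).url in bounds
```

A user agent could use the same test to scope search or to decide whether following a link leaves the publication.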
Tzviya Siegman: Objections?
Wendy Reid: also +1 dauwhe
Benjamin Young: Not -1
Leonard Rosenthol: +1
Benjamin Young: Don’t want to be the only negative one
… also discussed a CG for exploring this
… and kind of concerned about this
… Opposed because it is underexplored, has security ramifications, etc
… We are pushing ahead due to time, but we need an outlet to properly vet these things
Romain Deltour: +1 to what @bigbluehat said
Wendy Reid: Dave said something like that in his +1
Hadrien Gardeur: +1
Ivan Herman: If new problems come up, it is in our right to reopen
… but don’t want to keep issues open forever
Laurent Le Meur: +1
Joshua Pyle: +1
Garth Conboy: +1
Ivan Herman: at this moment uncertainty is bad
Tzviya Siegman: Now proposing the PCG!
Laurent Le Meur: +1 to Ivan. We have to study implication of this definition of boundaries but can use it as a ground.
Tzviya Siegman: lunch time!
Ivan Herman: —- LUNCH —-
Tzviya Siegman: https://www.w3.org/community/blog/2018/10/22/proposed-group-publishing-community-group/
