w3c / wpub

W3C Web Publications
https://w3c.github.io/wpub/

Proposed changes / additions to WAM #127

Closed TzviyaSiegman closed 6 years ago

TzviyaSiegman commented 6 years ago

In https://github.com/w3c/wpub/issues/32 we discuss many reasons why Web App Manifest should or should not be extended. Let's use this issue to propose changes or extensions that PWG requires in order for us to adopt it for WPUB.

baldurbjarnason commented 6 years ago

Off the top of my head, here is one possible breakdown of issues—which admittedly consists mostly of questions so might not be particularly helpful:

Even though we have discussed many of these questions before, it seems likely that they will have to be re-litigated, so to speak, due to the change in constituency. That is, when you add a bunch of people from Web Platform Working Group to the discussion you can't trust that prior consensus will hold. WAM has pre-existing stakeholders who need to be a part of the discussion before we decide what specific changes need to be made to the WAM spec.

We also need to be open to the idea that some of these features will have to be primarily implemented as open source scripts on top of the web platform and don't need to be baked into the platform itself.

Offline and reading order specifically look like issues that could be entirely solved by scripts. Possibly even the ToC. Linking to extended metadata could be done from the start_url without having to extend the WAM.

Anyway, apologies for mostly having questions at this point.

BigBlueHat commented 6 years ago

@baldurbjarnason those might be best filed as separate issues pointing here--as it's unlikely we can address all of them in a single issue. 😃 I think @TzviyaSiegman was looking for actionable proposals for changes and/or extensions to the Web App Manifest format and processing.

TzviyaSiegman commented 6 years ago

Thanks @baldurbjarnason, but I was hoping we could identify individual items in WAM that we might need to change or expand upon. Would it be possible to point to specifics in WAM? I don't think (for example) that offline is a WAM issue.

BigBlueHat commented 6 years ago

One key extension point that seems easy to imagine is a new value for the display member property: https://www.w3.org/TR/appmanifest/#display-member

That property provides a statement of desired UX expectations. Current values are: fullscreen, standalone, minimal-ui, and browser.

One might imagine a publication or reader display mode that would enable linear-progression UX features, progression state storage ("2/3rds of the way down resource Y"), and other things being discussed in #52.
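To make the idea concrete, here is a minimal sketch of how a user agent might resolve such a value. The fallback order for the four existing values follows the WAM display member. The `publication` value and its fallback to `minimal-ui` are assumptions made for illustration and are not part of any spec.

```python
# Fallback chain for the display member. The four existing values follow
# the Web App Manifest spec; "publication" is a hypothetical value and its
# fallback target is an assumption made for this sketch.
FALLBACK = {
    "publication": "minimal-ui",  # hypothetical
    "fullscreen": "standalone",
    "standalone": "minimal-ui",
    "minimal-ui": "browser",
    "browser": None,
}


def resolve_display(requested: str, supported: set[str]) -> str:
    """Walk the fallback chain until a supported display mode is found."""
    # Unknown values are treated like "browser".
    mode = requested if requested in FALLBACK else "browser"
    while mode is not None and mode not in supported:
        mode = FALLBACK[mode]
    # "browser" is treated as always available.
    return mode or "browser"
```

A user agent that only knows `standalone` and `browser` would end up in `browser` for a manifest that asks for `publication`, because the assumed chain goes through `minimal-ui` first.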

TzviyaSiegman commented 6 years ago

Based on @HadrienGardeur's work on https://github.com/w3c/wpub/issues/118, we would recommend adding creator or author.

The creator member can appear 0 or more times and should be modeled on existing creator structures such as http://dublincore.org/documents/dcmi-terms/#elements-creator or http://schema.org/creator
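A rough sketch of what "0 or more times" could mean for a consumer follows. The member name `creator` is the one proposed here. The accepted shapes (a string, an object with a `name` key, or a list of either) are assumptions for illustration only.

```python
from typing import Any


def normalize_creators(manifest: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the creators of a manifest as a list of {"name": ...} objects.

    Accepts a missing member, a single string, a single object, or a list
    mixing both. These shapes are assumptions made for this sketch.
    """
    raw = manifest.get("creator", [])
    if not isinstance(raw, list):
        raw = [raw]
    creators = []
    for item in raw:
        if isinstance(item, str):
            creators.append({"name": item})
        elif isinstance(item, dict) and "name" in item:
            creators.append(item)
        # Anything else is ignored rather than treated as an error.
    return creators
```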

baldurbjarnason commented 6 years ago

On adding a new value for the display member property: I worry that tying too many UI affordances to a single web app display mode will be unpopular because then you bring modality back to HTML itself.

WAM display modes are more about dictating the chrome provided by the browser, leaving mode-relevant styling and behaviour up to the website itself using things like the display-mode media query.

We might have more success with proposing a display-mode that only affects browser chrome (no page progression, state storage or the like). As in, the only change would be a change in display-mode that the website can detect and then implement publication things itself, as well as a button in the chrome that toggles you to and back from the ToC for that specific resource. The ToC could even be dictated on a resource-by-resource basis using a link rel=contents. This would be relatively simple for browsers to implement, be a clear and overt indicator of 'publication-ness' to users and would let us bootstrap publication-specific features using scripts.

Couple that with creator and cover properties as well as improved internationalisation (which benefits everybody) and you have the most minimum viable publication specification I can think of. It would both be relatively simple for browsers to implement and would give us something concrete to build on.

In many ways such a solution would be independent of a more fully featured manifest like the Readium manifest. We could use something based on the Readium manifest format to configure the publication scripts and provide a long-term pathway towards specifying more publishing-oriented formats that aren't directly tied to service workers and the WAM, like EPUB4.

TzviyaSiegman commented 6 years ago

Further comments based on #118: add published and modified for the publication date and the last modification date.

TzviyaSiegman commented 6 years ago

@baldurbjarnason

On adding a new value for the display member property...

This is not a proposal to move forward with this method. But, if we do propose changes to WAM, this is a place to list them and discuss their merits. Please discuss whether to work with WAM in https://github.com/w3c/wpub/issues/32

baldurbjarnason commented 6 years ago

@TzviyaSiegman My second comment was not on whether to work with WAM. It was on the specifics of a proposed change to the WAM: display-mode coupled with other additions like creator and cover.

js-choi commented 6 years ago

WAM has an existing extension mechanism. (It also used to use JSON-LD, which is intrinsically extensible, but that was removed a while back.) For each missing use case here, WAM’s existing extension mechanism should be considered, versus modifying the WAM format itself. If the mechanism is inadequate, then requesting changes to the mechanism to make it more powerful may also be an option.

llemeurfr commented 6 years ago

First, a note: it is too bad that the JSON-LD extension mechanism (a W3C Recommendation, part of the Web) was removed; it would have given clear guidelines for modeling extensions.

The first evolution we could request would be to add a categorization property (which could be called "type") that will trigger specific behaviors of the user agent like:

Rationale: As a user, I want to be able to catalog publications in a specific screen of my user agent, which can be called a 'bookshelf'. I want to be able to classify, filter, sort publications using their exposed metadata. Opening a publication should trigger a reader mode, which will use my preferred user settings (font size etc., selected once for all publications I will read).

llemeurfr commented 6 years ago

The second evolution we must request is enhancement of the i18n feature of WAM. As expressed by our Japanese colleagues, even in an online environment where a query can request a selected language, several string properties - like the title of a publication - must be provided with alternative scripts:

We've got 2 open issues on the subject: https://github.com/w3c/wpub/issues/1 https://github.com/w3c/wpub/issues/124

and @HadrienGardeur expressed this again in https://github.com/w3c/wpub/issues/32#issuecomment-362341412.

ps: this is not related to the WAM use-case (online only), but we consider the manifest as a constituent of EPUB 4, i.e. a generic interchange format. In this use case, the capability to support N alternative language expressions in one publication is a PWP must (and should be explicitly stated in the PWP requirements, by the way).

HadrienGardeur commented 6 years ago

I'm still working on the lifecycle branch of this repo, but @TzviyaSiegman do you want me to summarize all the things I've listed in #118 in a single Markdown document?

I went through our infoset elements one by one, this should in theory cover all proposed additions/changes to the WAM.

rdeltour commented 6 years ago

Like @js-choi said, WAM has an existing extension mechanism, so we need to think about where this can be used, and where this cannot. Basically, WAM’s extension model allows us to define new manifest members (and their processing logic). It doesn’t allow us to modify current members’ values, nor to change the obtaining/processing/applying/updating logic; we have to decide whether we’d need to extend these non-extensible-by-design things, and where.

My take on what we can envision:

Possible changes in current WAM members

Possible new members

Extensions to the lifecycle?

I’m not sure if we need any, TBD. For instance, Applying the manifest says the manifest is applied to a top-level browsing context, maybe that needs to be extended/modified to cater to reading systems’ requirements?

(edited to add reading order and toc)

HadrienGardeur commented 6 years ago

Regarding the display member, I really don't think that we can expect a new value such as publication or reader to trigger things like offline access or progression handling.

The display member is all about the chrome of the browser and the app: it decides which UI elements are displayed for the Web App.

At best, this could trigger the current reader mode available in browsers and enable affordances for user styles and TTS.

llemeurfr commented 6 years ago

@rdeltour can you detail what can be expected from the minimal-ui fallback display mode, relative to the capability to navigate between resources in the web publication? If the answer is nothing, I don't see how such fallback can be of any use, as it would totally break the user experience. In such a case better having a 'no support' message.

llemeurfr commented 6 years ago

@rdeltour, I agree this is not about WAM modifications per se, but you omit to say that we would still have to define the publication lifecycle, i.e. how a "publication" user agent must handle reading order, user settings, side-toc, pagination modes ... what makes the reading experience specific. Not a small work, and we have to do it in any case.

rdeltour commented 6 years ago

@HadrienGardeur:

Regarding the display member, I really don't think that we can expect a new value such as publication or reader to trigger things like offline access or progression handling. The display member is all about the chrome of the browser and the app: it decides which UI elements are displayed for the Web App.

I absolutely agree, display-mode is purely UI and can’t be used alone to trigger anything else (that’s why another member like type or kind would be needed too).

@llemeurfr:

can you detail what can be expected from the minimal-ui fallback display mode

Basically what the WAM says about minimal-ui. Something lying between browser and fullscreen.

I don't see how such fallback can be of any use, as it would totally break the user experience. In such a case better having a 'no support' message.

I would hope that viewing the publication in the browser would always be a reasonable fallback rather than showing a "no support" message!

you omit to say that we would still have to define the publication lifecycle, i.e. how a "publication" user agent must handle reading order, user settings, side-toc, pagination modes ... what makes the reading experience specific. Not a small work, and we have to do it in any case.

Oh, absolutely. But as you say 1. it’s not the topic of this thread and 2. we have to do that anyway, so it’s not really an argument in favor or against using WAM (except for the part where WAM may provide a framework for this lifecycle logic).

iherman commented 6 years ago

@rdeltour, thanks for the list. I believe we really ought to have a consensus on such a list. Three things I would like to add/comment.

  1. I think we should re-raise the JSON-LD issue. Using JSON-LD only means adding a @context line into the JSON file, non-JSON-LD consumer can simply ignore it. However, metadata usage is deeply rooted in the business practices of this community, and JSON-LD is the metadata syntax understood by schema.org. (Actually, the issue on the script element may also be relevant here.)
  2. Although metadata is important, we should not impose complex metadata incorporated in a manifest instance (like putting ONIX or complex accessibility metadata into it). Instead, there should be an extra key that links to external metadata files which can be in any syntax.
  3. On the I18N issues: I plan (and hopefully can do it later today) to propose a PR for the draft along three lines (and these modifications are necessary no matter what):

    1. Each textual value must be “Localizable” (in the sense described in the string-meta draft). This means it SHOULD be possible to “annotate” each value with language and direction;
    2. It SHOULD be possible for each corresponding key to have several possible values, with the possible restrictions that the values should be in different languages in the sense of having different language tags (that would cover the case of the same author name in different scripts)

    The current lang and dir entries in the infoset (or the WAM) would be retained, with the semantics of providing the default language and direction for any textual value that does not explicitly set those.

These are all normative additions or (slight) changes to the WAM spec. The translation of no. 3.i above into JSON is documented in the string-meta draft, and we also got to the same conclusion in #124.
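As a hedged illustration of the three I18N points above, the sketch below normalizes a textual member into a list of localizable values and applies the manifest-level `lang` and `dir` as defaults. The `value`/`lang`/`dir` key names and the `localize` helper are assumptions for this example, not text from the string-meta draft or the WAM.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Localizable:
    value: str
    lang: Optional[str] = None
    dir: Optional[str] = None


def localize(raw: Any, default_lang: Optional[str],
             default_dir: Optional[str]) -> list[Localizable]:
    """Normalize a textual member into a list of Localizable values.

    A bare string, a single object, or a list of either is accepted.
    Values without their own lang/dir inherit the manifest defaults.
    """
    items = raw if isinstance(raw, list) else [raw]
    result = []
    for item in items:
        if isinstance(item, str):
            item = {"value": item}
        result.append(Localizable(
            value=item["value"],
            lang=item.get("lang", default_lang),
            dir=item.get("dir", default_dir),
        ))
    return result


manifest = {
    "lang": "en",
    "dir": "ltr",
    "name": [{"value": "Moby Dick"}, {"value": "白鯨", "lang": "ja"}],
}
names = localize(manifest["name"], manifest.get("lang"), manifest.get("dir"))
```

Here the first name inherits `en` and `ltr` from the manifest, while the second keeps its own `ja` tag and inherits only the direction.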

llemeurfr commented 6 years ago

@iherman I agree with most of your suggestions, especially 1. and 3.

Re. 2., I hope we are both in sync: we are designing a format which must allow users to categorize, filter, sort publications locally (client-side). The format must therefore handle locally (in the manifest) metadata deemed as necessary for such functionalities. Only linking to external metadata (e.g. Onix, a B2B vocabulary) would be a blocker for EDRLab.

laudrain commented 6 years ago

I may add that the ONIX for Books community has already been working on a schema.org expression. So its complexity may be somewhat tamed with the help of schema.org in WPUB.

iherman commented 6 years ago

@laudrain yes, but a schema.org version of ONIX may still be pretty huge...

iherman commented 6 years ago

@llemeurfr hope we are in sync indeed...

JSON(-LD) being extensible, it is indeed possible to add additional metadata into the manifest (unless the WAM explicitly disallows it). However, we may have metadata that contains hundreds of terms, and it should be possible to add a "link" into the main manifest towards a dedicated, say, ONIX file that could then be retrieved separately and still processed on the client side.

Is this a problem? If so, why exactly?

llemeurfr commented 6 years ago

@iherman, our perception is that Web Publications must define the minimal set of standard metadata useful for user manipulation (categorize, filter, sort) of Web Publications. We all know that interoperability matters: a link to "other" structures (ONIX, MARC ...) is a very useful extension mechanism, but does not bring much in terms of usage interoperability. Browser vendors will easily implement a minimal set of metadata, useful for end users. They will not implement the hundreds of ONIX terms, most of them crafted for another audience than the standard user, and they should not have to pick and choose by themselves some terms in such a rich vocabulary.

iherman commented 6 years ago

@llemeurfr I think we are in sync. Obviously, there is a minimal set of metadata that we define (essentially the infoset, maybe some more that are missing) and those are part of the manifest. But we should probably keep that list relatively small. However, we must have the possibility in the manifest (and that may require an extension to the WAM) to link to the ONIX, MARC, etc. data.

I believe we are saying the same thing:-)

mattgarrish commented 6 years ago

we must have the possibility in the manifest (and that may require an extension to the WAM) to link to the ONIX, MARC, etc. data.

Do we really need this in the manifest, or could it be in the start_url file? (e.g., embedded or via link/@type=meta)

The further away this information is from search crawlers, the more likely it's going to lead to duplication, or simply be done the way that is more effective for search. Might be a useful compromise if this information isn't critical for processing the manifest.

llemeurfr commented 6 years ago

@mattgarrish to take your words, the further away it is from the user agent, the more likely the user agent developer will not process it at all.

mattgarrish commented 6 years ago

the more likely the user agent developer will not process it at all.

If the information isn't important enough to be included as a manifest property, then what does that matter? What epub reading system uses linked records after all these years? The application that's more likely to use metadata like you'd express in schema.org or ONIX is a search engine.

Also, if the web page that links to the manifest also contains/links to the metadata, why is it all that likely that the user agent will process one but not the other?

llemeurfr commented 6 years ago

@mattgarrish it seems that we don't understand each other. I'm saying, and @iherman is in sync with it, that there is a required set of metadata in the publication manifest (= the metadata part of the infoset currently under definition). These are the metadata users need for categorizing, filtering, and sorting their personal bookshelf (call it a local search engine if you like). Links to other sets of metadata are an extension mechanism.

But this discussion should not be continued in an issue that is about potential changes to the WAM.

llemeurfr commented 6 years ago

@mattgarrish sorry, I see now what you mean; you are advocating that the link to external metadata should not be in the manifest, but in an HTML resource referenced from the start_url. I personally prefer having the fork not too far from the knife, but it's not a short term issue for us IMO.

HadrienGardeur commented 6 years ago

Links to external metadata should probably be in both, it all depends on the metadata format.

If you're using ONIX, having a link in an HTML resource is useless.

If you're using JSON-LD, it's probably best to have it in the manifest (for WP aware UAs) and in HTML (for crawlers).

mattgarrish commented 6 years ago

you are advocating that the link to external metadata should not be in the manifest

Right, I'm wondering how important an extension this is for WAM. Like with EPUB, I agree with you both that critical metadata for processing and rendering needs to be in the manifest, but this seems like the "what do we do with bibliographic metadata" discussion from EPUB.

Experience tends to suggest that whatever metadata is needed for the user agent needs to be in the format the user agent is processing, and people will push for it to be there. Otherwise, it's not just grabbing the metadata, but also adding the complexity of parsing that information and understanding what is in the linked format.

I'd leave that where it's already being done unless we're ready to go so far as require a specific format and vocabulary for the linked record.

rdeltour commented 6 years ago

before getting sidetracked too much, may I suggest that:

HadrienGardeur commented 6 years ago

I'm still working on the lifecycle branch, should I prioritize turning #118 into a full document instead of that ? cc @iherman

TzviyaSiegman commented 6 years ago

@HadrienGardeur I am not sure we need a document for #118. It would be helpful to see the lifecycle work for Monday's meeting. Thank you!

HadrienGardeur commented 6 years ago

Well, #118 is a long list of:

It could be re-organized and rewritten though, based on some of our more recent discussions (i18n being the top one).

HadrienGardeur commented 6 years ago

Updated version would look something like this:

iherman commented 6 years ago

@HadrienGardeur, you asked

should I prioritize turning #118 into a full document instead of that ? cc @iherman

I would prefer not, for purely admin reasons. Creating a separate document means, formally, having a new FPWD on a separate document, which leads to extra admin complications (re IPR policy). Let us try to keep everything in a single document for now.

(Sorry for the late reply.)

HadrienGardeur commented 6 years ago

Here's a more detailed version of what I posted above then...

1. As-is

start_url is a good fit for the address in our infoset.

We can also use lang and dir as-is, as long as additional text is added to the WAM to indicate that these are only default values that each member may override.

2. Changes

2.1. i18n

Two members from the WAM could be useful for the WP manifest if they added support for i18n: name and description.

Both members are currently using a USVString in their WebIDL definition, which would have to be replaced by a sequence of Localizable (based on this definition).

categories may benefit as well from the i18n support, but this is less important than name.
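To illustrate why a sequence of localizable values is useful for name, here is a small sketch of how a user agent might pick the value to display. The prefix matching is deliberately simplistic and is an assumption of this example, as are the `value` and `lang` key names.

```python
def pick_name(names: list[dict[str, str]], preferred: list[str]) -> str:
    """Pick the value whose language tag best matches the user's preferences.

    Falls back to the first entry when nothing matches.
    """
    for wanted in preferred:
        wanted = wanted.lower()
        for entry in names:
            tag = entry.get("lang", "").lower()
            # Match "ja" against "ja" and against longer tags like "ja-Latn".
            if tag == wanted or tag.startswith(wanted + "-"):
                return entry["value"]
    return names[0]["value"]


names = [
    {"value": "Moby Dick", "lang": "en"},
    {"value": "白鯨", "lang": "ja"},
    {"value": "Hakugei", "lang": "ja-Latn"},
]
```

With these entries, a reader preferring `ja-Latn` gets the romanized title, and a reader preferring a language that is not listed gets the first entry.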

2.2. Media type registration

The current media type registration for application/manifest+json does not contain any required or optional media parameter.

Considering the large number of additions that our infoset requires, it would make a lot of sense to have a profile parameter for application/manifest+json.
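A possible way for a consumer to read such a parameter is sketched below, using only the Python standard library. The `profile` parameter does not exist in the current registration, and the profile URL used in the usage note is a made-up placeholder.

```python
from email.message import Message
from typing import Optional


def manifest_profile(content_type: str) -> Optional[str]:
    """Return the profile parameter of a manifest media type, if any.

    Returns None when the media type is not application/manifest+json
    or when no profile parameter is present.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    if msg.get_content_type() != "application/manifest+json":
        return None
    profile = msg.get_param("profile")
    return profile if isinstance(profile, str) else None
```

For example, `manifest_profile('application/manifest+json; profile="https://example.org/web-publication"')` yields the placeholder URL, while a plain `application/manifest+json` yields `None`.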

3. Additions

3.1. Links

The current draft for WAM does not contain any generic element meant to represent links. The closest thing to a link is the ImageResource dictionary, but it only applies to icons and screenshots.

The following requirements from our infoset could all rely on the same links element:

This new links element could share the same PublicationLink dictionary that the default reading order and list of resources will use.

3.2. Default reading order and list of resources

None of the current members of the WAM are a good fit for core building blocks of the WP manifest: the default reading order and the list of resources.

We would need to introduce two new members to the WAM: reading_order and resources.

Like links, these new elements could essentially be a sequence of PublicationLink.
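The sketch below shows one way the three proposed members could share a single link structure. The member names `links`, `reading_order`, `resources` and the `PublicationLink` name come from this proposal. The fields of `PublicationLink`, the shorthand of a bare string for a link, and the `cover` rel value are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PublicationLink:
    href: str
    type: str = ""
    rel: list[str] = field(default_factory=list)
    title: str = ""


def parse_links(manifest: dict, member: str) -> list[PublicationLink]:
    """Read a manifest member as a sequence of PublicationLink."""
    links = []
    for entry in manifest.get(member, []):
        if isinstance(entry, str):  # shorthand: a bare URL
            entry = {"href": entry}
        rel = entry.get("rel", [])
        links.append(PublicationLink(
            href=entry["href"],
            type=entry.get("type", ""),
            rel=[rel] if isinstance(rel, str) else list(rel),
            title=entry.get("title", ""),
        ))
    return links


manifest = {
    "name": "Moby Dick",
    "start_url": "index.html",
    "links": [{"href": "cover.jpg", "type": "image/jpeg", "rel": "cover"}],
    "reading_order": ["chapter1.html",
                      {"href": "chapter2.html", "title": "Chapter 2"}],
    "resources": [{"href": "style.css", "type": "text/css"}],
}
reading_order = parse_links(manifest, "reading_order")
```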

3.3. Timestamps

The WP infoset contains two different timestamps: the publication date and the last modification date.

Both of them would have to be added to the WAM, for example using published and modified.
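As a sketch only, a consumer could parse and sanity-check the two timestamps as follows. The member names are the ones suggested above. The use of ISO 8601 strings and the rule that a value without a timezone is read as UTC are assumptions of this example.

```python
from datetime import datetime, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 date or date-time into an aware datetime."""
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Assumption for this sketch: values without a timezone are UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def timestamps_consistent(manifest: dict) -> bool:
    """Return True when modified is not earlier than published."""
    published = parse_timestamp(manifest["published"])
    modified = parse_timestamp(manifest["modified"])
    return modified >= published
```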

3.4. Contributors

Listing contributors/creators is a very common requirement for publications, also expressed in our infoset.

The WAM does not contain any element to indicate the creator of a Web App.

We haven't discussed yet how roles should be expressed in our infoset, but we'll need to at least introduce one new member (creator, author or contributor) and potentially more.

3.5. Canonical identifier

Our canonical identifier could be expressed using the new links element with the relevant rel value (identifier) or by defining a new member as well.
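Both options can be sketched side by side. The `identifier` rel value comes from the text above. The dedicated member name `identifier` and the order in which the two options are checked are assumptions for illustration.

```python
from typing import Optional


def canonical_identifier(manifest: dict) -> Optional[str]:
    """Find the canonical identifier of a publication manifest."""
    # Option 1: a dedicated member (hypothetical name "identifier").
    value = manifest.get("identifier")
    if isinstance(value, str):
        return value
    # Option 2: an entry in the proposed links member with rel=identifier.
    for link in manifest.get("links", []):
        rel = link.get("rel", [])
        rels = [rel] if isinstance(rel, str) else rel
        if "identifier" in rels:
            return link.get("href")
    return None
```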

3.6. Flag as publication

The addition of a profile parameter to the media type would already help, but in addition to that, we'd need to consider a simple boolean flag for indicating that a WAM is actually a WP manifest: publication.

IMO this can't be handled by display since it serves a completely different role in the WAM.

BigBlueHat commented 6 years ago

We will also need to consider the Navigation scope section. Currently, any link not "within scope" will open in the default browser and not in the customized browser launched by the icon made by the WAM.

HadrienGardeur commented 6 years ago

Right, navigation scope for a Web Publication would be quite different.

Within scope for a WP = part of the reading order or list of resources

Outside the scope for a WP = not listed
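A minimal sketch of that rule, assuming the proposed `reading_order` and `resources` members hold either bare URLs or objects with an `href` key, and that fragments are ignored when comparing:

```python
from urllib.parse import urldefrag, urljoin


def within_publication_scope(target: str, manifest: dict,
                             manifest_url: str) -> bool:
    """Return True when target is listed in the reading order or resources."""
    listed = set()
    for member in ("reading_order", "resources"):
        for entry in manifest.get(member, []):
            href = entry if isinstance(entry, str) else entry["href"]
            # Resolve against the manifest URL and drop any fragment.
            listed.add(urldefrag(urljoin(manifest_url, href)).url)
    return urldefrag(urljoin(manifest_url, target)).url in listed


manifest_url = "https://example.org/book/manifest.json"
manifest = {
    "reading_order": ["chapter1.html", "chapter2.html"],
    "resources": [{"href": "style.css"}],
}
```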

TzviyaSiegman commented 6 years ago

Thanks, @HadrienGardeur. This is really helpful.

I have some questions:

2.2 Considering the large number of additions that our infoset requires, it would make a lot of sense to have a profile parameter for application/manifest+json.

Would this mean that we would be losing some of the Native functionality of web app manifest? Would UAs even recognize our media-type?

3.2 We would need to introduce two new members to the WAM: reading_order and resources.

I think this might be a very minor change. Just adding the concept of sequence to WAM.

TzviyaSiegman commented 6 years ago

@BigBlueHat

We will also need to consider the Navigation scope section. Currently, any link not "within scope" will open in the default browser and not in the customized browser launched by the icon made by the WAM.

My (personal) opinion is that we should limit ourselves to items "within scope" now. We may be able to expand in the future, but we cannot do everything at once. Let's figure out how to publish a single origin publication first.

BigBlueHat commented 6 years ago

@TzviyaSiegman understood. It's just a change from our current spec.

Our current spec text uses the reading order list as the "scope" whereas WAM restricts them to sub-URLs of the value of scope provided in the WAM: https://www.w3.org/TR/appmanifest/#dfn-within-scope

If the scope is consequently restricted by that value, then we have further work to do on defining the reading order list--i.e. making all the contained URLs relative or (at least) "within scope" when walking through those "string match" steps.
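For comparison, here is a simplified sketch of the WAM-style check and of how it could flag reading order entries that fall outside the scope. It approximates the "within scope" steps as same origin plus a path prefix match. It does not normalize default ports, and the helper names are made up for this example.

```python
from urllib.parse import urljoin, urlsplit


def within_wam_scope(target: str, scope: str) -> bool:
    """Approximate WAM navigation scope: same origin and path prefix match."""
    t, s = urlsplit(target), urlsplit(scope)
    same_origin = (t.scheme, t.hostname, t.port) == (s.scheme, s.hostname, s.port)
    return same_origin and t.path.startswith(s.path)


def out_of_scope_entries(reading_order: list[str], scope: str) -> list[str]:
    """List reading order URLs that a WAM-style scope would exclude."""
    resolved = [urljoin(scope, href) for href in reading_order]
    return [url for url in resolved if not within_wam_scope(url, scope)]


scope = "https://example.org/book/"
reading_order = [
    "chapter1.html",
    "https://cdn.example.com/book/chapter2.html",
    "/appendix/notes.html",
]
```

In this example only the first entry survives: the second is on another origin and the third is on the same origin but outside the scope path.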

Side note: this is only navigational scope, so presumably one could "fake" the navigation using pushState and load the content from a different origin (CSP, SOP, etc allowing) and present it to the user anyhow...but then...what was actually published? Just a Web App to load remote content and not what anyone would describe as a "publication."

js-choi commented 6 years ago

With regard to @HadrienGardeur’s https://github.com/w3c/wpub/issues/127#issuecomment-363063172, where would readopting JSON-LD (as suggested by @iherman in https://github.com/w3c/wpub/issues/127#issuecomment-362575679) be raised? Would it be in § 2.1 i18n and § 3.1 Links, or its own section?

For what it’s worth, at least one developer (@hsivonen) of a major UA vendor (@mozilla) has expressed antipathy toward JSON-LD and implementing standards that use it (see https://github.com/mozilla/standards-positions/issues/44#issuecomment-341663098 and “No Namespaces in JSON, Please”, 2017-05), even when processing for JSON-LD per se is not required (see 2017-05 article’s final section, “But You Can Ignore the Complexity!”). This may have ramifications for vendor inclination toward changes toward JSON-LD in WAM and, indeed, cross-browser implementation of any web standard that includes JSON-LD.

HadrienGardeur commented 6 years ago

@js-choi I've listed what I consider to be "must have" requirements to align with our current infoset.

Support for JSON-LD IMO falls in a different category: "good to have". This would require support for @context and the definition of a JSON-LD context document for the WAM or the WP manifest.

JSON-LD is already widely implemented on the Web and it's pretty much a requirement for AMP as well. Our own use of JSON-LD would not require browsers to know anything about RDF, since the document would be parsable as normal JSON.
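The "parsable as normal JSON" point can be shown with a short sketch. The manifest content and the choice of `https://schema.org` as the context are illustrative assumptions. A consumer that knows nothing about JSON-LD simply skips the `@context` key.

```python
import json

MANIFEST_TEXT = """
{
  "@context": "https://schema.org",
  "name": "Moby Dick",
  "start_url": "index.html"
}
"""


def plain_members(text: str) -> dict:
    """Parse a manifest as ordinary JSON and ignore JSON-LD keywords."""
    data = json.loads(text)
    # Keys starting with "@" are JSON-LD keywords; a non-JSON-LD consumer
    # can drop them without losing any manifest member.
    return {key: value for key, value in data.items()
            if not key.startswith("@")}


members = plain_members(MANIFEST_TEXT)
```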

iherman commented 6 years ago

@js-choi, maybe we should have a clear idea first on which terms we would have and which of those are really important in terms of JSON-LD metadata. The main argument for JSON-LD is that it links the manifest data to the Linked Data world, e.g., it potentially adds the metadata to the various knowledge graphs that are built on top of schema.org. Some of the manifest data, like the reading order, are not really relevant, for example. By having a clear idea on the terms we can build a better case for this.

HadrienGardeur commented 6 years ago

@TzviyaSiegman

Would this mean that we would be losing some of the Native functionality of web app manifest? Would UAs even recognize our media-type?

I don't think it would, but it's up to the browser to implement different behaviors. Right now, the WAM relies exclusively on the rel value to detect a manifest and at least in Chrome, it doesn't matter what the media type is.
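The rel-based detection can be sketched with the standard library HTML parser. This is an approximation for illustration, not the algorithm any browser actually runs, and the class and function names are made up.

```python
from html.parser import HTMLParser
from typing import Optional


class ManifestLinkFinder(HTMLParser):
    """Remember the href of the first <link> whose rel contains "manifest"."""

    def __init__(self) -> None:
        super().__init__()
        self.href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.href is not None:
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        # Only the rel value is checked; any type attribute is ignored.
        if "manifest" in rels:
            self.href = attributes.get("href")


def find_manifest(html: str) -> Optional[str]:
    finder = ManifestLinkFinder()
    finder.feed(html)
    return finder.href
```

In this sketch a link such as `<link rel="manifest" type="text/plain" href="wp.json">` is still detected, which mirrors the observation that the media type does not matter for detection.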

akuckartz commented 6 years ago

@js-choi For what it’s worth, at least one developer (...) of a major UA vendor (...) has expressed antipathy toward JSON-LD and implementing standards that use it (...), even when processing for JSON-LD per se is not required

That should be enough reasons to disregard such antipathy in technical discussions (even when coming from a rather well-known developer of a well-known organisation).

llemeurfr commented 6 years ago

Antipathy toward technologies like XML, RDF (and especially RDF/XML), JSON (I also met people who hated HTML by the way) usually comes from a lack of understanding of their "best playground", even when it's the position of very smart people.

As Ivan said in this thread, before putting forward a request for the WAM to adopt JSON-LD, as a "good to have", we'd better list & detail why we think it would be a good idea.

Ivan proposed that "The main argument for JSON-LD is that it links the manifest data to the Linked Data world, e.g., it potentially adds the metadata to the various knowledge graphs that are built on top of schema.org."

But this may be very abstract to many people. I would suggest we also put forward that:

and a JSON structure can be made JSON-LD compliant with minimal changes (in our case this is a MUST, e.g. we shouldn't propose that dates have an explicit dateTime type etc.)

Regards,

Laurent Le Meur EDRLab

On 6 Feb 2018 at 21:31, Andreas Kuckartz notifications@github.com wrote:

@js-choi For what it’s worth, at least one developer (...) of a major UA vendor (...) has expressed antipathy toward JSON-LD and implementing standards that use it (...), even when processing for JSON-LD per se is not required

That should be enough reasons to disregard such antipathy in technical discussions (even when coming from a rather well-known developer of a well-known organisation).