w3c / sync-media-pub

Repository of the Synchronized Multimedia for Publications Community Group
http://w3c.github.io/sync-media-pub

JSON base URL for resolving 'text' URL 'fragment' is base URL of HTML document specified as "alternate"? #28

Closed danielweck closed 3 years ago

danielweck commented 4 years ago

I think that 'text' URLs should be treated just like 'audio' ones, i.e. not restricted to a "fragment". A fragment-only reference would normally resolve against the base URL of the JSON resource, but here it is forced to correspond to the base URL of the HTML document associated with the media overlay (i.e. the document specified by the "alternate" property of a web publication's link object, or somehow inferred from the host HTML document itself when the JSON resource is referenced from an HTML-embedded meta link).
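To make the difference concrete, here is a minimal sketch of the same fragment-only 'text' value resolved against the two candidate base URLs. The URLs and variable names are hypothetical, chosen only for illustration; the resolution itself is standard RFC 3986 behaviour as implemented by Python's `urllib.parse.urljoin`.

```python
from urllib.parse import urljoin

# Hypothetical locations, for illustration only.
JSON_URL = "https://example.org/pub/overlays/chapter1.json"
HTML_URL = "https://example.org/pub/chapter1.html"

# A fragment-only 'text' reference, as in the draft syntax under discussion.
text_ref = "#par1"

# Ordinary resolution: the base is the JSON resource itself.
against_json = urljoin(JSON_URL, text_ref)

# Draft behaviour: the base is forced to the associated HTML document.
against_html = urljoin(HTML_URL, text_ref)

print(against_json)
print(against_html)
```

Under ordinary resolution the fragment points into the JSON resource, which is why the draft has to special-case 'text' to reach the HTML document.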

See original comments in the Readium project:

https://github.com/readium/architecture/issues/109#issuecomment-578432821

qnga commented 4 years ago

Are you aware of the open issue #26?

danielweck commented 4 years ago

Thank you for reminding me of this other issue, @qnga

However, here in this issue I would like to focus the discussion primarily on the fact that the JSON document should (IMO) have a well-defined base URL (usually implied by the URL from which it was fetched, as there is no equivalent of the HTML <base> overriding mechanism). Consequently, any link/href/URL relative references inside the JSON would be resolved against that base URL.
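A rough sketch of what that would mean for a consumer, assuming a simplified JSON shape with 'text' and 'audio' properties (the `narration` key, the sample values and the `resolve_refs` helper are hypothetical, not taken from the spec):

```python
from urllib.parse import urljoin

def resolve_refs(node, base_url, keys=("text", "audio")):
    """Return a copy of node with the listed URL properties made absolute."""
    if isinstance(node, dict):
        resolved = {}
        for key, value in node.items():
            if key in keys and isinstance(value, str):
                # Every relative reference uses the same base: the JSON URL.
                resolved[key] = urljoin(base_url, value)
            else:
                resolved[key] = resolve_refs(value, base_url, keys)
        return resolved
    if isinstance(node, list):
        return [resolve_refs(item, base_url, keys) for item in node]
    return node

# Hypothetical document and fetch URL, for illustration only.
JSON_URL = "https://example.org/pub/overlay.json"
doc = {
    "narration": [
        {"text": "chapter1.html#par1", "audio": "audio/chapter1.mp3#t=0,5"}
    ]
}

absolute = resolve_refs(doc, JSON_URL)
print(absolute["narration"][0])
```

With a single well-defined base, 'text' and 'audio' are handled identically and no property needs special treatment.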

marisademeglio commented 4 years ago

I think that solution (2) proposed by me in #26 would help resolve this, no?

danielweck commented 4 years ago

Some cross-posting from the Readium repository:

1. The original text fragment syntax came from this draft, contributed by @danielweck :
   https://github.com/w3c/sync-media-pub/blob/267ef4b44ddb49789196755a08f71ba87ed88751/web-proposal.md#the-sync-media-json-format

Thanks for unearthing this Marisa, it's helpful (I stand corrected, I indeed wrote this proposal at the time). Note: this Markdown document's new location is https://github.com/w3c/sync-media-pub/blob/master/drafts/web-proposal.md

So, the thought process behind this particular spec. "tweak" in my initial draft was to explore the processing model specifically for when a JSON resource is directly referenced (via linking, or embedding) from an HTML document (i.e. without the WebPubManifest level of indirection), in which case the location of the JSON document itself does not necessarily have to be used as its "base" URL/URI/IRI, as this could instead be inferred from the embedding context.

Around the same time, there were discussions in the Web Publications group about "base" in JSON / JSON-LD, notably regarding the impact of "opaque" origin and null base:
   https://github.com/w3c/pub-manifest/issues/12
   https://github.com/w3c/json-ld-syntax/issues/103
   https://github.com/w3c/json-ld-syntax/issues/23#issuecomment-438449489

See how the context @base can be used in JSON-LD: https://json-ld.org/spec/latest/json-ld/#base-iri
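As a rough illustration of the JSON-LD mechanism, the sketch below computes an effective base from a context's `@base` entry, assuming the usual JSON-LD rules: an absent `@base` leaves the document URL as the base, a null `@base` removes the base altogether, and a relative `@base` is itself resolved against the document URL. The `effective_base` helper and the URLs are hypothetical; this is not a JSON-LD processor.

```python
from urllib.parse import urljoin

def effective_base(document_url, context):
    """Approximate the base IRI after applying a context's @base entry."""
    if "@base" not in context:
        return document_url          # no override: base is the document URL
    base = context["@base"]
    if base is None:
        return None                  # null base: relative references stay unresolved
    return urljoin(document_url, base)  # absolute or relative override

# Hypothetical document URL, for illustration only.
DOC_URL = "https://example.org/pub/overlay.json"

unchanged = effective_base(DOC_URL, {})
overridden = effective_base(DOC_URL, {"@base": "https://example.org/pub/chapter1.html"})
relative = effective_base(DOC_URL, {"@base": "text/"})
no_base = effective_base(DOC_URL, {"@base": None})

print(unchanged, overridden, relative, no_base)
```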

...so, to wrap up, I personally feel very uneasy about my initial draft (use the short URL fragment syntax, and assume that the "base" URL is that of the associated HTML document), but I also feel uncomfortable about creating an ad-hoc JSON syntax that allows overriding the "base" of the JSON resource for specific properties (i.e. text, ...and maybe even audio). I can of course totally see the benefits from an authoring perspective (i.e. less repetition), so I am keeping an open mind.

I wonder about prior art? Web App Manifest immediately comes to mind, for example see the scope and start_url properties: https://www.w3.org/TR/appmanifest/#scope-member https://www.w3.org/TR/appmanifest/#start_url-member ("using manifest URL as the base URL")
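For comparison, a minimal sketch of the Web App Manifest approach, where `start_url` and `scope` are resolved using the manifest URL as the base. The manifest content and URLs are hypothetical, and the additional checks that the specification performs (such as same-origin and within-scope validation) are deliberately left out.

```python
from urllib.parse import urljoin

# Hypothetical manifest location and content, for illustration only.
MANIFEST_URL = "https://example.org/app/manifest.webmanifest"
manifest = {"start_url": "index.html?source=pwa", "scope": "./"}

# Both members are relative references resolved against the manifest URL,
# not against the HTML document that links to the manifest.
start_url = urljoin(MANIFEST_URL, manifest["start_url"])
scope = urljoin(MANIFEST_URL, manifest["scope"])

print(start_url)
print(scope)
```

This is the same model as resolving 'text' and 'audio' against the URL of the JSON resource, regardless of which HTML document references it.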

marisademeglio commented 3 years ago

The new draft does not have this issue