hypothesis / product-backlog

Where new feature ideas and current bugs for the Hypothesis product live

Consider interoperability in our #annotations fragment syntax #989

Open dwhly opened 5 years ago

dwhly commented 5 years ago

Feature Request Form

Problem you are trying to address with this feature

Right now our bouncer links resolve to an #annotations fragment that passes the necessary annotations and search facets to our client. This syntax is a likely point of intersection with the broader movement towards an ecosystem of interoperable clients and services, and links that use these fragments are likely to be embedded in the web in larger numbers, including by people who copy and paste our own bouncer output instead of the initial bouncer syntax.

Your solution

We might consider an #annotation syntax that would work alongside other link syntaxes for deep linking, such as in these W3C notes:

Selectors and States: https://www.w3.org/TR/2017/NOTE-selectors-states-20170223/

And Embedding Web Annotations in HTML: https://www.w3.org/TR/2017/NOTE-annotation-html-20170223/

This syntax could direct supporting clients to fetch annotations from indicated services, with filters for users, groups, tags, URLs or plain text search terms.

Note previous discussion here: https://github.com/hypothesis/product-backlog/issues/975#issuecomment-478292789

And a broader discussion around deep linking in browsers here: https://github.com/bokand/ScrollToTextFragment/issues/4

BigBlueHat commented 5 years ago

What if we used the rel attribute terminology and a more generic fragment identifier to support other similar use cases of a primary resource + related content (and its relationship)? For instance, using the Web Annotation Protocol's discovery mechanism as a foundation, the examples from #975 would become something like:

https://example.org/#rel(http://www.w3.org/ns/oa#annotationService, https://api.hypothes.is/annotation-collection-id)

which is conceptually/semantically equivalent to requesting https://example.org/ and getting this HTML in the body of the response:

<link rel="http://www.w3.org/ns/oa#annotationService" href="https://api.hypothes.is/annotation-collection-id">
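To make that equivalence concrete, here is a minimal Python sketch. It assumes a hypothetical grammar of exactly `rel(<relation>, <target>)`, which is only what the example above shows and not a specified syntax; the function names are made up for illustration.

```python
import re
from html import escape

# Hypothetical grammar: rel(<relation>, <target>), taken from the example above.
REL_PATTERN = re.compile(r"^rel\(\s*([^,\s]+)\s*,\s*([^)\s]+)\s*\)$")


def parse_rel_fragment(url):
    """Return (primary resource, relation, target), or None if there is no rel(...)."""
    # Split at the first '#' only: the relation URI itself contains a '#'.
    primary, _, fragment = url.partition("#")
    match = REL_PATTERN.match(fragment)
    if match is None:
        return None
    relation, target = match.groups()
    return primary, relation, target


def as_link_element(relation, target):
    """Render the <link> element that the fragment is meant to be equivalent to."""
    return '<link rel="{}" href="{}">'.format(escape(relation), escape(target))
```

Feeding it the example URL yields `https://example.org/` as the primary resource, and `as_link_element` reproduces the `<link>` element shown above.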

Obviously, the more generic/flexible the API, the more verbose it naturally becomes...so perhaps we could look into the return of the "annotation" relationship from HTML 1.0 (see "page" 37 in https://www.w3.org/MarkUp/draft-ietf-iiir-html-01.txt). 😸

So, the above might (someday) become:

https://example.org/#rel(annotation, https://api.hypothes.is/annotation-collection-id)

Lastly, the other parameters mentioned in the other examples would probably be best put directly on the annotation's (or annotation collection's) URL as either a fragment or a query parameter. Otherwise, there'd be an endless space of (probably) idiosyncratic key/value pairs cluttering up potentially usable space for future specs. So... #annotation(service="https://api.hypothes.is",user=USER_ID,tag=TAG) could become #annotation(https://api.hypothes.is?user=USER_ID&tag=TAG).
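A rough Python sketch of that rewrite, assuming the key/value form looks exactly like the example (a quoted `service` value followed by comma-separated pairs); the function name is hypothetical and nothing here is a defined syntax:

```python
import re
from urllib.parse import urlencode

# Hypothetical key/value form from the earlier examples:
#   annotation(service="https://api.hypothes.is",user=USER_ID,tag=TAG)
PAIR = re.compile(r'(\w+)=("[^"]*"|[^,)]*)')


def fold_params_into_service_url(fragment):
    """Rewrite annotation(service=..., k=v, ...) as annotation(<service>?k=v&...)."""
    if not (fragment.startswith("annotation(") and fragment.endswith(")")):
        raise ValueError("not an annotation(...) fragment")
    body = fragment[len("annotation("):-1]
    params = {key: value.strip('"') for key, value in PAIR.findall(body)}
    # Raises KeyError when no service is given; a real client would handle that.
    service = params.pop("service")
    # urlencode percent-encodes values, so the result stays a valid query string.
    query = urlencode(params)
    return "annotation({}{})".format(service, "?" + query if query else "")
```

With the example input this returns `annotation(https://api.hypothes.is?user=USER_ID&tag=TAG)`, so the filters travel with the service URL instead of occupying fragment-level keys.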

At any rate, thanks for starting this conversation @dwhly!

judell commented 5 years ago

A couple of questions about fragment-related interop:

  1. Early SPAs relied on fragments. Now, thanks to the history API, they don't have to. Should interop with frag-dependent SPAs be a non-goal?

  2. Other fragment-oriented syntaxes are not namespaced:

http://example.com/data.csv#row=4 (not #data:row)
http://example.com/foo.mp4#t=10,20 (not #media:t)
http://example.com.pdf#page=35 (not pdf:page)

annotations: is already a more robust namespace, which seems useful. Yes, Hypothesis and Genius clash, but they could avoid doing so by qualifying the fragment (annotations:hypothesis:query).

With something like https://example.org/#rel(annotation:, ...), @BigBlueHat, would you envision combinations like https://example.org/#rel(annotation:, ...),(pdf:...) ?

BigBlueHat commented 5 years ago
1. Should interop with frag-dependent SPAs be a non-goal?

Some SPAs still use fragment identifiers to support full-screen display or simply to avoid potential server hits at more than one URL. Additionally, Google Analytics (etc.) sometimes uses fragments for tracking codes, so being able to mix these new fragment identifiers with existing ones seems like an important feature for developers.

2. Other fragment-oriented syntaxes are not namespaced:

Those aren't namespaced because they are the only fragment systems supported by those formats, which is currently also the case for DOM id/name-based fragment selectors in HTML. The XPointer Framework style selectors were partly designed to provide an extensible/mixable fragment style for XML-based formats such as SVG. This is why SVG fragment identifiers use #svgView(viewBox(0,200,1000,1000)) style fragments: #roundThing is also a valid identifier and would point at a "named object" such as <circle id="roundThing"> (as in HTML, XML, etc.). SVG also supports media fragments (e.g. #xywh=50,50,300,300), and the spec also explains how to concatenate these via &: https://www.w3.org/TR/SVG/linking.html#SVGFragmentIdentifiersDefinitions
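As a rough illustration of how a client might tell these three shapes apart, here is a small Python sketch. The labels and the function name are made up for this example, and the checks are a deliberate simplification, not the SVG spec's grammar.

```python
import re

# XPointer-Framework-style "scheme(data)" shape,
# e.g. svgView(viewBox(0,200,1000,1000)).
FUNCTION_STYLE = re.compile(r"^[A-Za-z_][\w.-]*\(.*\)$")


def classify_fragment(fragment):
    """Return a rough label for one fragment string (without the leading '#')."""
    if FUNCTION_STYLE.match(fragment):
        return "function-style"
    if "=" in fragment:
        return "key-value"
    # Anything else is treated as a plain name, e.g. an element id.
    return "named-object"
```

Under these assumptions, `svgView(viewBox(0,200,1000,1000))` is labelled function-style, `xywh=50,50,300,300` key-value, and `roundThing` named-object.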

Ultimately, HTML could benefit from something like that, ideally with a future-friendly foundation like the XPointer Framework (i.e. #func()).

With something like https://example.org/#rel(annotation:, ...), @BigBlueHat, would you envision combinations like https://example.org/#rel(annotation:, ...),(pdf:...) ?

Fragment identifiers are currently MIME type defined, and sadly most media types (with the exception of XML afaik) haven't provided an extensible foundation for fragments...so that combination would only be "valid" if example.org responded with a PDF. Pragmatically, though, fragment identifiers are used by clients, so the clients can (and do) support other fragment identifiers regardless of media type (which of course is especially true in the JS-enhanced HTML world).

Practically, I'd expect the combinations to happen with the native/media-type defined fragment as the "foundation" and the extensible-future-friendly-xpointer-ish-things as the add-ons:

http://example.com/data.csv#row=4&rel(annotation, https://hyp.is/annotation-id)
http://example.com/foo.mp4#t=10,20&rel(annotation, https://hyp.is/annotation-id)
https://example.org/awesome-science.pdf#page=35&rel(annotation, https://hyp.is/annotation-id)

The consequence is that supporting clients would need to either mix in or ignore the rel() (or other XPointer-style) parts. Many of these default fragment identifier systems already support &-separated, pre-defined keywords, so a flexible, forward-thinking client should simply ignore XPointer-style additions (i.e. rel()) that it doesn't understand. However, that's not the case for HTML, where anything after # is technically (i.e. currently) defined as part of the "named identifier" (I did some digging into this last year if you're curious 😉).
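A minimal Python sketch of that "mix in or ignore" step, assuming the combined form from the examples above (parts joined by `&`, with add-ons written as `name(...)`). The function names are hypothetical. The split skips any `&` inside parentheses, since a target URL may carry its own query string.

```python
def split_fragment(fragment):
    """Split a fragment on '&', ignoring any '&' nested inside parentheses."""
    parts, current, depth = [], [], 0
    for char in fragment:
        if char == "(":
            depth += 1
        elif char == ")":
            depth = max(depth - 1, 0)
        if char == "&" and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(char)
    parts.append("".join(current))
    return [part for part in parts if part]


def partition_parts(fragment):
    """Separate native key=value parts from function-style add-ons such as rel(...)."""
    native, addons = [], []
    for part in split_fragment(fragment):
        is_addon = "(" in part and part.endswith(")")
        (addons if is_addon else native).append(part)
    return native, addons
```

A client that only understands the native parts can use the first list and drop the second; one that supports rel() can additionally act on the add-ons.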

So, in sum (sorry this got long), my hope is for a style of fragment identifier that can be mixed among existing ones and that can also provide a foundation for future identifier (or, as here, identifier "mixing") systems. Right now, that seems to be the XPointer style (#func()) concatenated with & and mixed alongside existing media type based fragment identifiers (as shown above).