w3c / wpub-ann

Web Annotation Extensions for Web Publications
https://w3c.github.io/wpub-ann/

Extend TextPositionSelector and DataPositionSelector #15

Closed tcole3 closed 6 years ago

tcole3 commented 6 years ago

This PR defines a meaning (identifying a position in the stream rather than a range of characters or bytes) for when start and end are equal (previously undefined).
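For readers unfamiliar with the two readings of start = end, here is a minimal sketch (not code from the PR). It assumes the Web Annotation convention that `start` is inclusive and `end` is exclusive; the helper name `apply_text_position_selector` and the sample text are illustrative only.

```python
def apply_text_position_selector(text: str, selector: dict) -> str:
    """Return the segment from start (inclusive) to end (exclusive)."""
    start, end = selector["start"], selector["end"]
    if not 0 <= start <= end <= len(text):
        raise ValueError("start/end outside the text")
    return text[start:end]


text = "Web Publications"
range_selector = {"type": "TextPositionSelector", "start": 4, "end": 16}
point_selector = {"type": "TextPositionSelector", "start": 4, "end": 4}

# Range reading: a non-empty run of characters is selected.
selected = apply_text_position_selector(text, range_selector)

# With start == end, plain range semantics yield an empty string.
empty = apply_text_position_selector(text, point_selector)

# Position reading: the same offsets name the gap between two characters.
before, after = text[:point_selector["start"]], text[point_selector["end"]:]
```

Under the range reading nothing is selected when start = end; under the position reading the offsets identify the point between `before` and `after`.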



tcole3 commented 6 years ago

See issue #9 for additional context and discussion. The introduction of a way to do side-bias (cf. EPUB CFI) is incomplete. We need an example of someone actually having used CFI side-bias, because I don't understand the use cases for having a locator with side-bias.

azaroth42 commented 6 years ago

Edit: Having scrolled the edits to the right in my view, I see the "between" now. Sorry, your math is correct!

-0. No text is actually selected. The only utility I can see is a character level bookmark for reading position -- are you at the bottom of the page, or the top of the following one. Seems like solving a very very very small edge case.

iherman commented 6 years ago

I think this is a 👎 for me:-(.

My problem is as follows. All other selectors, at least for HTML/XML content, select something that is in the DOM: either a DOM element node, a text node, or a range thereof. I.e., implementations can be built on top of the DOM representation, which is readily available in, say, a browser (@azaroth42 or @BigBlueHat should tell me if that is correct, they have implemented this stuff). However, we introduce here a very, very different notion, namely the selection "between" two characters (possibly between two DOM nodes). That would really complicate things for, as @azaroth42 puts it, a very very very small edge case.

Also: having this "bias" attached to one specific selector, namely the TextPositionSelector, does not sound o.k. to me either. After all, among all selectors, this is the most fragile one against changes in the text, i.e., if we give some sort of a bias, then adding something to the TextQuoteSelector would make more sense (if we understand what that means, of course).

Maybe a different approach is to stay as abstract as EPUB CFI, i.e., introduce the "bias" property for a Selector in general as an additional attribute, and consider it a "hint" for implementations that MAY use it. But we really need use cases.

tcole3 commented 6 years ago

Okay, this afternoon I will revise this PR to restore the TextPositionSelector and DataPositionSelector definitions. I agree that my first attempt overloaded the existing types, and not in a good way. Also, as was mentioned (cf. #9) by @iherman, changing any of the Web Anno definitions is a slippery slope, and, as mentioned (at some point) by @azaroth42, a natural programmer response with the current position selectors would be that start=end simply selects an empty range.

All this said, I still think it worthwhile (in anticipation of likely discussions at f2f) to introduce into this draft additional types that can be used to identify and reference a position in a byte or normalized text stream. As @iherman says this is fundamentally different than selecting a range, but I am open to the idea that there may be use cases in regard to Web Publications and I do not find specifying a way to do so inconsistent with this document's scope, especially if done as a separate type of selector/locator. So rather than close the PR immediately, I will make another try and modify this PR to add two new types: TextPositionLocator and DataPositionLocator.

To keep the existing Selector definitions intact for now, I will include bias only for these two new types, pending use cases. But I am open to the approach suggested by @iherman if use cases warrant.

If no use cases come forward over the next 4 weeks requiring a way to identify and reference positions in a text / byte stream (as opposed to selecting a range of the text / byte stream), we can kill the PR then.
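To make the range-versus-position distinction concrete, here is a rough sketch of what a position locator with side-bias might look like. Nothing here is defined by the draft: the `position` and `bias` field names, the "before" / "after" values (loosely modelled on EPUB CFI side-bias), and the `attached_character` helper are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TextPositionLocator:
    """Hypothetical locator: a point between characters, not a range."""
    position: int               # offset of the gap; 0 is before the first character
    bias: Optional[str] = None  # assumed values: "before", "after", or None


def attached_character(text: str, locator: TextPositionLocator) -> Optional[str]:
    """Return the character the locator is biased toward, if any."""
    if not 0 <= locator.position <= len(text):
        raise ValueError("position outside the text")
    if locator.bias == "before" and locator.position > 0:
        return text[locator.position - 1]
    if locator.bias == "after" and locator.position < len(text):
        return text[locator.position]
    return None  # no bias given, or no character exists on that side


# Offset 12 is the gap between the "." ending one page and the "S" starting the next.
text = "end of page.Start of next page"
end_of_page = TextPositionLocator(position=12, bias="before")
top_of_next = TextPositionLocator(position=12, bias="after")
```

Both locators share the same offset; only the bias says whether the reading position belongs to the bottom of one page or the top of the following one, which is the bookmark case raised earlier in the thread.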

iherman commented 6 years ago

@tcole3 +1 to all that, but I think just for our own sanity we should close this PR without further action and do what you plan to do in a different PR. Just an idea...

tcole3 commented 6 years ago

Okay, doing so now.