whatwg / dom

DOM Standard
https://dom.spec.whatwg.org/
Other
1.56k stars 290 forks source link

Range#expand() #431

Open foolip opened 7 years ago

foolip commented 7 years ago

https://jsbin.com/valihix/edit?html,output

This test shows existence of the API in Blink, EdgeHTML and WebKit, but not Gecko.

It might make sense to standardize alongside https://github.com/w3c/selection-api/issues/37, possibly using some shared logic.

It has been deprecated in Blink for a very long time, tracked by https://crbug.com/445218, but given the level of support I now think we should first investigate standardizing.

foolip commented 7 years ago

The API surface would be:

partial interface Range {
  void expand(DOMString unit);
};

The unit values are "word", "sentence", "block" and "document". Any other values are no-ops. If we really wanted to maybe we could use an enum which would result in an exception instead.

annevk commented 7 years ago

@littledan hey, it seems https://tc39.github.io/ecma402/ doesn't expose anything around word/sentence/etc. boundaries. Is that planned? I think ideally we make sure we use shared primitives to define these aspects, even if below they end up depending on platform conventions and libraries.

littledan commented 7 years ago

Yes, it is proposed, in https://github.com/tc39/proposal-intl-segmenter , currently at Stage 2. I'm not really worried about a mismatch in semantics, as probably browsers will implement both of them using the same underlying internationalization libraries (usually ICU).

That draft spec isn't really well-factored in terms of exposing something HTML can call into--should it be?

The granularities that Intl.Segmenter exposes are grapheme word sentence and line. It's nice to see that two of those line up with the same name and the same semantics. What does block mean--is this like selecting until the next hard line break?

FWIW Intl.Segmenter's constructor throws an exception if you pass in a value that's not one of those four.

annevk commented 7 years ago

@rniwa ^^

I think it would be nice if all three API entry points could call into the same underlying primitive. Obviously Selection and Range have to do some processing around that for it to make sense for their contexts.

rniwa commented 7 years ago

No, what's exposed to ICU is completely different from what we use to modify selection at least in WebKit on Mac and iOS, and they're intentional. For example, ICU on iOS would treat each character in CJK as a separate word for line breaking purposes but for the purpose of selecting text, it would treat multiple characters as a single word. There are many other edge cases and special cases for selection, line breaking, word wrapping, etc...

annevk commented 7 years ago

Sure, but instead of ICU you could use your own implementation. Nothing here is requiring ICU. The idea is just that they share the same implementation.

rniwa commented 7 years ago

Well, on Mac, for example, we emulate clicking on a word (using AppKit) in order to determine the word boundary, and that's nothing to do with ICU. In fact, there is no way for us to expose that to Intl.Segmenter. Also, we do use ICU in other places. Modifying selection is just very special.

domenic commented 7 years ago

In fact, there is no way for us to expose that to Intl.Segmenter

Does that mean you're unable to implement the "word" part of the Intl.Segmenter proposal? If so it would be good to raise this at TC39, as the proposal is pretty far along in the process.

rniwa commented 7 years ago

In fact, there is no way for us to expose that to Intl.Segmenter

Does that mean you're unable to implement the "word" part of the Intl.Segmenter proposal?

No, we can totally implement that for Intl.Segmenter because we have ICU, and we use that for word breaking, etc... what I'm saying is that the word boundary behavior for selection is completely different from what we use for line breaking and Intl.Segmenter. It requires a hit testing in AppKit.

As a general rule, I don't think it's possible to spec almost anything about how selection behaves in WebKit on Mac & iOS based on a lower level primitive due to the way our stuff is integrated with platform APIs, and they're only remotely related to ICU's word boundary definitions on either OS.

littledan commented 7 years ago

This seems entirely reasonable to me. Seems like we can't expose a text-only primitive for Range#expand, even if we might be able to for line breaking (given the right tweaks).

rniwa commented 7 years ago

For your entertainment, TextBoundaries.mm#L183 is the code in WebKit that uses doubleClickAtIndex in AppKit. It's absolutely horrifying but nonetheless, that's what we use for word breaking in our editing / selection code.

annevk commented 7 years ago

I'm really curious why hit testing cannot be avoided. I'm also somewhat interested in what other implementations end up doing. If they all end up doing the same as Intl.Segmenter, having the abstraction could still be useful, even though we could allow difference for selection. (It's a little sad that Range ends up being impacted as well, as it's sorta supposed to be a non-UX API.)

rniwa commented 7 years ago

Sorry, hit testing was a red-herring to say. I don't think AppKit does actual hit-testing. It's likely doing some magic with ICU. I said hit testing because the method is called doubleClickAtIndex. It doesn't go through any sort of hit testing in CSS boxes or any DOM nodes for that matter.

My point was mainly that we get a string and pass it to AppKit, and AppKit does something with it, which is completely opaque to us. So we can't really describe it in terms of ICU operations.

annevk commented 7 years ago

Sure, but we shouldn't describe anything in terms of ICU operations, since ICU is just an implementation. The actual description needs to be flexible enough either way.

domenic commented 7 years ago

I guess the question in terms of implementations is whether WebKit/JSC would plan to implement Intl.Segmenter's word mode using ICU or AppKit.

rniwa commented 7 years ago

@annevk : well, the fundamental problem here is that we use ICU for all other word breaking purposes except editing & selection. I was using the term "ICU" as a way of referring to the word boundary definition for Intl.Segmenter. To put it another way, the word boundary behavior for selection can't be same as the one used for line breaking and Intl.Segmenter etc...

I don't think Web authors want our word boundary behavior for selection to be the word boundary behavior of Intl.Segmenter either. For starters, they have specific quirks that are only desirable and useful in the context of selecting a word. And it's going to be extremely slow.

rniwa commented 7 years ago

I guess the question in terms of implementations is whether WebKit/JSC would plan to implement Intl.Segmenter's word mode using ICU or AppKit.

We definitely don't want to be implementing it using AppKit because it doesn't expose enough information about other granularities ICU does: https://developer.apple.com/reference/foundation/nsattributedstring?language=objc

domenic commented 7 years ago

For starters, they have specific quirks that are only desirable and useful in the context of selecting a word.

That's what I was missing, thanks.

yosinch commented 7 years ago

I would like to oppose standardize Range#expand(). I prefer to get rid of Range#expand() from browsers. Reason of my objection of standardize Range#expand() are:

  1. DOM base Range#expand() isn't useful since I'm not sure about use case and DOM base can't handle collapsed whitespaces,, or it can be done by Intl.Segmenter+Range.toString()
  2. Visible character base Range#expand(), Blink and WebKit do so, is hard to spec'd, e.g. Element#innerText. I think visible character in DOM spec is layering violation.
  3. Implementation of Range#expand() and Selection#modify() should be different == it is hard to share code, e.g. Range#expand() may not support line and page unit. Result of word unit on Selection#modify() are different among platforms, e.g. Windows includes trailing whitespace but others don't.

I think it is better to provide "visible character iterator" as API, in editing spec, and using Intl.Segmenter is better than providing high-level API Range#expand().

annevk commented 7 years ago

@yosinch are you in the position to get it removed from one of the three browsers mentioned in OP?

rniwa commented 7 years ago

FWIW, we will not attempt or consider to remove this API at this moment due to backwards compatibility concerns.

yosinch commented 7 years ago

@annevk, I'm working for Blink at Google, and owner of Blink editing. I'm also a member of editing-tf@ For Blink, usage of Range#expand() is low and we would like to reduce maintenance cost, I would like to remove Range#expand().

annevk commented 7 years ago

@yosinch ah okay, if you're successful in removal we can file a bug against Edge to see if they can consider it too. Maybe then down the line WebKit can reconsider. I'm happy to delay standardizing contentious APIs.

foolip commented 7 years ago

@yosinch, do you know any editing people working on Edge? Getting their thoughts on this and Selection#modify() would be good.

Compat risk wise, Range#expand() has lower usage than many things we have successfully removed, but it would still require httparchive analysis to get an idea about what the breakage would look like. Unfortunately this API is not very grep friendly, my experiments with BigQuery are quite noisy. Happy to help dig in that if you want to pursue removal though.

yosinch commented 7 years ago

@yosinch, do you know any editing people working on Edge? Getting their thoughts on this and Selection#modify() would be good.

I don't know Edge people working on editing. Last time editing meeting in TPAC , Travis joined for discussion.

foolip commented 7 years ago

@travisleithead

gked commented 7 years ago

I am of the same opinion as rniwa. We should still be backwards compatible with the web. EdgeHtml will keep the implementation as is. We could, of course, deprecate it in the spec so authors would be discouraged from using it.

annevk commented 7 years ago

@gked it's already maximally deprecated by not being in the DOM Standard to begin with.

foolip commented 7 years ago

@gked, do you have any sense of how interoperable the EdgeHTML implementation is with Blink and WebKit? Does it operate on the DOM, the layout tree, or is it all tied up with editing as it is in Blink/WebKit?

annevk commented 6 years ago

@yosinch have you made any progress here?