Should the last beforeinput to occur before compositionend be cancellable?

BoCupp-Microsoft commented 2 years ago

@masayuki-nakano suggests that the last beforeinput event to occur before compositionend should be cancellable in this comment. Cancelling the beforeinput event would remove the composed text instead of committing it to the DOM.

Creating this issue to discuss whether this should be the case.

BoCupp-Microsoft commented 2 years ago

One of the reasons you cited, @masayuki-nakano, is that the author would need to implement undo/redo themselves if they were to use the approach I suggested here. As far as I know, however, if the author is using script to insert the composition text themselves, no matter the mechanism they used to remove or prevent the composed text from being inserted into the DOM, the author will need to implement undo and redo themselves since scripted changes to the DOM are not undoable.

Could you clarify the scenario you'd like to make work?

DavidMulder0 commented 2 years ago

Just want to express a strong 'in favor'.

I do want to note however that I am confused by

> I believe they are free to do so using the last reported range of the beforeinput event which precedes compositionend.

As my understanding would be that this wouldn't account for recomposition.

So, the initial state is "The homme is large". Then the user clicks into 'homme', so we have "The hom|me is large" (| = cursor, and italics is the active recomposition zone). One of the text suggestions shown is 'hommels', so they press it, and the text changes to "The hommels is large", and once you hit space, "The hommels is large". I once again forgot (I look at this about once per 3 months 😅🤓) exactly when composition starts and ends in that process, but deleting the last targeted range wouldn't get us back to "The homme is large", but probably to "The is large".
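The recomposition concern can be checked with plain string arithmetic. This is a minimal sketch, assuming the last reported range covers the whole recomposed word; `replaceRange` is a hypothetical helper that models the text as a string, not a browser API.

```javascript
// Hypothetical helper that models the editable text as a plain string.
function replaceRange(text, start, end, replacement) {
  // Swap the characters in [start, end) for `replacement`.
  return text.slice(0, start) + replacement + text.slice(end);
}

const composed = "The hommels is large"; // after choosing the "hommels" suggestion

// Assumption: the last reported range covers the whole recomposed word.
const start = composed.indexOf("hommels");
const end = start + "hommels".length;

// Deleting that range removes the entire word, leaving a double space.
const afterDelete = replaceRange(composed, start, end, "");

// Getting back to the initial text requires knowing the pre-composition word.
const restored = replaceRange(composed, start, end, "homme");
```

In this model, deleting the range alone cannot recover "homme"; the original word has to come from somewhere else, such as a snapshot taken when composition started.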

masayuki-nakano commented 2 years ago

Oh, I didn't realize I was cc'ed, sorry (and anyway I don't have much time to look into this widely, as I've been unwell since my third vaccination).

There is a scenario, yes. Traditionally, Gecko used a set of composition events to represent text input that does not simply come from a keyboard event: for example, one key press that inputs multiple characters on Linux and macOS, or text input without a keyboard, e.g., choosing an accented character from a popup on macOS. So, in Gecko's initial implementation of the beforeinput event, beforeinput was dispatched with inputType="insertCompositionText" for such input.

However, we got a bug report that such input cannot be canceled from a beforeinput event listener only in Firefox. We then investigated the difference between Gecko and the other browsers and, surprisingly, found that the other browsers dispatch only a beforeinput event whose inputType value is insertText, which is of course cancelable (i.e., the event follows neither keypress nor compositionupdate, which means that legacy apps which do not use the beforeinput event cannot handle the input). Therefore, I switched Gecko to follow the other browsers' behavior to make the beforeinput event cancelable.

This means that the applications which hit this compatibility issue are not aware of IME, because text input caused by IME composition cannot be canceled with only a beforeinput event listener. Therefore, to make IME users around the world happier, I think that the beforeinput event for committing a composition should be cancelable. Then web apps would not need a special handler for IME. I.e., some web apps may become IME-aware through a change on the browser side alone.

FYI: see also https://github.com/w3c/uievents/issues/202. That issue is not directly related to this one, but in Chrome web apps cannot know the composition commit timing from the input event either, because isComposing is set to true in every input event caused by IME composition.
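One way an app might work around the missing commit signal is to derive it from the composition events instead. This is only a sketch: `CompositionTracker` is a hypothetical name, it is fed plain event-like objects, and it assumes the `data` on compositionend is the committed text.

```javascript
// Hypothetical tracker: derives commit timing from composition events,
// because input events alone may always report isComposing === true.
class CompositionTracker {
  constructor() {
    this.composing = false;
    this.committed = [];
  }

  // Accepts event-like objects with `type`, and optionally `inputType`/`data`.
  handle(event) {
    switch (event.type) {
      case "compositionstart":
        this.composing = true;
        break;
      case "compositionend":
        this.composing = false;
        this.committed.push(event.data); // assumed to be the committed text
        break;
      case "input":
        // Only non-IME input commits here; composition text is ignored so the
        // result does not depend on whether input fires before or after
        // compositionend.
        if (
          !this.composing &&
          event.inputType !== "insertCompositionText" &&
          event.data != null
        ) {
          this.committed.push(event.data);
        }
        break;
    }
  }
}
```

In a page, the same `handle` method could be registered for compositionstart, compositionend, and input on the editable element.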

What I'd like to say is that the current beforeinput and input events are not enough to handle IME composition because of those issues. I believe these issues should be fixed in the spec and on the browser side, so that handling text input from any input source becomes easier and is possible with only beforeinput and/or input event listeners.

Azmisov commented 1 year ago

Why can't we just implement the deleteCompositionText + insertFromComposition events from Level 2 instead? The purpose of those two events was to allow for this case specifically, e.g. deleteCompositionText clears out the IME text, and canceling insertFromComposition keeps it that way. Having the last beforeinput event cancelable is so messy in comparison: you have to repeatedly call preventDefault, cache the event data, and then use the separate compositionend event to handle that data. I don't see the merit in adding new features to the Level 1 spec when you can just implement parts of Level 2.
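A rough sketch of how a handler for the Level 2 flow might look, assuming a browser that dispatches both events. `handleLevel2BeforeInput` and `model` (an app-side text buffer) are hypothetical names used only for illustration.

```javascript
// Sketch of a beforeinput handler for the Level 2 flow described above.
// `model` is a hypothetical app-side text buffer, not a browser API.
function handleLevel2BeforeInput(event, model) {
  if (event.inputType === "deleteCompositionText") {
    // Not canceled: the browser clears the in-progress IME text from the DOM.
    return;
  }
  if (event.inputType === "insertFromComposition") {
    // Canceling keeps the composed text out of the DOM...
    event.preventDefault();
    // ...so the app inserts it itself.
    model.text += event.data;
  }
}
```

It would be registered with something like `element.addEventListener("beforeinput", (e) => handleLevel2BeforeInput(e, model))`.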

masayuki-nakano commented 1 year ago

> Why can't we just implement the deleteCompositionText + insertFromComposition events from Level 2 instead?

Basically, beforeinput and input events should be fired as a pair except when beforeinput is canceled. Therefore, using the Level 2 proposal as-is may break web apps in the wild which do not check the inputType of input events, since from the point of view of such web apps it looks as if web browsers suddenly start to dispatch redundant input events.
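To make the concern concrete, here is a small simulation of a listener that reacts to every input event without checking inputType. The two event sequences are assumptions for illustration only; the exact order a browser would dispatch is not taken from the spec text.

```javascript
// A listener that reacts to every input event without checking inputType.
function countUpdates(events) {
  let updates = 0;
  const onInput = () => { updates += 1; };
  for (const e of events) {
    if (e.type === "input") onInput(e);
  }
  return updates;
}

// Assumed commit of one composition without the Level 2 events.
const level1Commit = [
  { type: "input", inputType: "insertCompositionText", data: "に" },
];

// Assumed commit with the Level 2 events added; order is illustrative.
const level2Commit = [
  { type: "input", inputType: "insertCompositionText", data: "に" },
  { type: "input", inputType: "deleteCompositionText", data: null },
  { type: "input", inputType: "insertFromComposition", data: "に" },
];

const level1Updates = countUpdates(level1Commit);
const level2Updates = countUpdates(level2Commit);
```

Under these assumptions the naive listener runs three times instead of once for the same commit.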

Azmisov commented 1 year ago

I would counter that deleteCompositionText and insertFromComposition are already implemented and fire in Safari/Webkit, so those web apps are already broken.

masayuki-nakano commented 1 year ago

> I would counter that deleteCompositionText and insertFromComposition are already implemented and fire in Safari/Webkit, so those web apps are already broken.

Yes, they are. And some web apps may not support Safari/WebKit, especially when they are meant to be used only with specific browsers, like intranet web apps. Spec changes should not accept any potential backward compatibility risk.

Azmisov commented 1 year ago

I think you're too worried about a hypothetical case. InputEvents spec already introduced many new inputTypes to input event, so that was already a breaking spec change by this definition. Any web developer reading the InputEvents spec would have realized browser support is in flux, and a browser may implement Lvl 2 events at any time (e.g. the addition of insertFromComposition). Even in my code, I assumed additional event types might be added in the future. What if I had a web app that relies on e.preventDefault always being a no-op for insertCompositionText? The original proposed idea would break my web app instead of the one that doesn't check e.inputType in input event.

I'll defer to whatever the spec editors decide, but I would prefer more parity with Lvl 2 over supporting quirky web apps.

johanneswilm commented 1 year ago

> Spec changes should not accept any potential backward compatibility risk.

@masayuki-nakano Using that logic, no browser can ever change anything about execCommand or contenteditable - yet they all are.

I agree with @Azmisov that implementing level 2 everywhere would be the best solution seen from the perspective of a web developer. The reason why we cannot do that is that apparently there is something in Android IME that is beyond the control of the Chrome developers we are talking to that makes it impossible to implement these events on Android and therefore the Chrome team has decided that they will not implement some specific parts of level 2 even on desktop. It severely cripples the specification, but there should hopefully be some relief with EditContext in the not too distant future.

masayuki-nakano commented 1 year ago

Unlike execCommand etc., the input event is widely used even by non-editor applications to update visual feedback, optimize UX, etc., so the impact of changing its behavior is much worse than for APIs which are used only by editor apps.

I wonder, is there a reason why we should not make the last beforeinput event of a composition commit cancelable? I still believe that it's the simplest and safest solution for canceling IME composition only when the composition is committed.

Azmisov commented 1 year ago

I've been thinking about this, and I think it may be fine if cancelable was set to true for that final event. If we see e.cancelable == true && e.inputType == 'insertCompositionText', the browser should guarantee:

- the composition has been deleted already from a previous empty insertCompositionText; probably not a good idea to have the content pre-deleted without a notifying event, since that's not what insertCompositionText does currently, and we don't want to change the behavior just for that final event (canceling it should be opt-in I think)
- deleteCompositionText/insertFromComposition won't be fired
- if e.preventDefault is called
  - defaultPrevented is set to true
  - a compositionend will fire next (not necessarily ending the composition, but simply signaling that that portion of composition was to be committed to the DOM as described here)
  - we are safe to manipulate the DOM during the beforeinput event handling without quirky behavior
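A sketch of how a handler could rely on those guarantees, assuming the proposed behavior (no browser is claimed to implement it). `handleFinalCompositionText` and the `commit` callback are hypothetical names.

```javascript
// Sketch under the proposal above: only the final insertCompositionText is
// dispatched with cancelable === true. `commit` is a hypothetical callback.
function handleFinalCompositionText(event, commit) {
  const isFinal =
    event.inputType === "insertCompositionText" && event.cancelable;
  if (!isFinal) return false; // mid-composition update: leave it alone

  event.preventDefault(); // opt in to handling the commit ourselves
  if (event.defaultPrevented) {
    // Per the listed guarantees, changing the DOM here would be safe.
    commit(event.data);
  }
  return event.defaultPrevented;
}
```

Checking `defaultPrevented` after calling `preventDefault` mirrors the first guarantee in the list: the handler only takes over when the cancel actually took effect.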

Otherwise, assume the standard Lvl 2 spec events may/may not fire prior to compositionend (depending on browser support). I'll reiterate I don't think your argument about backwards compatibility is very strong. And this proposal I think could also break some obscure apps as well, e.g.:

// `div` is assumed to be the editable element
div.addEventListener("beforeinput", e => {
    e.preventDefault();
    // defaultPrevented only becomes true if the event was actually cancelable
    if (e.defaultPrevented){
        // switch statement with all lvl 2 cancelable inputTypes
        switch (e.inputType){
            case "insertText":
                // insert manually ...
                break;
            // ... all other lvl 2 cancelable inputTypes here ...
            // e.g. not including insertCompositionText since uncancelable in all browsers
            default:
                throw new Error("unexpected event was prevented");
        }
    }
});

This code would break if we added any additional inputType or if we started changing which events are cancelable. Not very robust code of course, but neither is code that doesn't check the inputType. I think 99% of cases that use input without checking inputType are reactive applications that update their state in response to changing input, and those I think would be unaffected by additional inputTypes. Finally, it is simply more annoying for developers making cross-browser apps if there are two differing implementations. This isn't to argue we shouldn't do anything. If the only thing browsers are willing to implement is canceling the final insertCompositionText, I'd rather have that (with cancelable set and guarantees listed above) than nothing.
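For contrast, a more tolerant variant might cancel only the input types it knows how to apply and let everything else fall through to the browser. This is a sketch; the `handlers` map, `onBeforeInput`, and the `model` buffer are hypothetical names.

```javascript
// Input types this hypothetical editor knows how to apply by hand.
const handlers = new Map([
  ["insertText", (model, e) => { model.text += e.data; }],
  ["deleteContentBackward", (model) => { model.text = model.text.slice(0, -1); }],
]);

function onBeforeInput(event, model) {
  const apply = handlers.get(event.inputType);
  // Unknown or uncancelable events fall through to the browser default,
  // so new inputTypes or cancelability changes do not throw.
  if (!apply || !event.cancelable) return false;
  event.preventDefault();
  apply(model, event);
  return true;
}
```

Because the decision is made before calling `preventDefault`, adding a new inputType only means adding a map entry.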

Can we discuss what exactly "deleting the composition text" will do? The spec doesn't go into details about what the deleteCompositionText event should do, and I'm having trouble triggering it in Safari (I'm testing in a macOS VM). The purpose of beforeinput is to have predictable control over the DOM's state during editing. For all but composition events, you can guarantee this by canceling the event and manipulating the DOM yourself. I would hope that there would be some way to control the DOM cleanly for composition as well. This isn't really a concern for plaintext, since usually you don't care what text was entered, just what styling elements were changed. Here are some examples from my testing of DOM manipulations composition does:

Firefox:
(type) <b>text te</b><i>ex|</i> → <b>text <i>teext|</i></b><i></i>
(type) <span class='b'>text te</span><span class='i'>ex|</span> → <span class='b'>text teext|</span><span class='i'></span>
(type) text te<u>ex| text</u> → text <u>teext| text</u>

Chrome:
(type) <b>text te</b><i>ex|</i> → <b>text te</b><i>ext|</i>
(autocomplete) <b>text te</b><i>ext|</i> → <b>text texted |</b>
(type) <span class='b'>text te</span><span class='i'>ex|</span> → <span class='b'>text te</span><span class='i'>ext|</span>
(autocomplete) <span class='b'>text te</span><span class='i'>ex|</span> → <span class='b'>text texted </span>
(autocomplete) text te<u>ex| text</u> → text texted|<u> text</u>

Notice that composition events will delete elements, delete text nodes, leave elements empty, create new elements, and move text to different elements. If you'd like to test yourself, I've made an Input Events Tester site to see how different browsers manipulate the DOM.

I really think that "deleting the composition text" should actually revert the DOM to its state at the beginning of composition. The target ranges (either for insertFromComposition or a final cancelable insertCompositionText) should then be defined with respect to this reverted DOM. If left uncanceled, the browser can proceed as it pleases, deleting/creating nodes in whatever manner it wants. But when canceled, it will be just like every other inputType, without unexpected resulting DOM state. If "deleting the composition text" doesn't revert the DOM, developers are forced to use mutation events, DOM diffing, or some other complex logic to try to track the target ranges (which may change mid-composition) to do it themselves. But at that point, there's almost no point in using beforeinput (for apps that want to support composition) since you have to do all the work yourself like the old way of doing things. I'd also argue the browser can do the DOM reversion a lot more efficiently than is possible with the DOM APIs, since 1) the browser knows its own logic for doing edits, 2) the browser has more innate knowledge of what ranges the composition will or might modify, and 3) it is written in a compiled language. Perhaps internally it tracks a range for the bounds of the composition, clones that range, modifies it as usual during composition, and then replaces the range with the original cloned copy.
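The clone-and-replace idea can be illustrated on a toy model. This sketch uses a flat array of styled runs as a hypothetical stand-in for the DOM inside the composition bounds; it says nothing about how a browser would implement the reversion internally.

```javascript
// Hypothetical DOM stand-in: a flat list of styled runs.
function snapshot(runs) {
  return runs.map((run) => ({ ...run })); // copy each run so later edits don't leak
}

const doc = [
  { tag: "b", text: "text te" },
  { tag: "i", text: "ex" },
];
const atStart = snapshot(doc); // taken when composition starts

// Simulate the browser's own edit, modeled on the Chrome autocomplete case
// listed earlier, where both runs collapse into one <b>.
doc.splice(0, doc.length, { tag: "b", text: "text texted " });

// "Deleting the composition text" as proposed: put the original runs back.
doc.splice(0, doc.length, ...snapshot(atStart));
```

After the revert, the app could apply the committed text in its own way, with target ranges that refer to a structure it already knows.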

Of course, reverting the DOM just acts to encapsulate the composition edits. While composition is happening, you'll still get whatever edits the browser applies, and unfortunately it looks like we need to wait for EditContext or another API to be able to prevent that fully. E.g., if writing a code editor, the syntax highlighting won't work with compositions, since the browser will combine the text with the styles of previous identifiers/variables/etc. However, I have discovered a workaround that allows some limited control over what style the browser applies: you can prepend a zero-width Unicode space (U+200B) to your styled element. Visually you won't see anything, but the space breaks up the IME composition so it is restricted to a single text node. This wouldn't work for IMEs that operate across spaces though (if such a thing exists).
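A minimal sketch of the zero-width space workaround at the string level. `guard` and `unguard` are hypothetical helper names; an editor would apply `guard` to the text of each styled element and `unguard` when reading the content back.

```javascript
const ZWSP = "\u200B"; // zero-width space

// Prefix a styled run's text so an IME composition stays inside one text node.
function guard(text) {
  return text.startsWith(ZWSP) ? text : ZWSP + text;
}

// Strip the guards again when reading the document's real content.
function unguard(text) {
  return text.split(ZWSP).join("");
}
```

The guard is idempotent, so re-rendering a run that is already prefixed does not stack up extra invisible characters.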

In addition to reverting the DOM as described above, I think the last thing needed to cleanly implement editors is some way to control the bounds of composition. That way, the jankiness in composing+reverting can be minimized. For example, if I have a <tag>text</tag><tag>text</tag>, there needs to be some guaranteed way to tell the browser that the text in those two elements is unrelated and shouldn't be strung together in a composition (where my hack is to prepend a zero-width space to indicate this). This can be as simple as an HTML attribute compositionborder="true". Secondly, as the composition proceeds, there should be some way to indicate that composition should stop... the browser already does this when you press space, period, parentheses, etc., so why can't we examine the text ourselves in JavaScript and notify the browser that composition should stop? E.g. we need some kind of requestCompositionEnd method for beforeinput. An example is typing "text-text", where if it were a math/programming expression, you want the hyphen to be interpreted as an operator, and so it should be treated like a period/space in human language. So the idea is slightly different from preventDefault... we don't want to cancel the composition, we just want to notify the browser the composition string should end. I was looking at Firefox's source documentation, and they have a requestCompositionEnd-like API internally already. I suspect other browsers have something similar, so it would be nice if that API could be exposed via DOM APIs. There seem to be somewhat reliable, hacky methods people use to stop composition already: changing window selection or blurring, then quickly putting the cursor back or focusing again (see here or usage in CodeMirror); the point being it is possible in current browsers, there is just no API to expose it.
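The blur-and-refocus hack mentioned above can be sketched in a few lines. `forceCompositionEnd` is a hypothetical name, and whether a given browser and IME actually end the composition this way is an assumption; the thread only describes it as somewhat reliable.

```javascript
// Hack sketch: blurring and immediately refocusing the editable element,
// which reportedly makes the browser end the active composition.
// `el` is any object with blur() and focus(), e.g. an HTMLElement.
function forceCompositionEnd(el) {
  el.blur();
  el.focus();
}
```

A real editor would likely also need to save and restore the selection around this call, since blurring can move the cursor.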

So the purpose is not necessarily to control the exact text that is entered, but more the DOM structure around that text. I'll elaborate here that requestCompositionEnd would simply cache whatever data or aggregate inserts/deletes state was at the point-in-time of the accompanying insertCompositionText event. Whether the IME needs to be restarted, reset, etc does not matter. The idea is you'd call this when you want to enclose some or all of the text in a compositionborder. For the hyphen example earlier, the insertCompositionText will fire with e.data=='text-'; this text is cached in requestCompositionEnd and then we insert it into the reverted DOM later during insertFromComposition event like so: <span compositionborder=true class='variable'>text</span><span compositionborder=true class='operator'>-</span>, placing the cursor after the operator span.

The usage flow would be like so: Listen to beforeinput and examine the text for operators or other special characters that you need to style differently; call requestCompositionEnd() when this is detected. Call preventDefault in the subsequent insertFromComposition or final insertCompositionText (alternative API). The DOM is as you left it before, so you can modify it without any undoing/tracking changes. For the newly inserted special characters you detected, wrap them in an element with compositionborder="true". Browser will restart composition, but this time it will continue with just the special characters, or perhaps another empty text node that you created if the special characters were singletons.
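The first steps of that flow might look like the sketch below. Note that `requestCompositionEnd` is the method proposed in this comment and does not exist in browsers, so the code only calls it when present; `splitRuns`, `onCompositionText`, and the operator set are hypothetical choices for illustration.

```javascript
const OPERATOR = /([-+*\/=])/; // assumed operator set for the example

// Split composed text into identifier and operator runs, e.g. "text-".
function splitRuns(data) {
  const runs = [];
  for (const part of data.split(OPERATOR)) {
    if (part === "") continue; // split() can yield empty strings at the edges
    const cls = /^[-+*\/=]$/.test(part) ? "operator" : "variable";
    runs.push({ cls, text: part });
  }
  return runs;
}

// Hypothetical: ask the browser to finalize the composition when an
// operator shows up in the composition text.
function onCompositionText(event) {
  const hasOperator = OPERATOR.test(event.data ?? "");
  if (hasOperator && typeof event.requestCompositionEnd === "function") {
    event.requestCompositionEnd();
  }
  return hasOperator;
}
```

Each run returned by `splitRuns` would then become its own element marked with the proposed compositionborder attribute when the commit event is canceled and handled by the app.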

CodeMirror is a good example of a widely used library that would benefit from this. Try typing this-125 on their homepage JavaScript demo using Chrome Android. The string will stay black without syntax highlighting, until some polling happens to trigger the composition to cancel and it updates highlighting. With a requestCompositionEnd and compositionborder (and of course reverting the DOM as described before so that composition is controllable), CodeMirror could be updated so there wasn't that jank in style update on mobile.

TLDR; That's a lot of talking, so here's a summary of what I'm proposing:

- Allow the final insertCompositionText to be canceled if browser devs are unwilling to do the deleteCompositionText/insertFromComposition flow; better to have some way of doing it than none. How it should behave is described at the start of this comment. If implemented as I described, it should be as simple as (e.inputType == "insertFromComposition" || e.inputType == "insertCompositionText" && e.cancelable) to support both APIs in code.
- The DOM should be in a reverted state when insertFromComposition, or a final cancelable insertCompositionText is dispatched; the event target ranges and data should be with respect to this reverted DOM. Reverting the DOM should occur in deleteCompositionText or an empty insertFromComposition immediately before this event.
- Add an HTML attribute compositionborder='true' or similar which signals to the browser that compositions should not cross the borders of the element; composed text should be restricted to that element and any children that do not have compositionborder set.
- Add a requestCompositionEnd() to beforeinput, which tells the browser to finalize the composition string (not cancel) and thereafter dispatch the deleteCompositionText/insertFromComposition, final insertCompositionText (alternative API), and subsequent events.
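The combined check from the first summary point could be wrapped in a small predicate. This is a sketch assuming the proposed cancelable final insertCompositionText; `isCompositionCommit` is a hypothetical name.

```javascript
// True for either commit-style event: Level 2's insertFromComposition, or the
// proposed final insertCompositionText that is dispatched as cancelable.
function isCompositionCommit(e) {
  return (
    e.inputType === "insertFromComposition" ||
    (e.inputType === "insertCompositionText" && Boolean(e.cancelable))
  );
}
```

The inner parentheses only make the precedence explicit: `&&` already binds tighter than `||`, so the unparenthesized expression in the summary evaluates the same way.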

w3c / input-events

Should the last beforeinput to occur before compositionend be cancellable? #134