mathiasbynens opened this issue 1 year ago
Firefox developers must have intended for it to work with `String.fromCharCode(event.charCode)`, and IIRC, a pair of `WM_CHAR` messages is sent by Windows for a surrogate pair input.
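That legacy accumulation pattern can be sketched as follows (a hypothetical example, not code from any browser or from this thread): each `keypress` delivers one UTF-16 code unit in `charCode`, and appending via `String.fromCharCode` happens to reassemble the surrogate pair.

```javascript
// Sketch of the keypress-per-surrogate accumulation pattern. Each event
// carries one UTF-16 code unit in charCode; String.fromCharCode turns the
// code unit back into a string, and concatenation re-pairs the surrogates.
function makeAccumulator() {
  let text = "";
  return {
    // Simulates a listener: textarea.addEventListener("keypress", ...)
    onKeypress(event) {
      text += String.fromCharCode(event.charCode);
    },
    value() { return text; },
  };
}

// On Windows, Firefox/Chrome dispatch two keypress events for U+1F600,
// one per surrogate half (0xD83D, then 0xDE00).
const acc = makeAccumulator();
acc.onKeypress({ charCode: 0xD83D });
acc.onKeypress({ charCode: 0xDE00 });
// acc.value() === "\u{1F600}", because the two code units pair up.
```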
@masayuki-nakano Thank you for clarifying this. Since this is the case, it would seem that there is some misalignment with Safari because Safari is only available on macOS and macOS dispatches the entire surrogate pair for an input.
With this information, there are three viewpoints to this problem:
|  | Chrome | Firefox | Safari |
|---|---|---|---|
| Windows | 2 | 2 | N/A |
| macOS | 1 | 1 | 1 |

|  | Chrome | Firefox | Safari |
|---|---|---|---|
| Windows | 1 | 1 | N/A |
| macOS | 1 | 1 | 1 |
From the browser perspective, we should normalize behavior across platforms based on (1) or (2). This implies the following tables:
Table (a)

|  | Chrome | Firefox | Safari |
|---|---|---|---|
| Windows | 2 | 2 | N/A |
| macOS | 2 | 2 | 2 |

Table (b)

|  | Chrome | Firefox | Safari |
|---|---|---|---|
| Windows | 1 | 1 | N/A |
| macOS | 1 | 1 | 1 |
I suggest we align with the OS perspective for the following reasons:
`keypress` is deprecated and already has cross-browser misalignment, and (2) and (3)(b) are not practically expected by the user.

I don't think aligning to OS behavior is a good approach. I guess that most web developers do not check/test key input behavior on all major platforms per browser. Therefore, inconsistent behavior between OSes may inconvenience end users.
`keypress` shouldn't be used for text input handling in new web apps. Therefore, existing web apps using `keypress` events for that purpose may not be maintained. If so, fixing the incompatible behavior may break some of them.
On the other hand, I don't know of any keyboard layout which has a key that inputs a non-BMP character. Therefore, this issue may appear only in specific environments. (Note that for things like the Emoji palette in each OS, browsers do not handle the input as a key sequence, so this is really a special case for most users.)
FYI: a bug in Firefox
Ah, this may be a dup of #227 (although it's InputEvent).
Note that if you enter e.g. an emote using the Windows On-Screen Keyboard then that will be expressed as two distinct keydown/keypress/keyup sequences, with the keypress part of each describing one of the two UTF-16 surrogates.
For those events, though, the VKEY value is "PACKET" - the keydown/keyup events are essentially a platform-specific quirk that conveys almost no extra information - so it could make sense for the browser to simply drop those events, and coalesce the WM_CHARs into a single valid Unicode code-point.
As Masayuki points out, though, given that the `keyCode` field and `keypress` event are deprecated and their spec is non-normative, should it really define a behaviour, or simply document the UTF-16 & UCS4 models for `keypress` events, and tweak the MUST wording for the implementation of `keypress` for non-BMP?
@drwez When using the On-Screen Keyboard, Internet Explorer and Edge both send a single event. This is not a Windows platform issue, it's not an OS issue.
The bad behavior is specific to Firefox / Chrome on Windows, and unfortunately it does cause real-world problems:
https://github.com/Pauan/rust-dominator/issues/10
https://github.com/rustwasm/wasm-bindgen/issues/1348
It's a widespread problem that affects many events, not just keydown/keypress/keyup. Even idiomatic events like `input` have the same problem.
Safari's behavior is correct. Internet Explorer / Edge behavior is correct.
Chrome and Firefox are simply buggy, they are sending invalid incorrect strings. They should be fixed so that they behave correctly and consistently. Here are the relevant bug reports for those browsers:
https://bugzilla.mozilla.org/show_bug.cgi?id=1541349
https://bugs.chromium.org/p/chromium/issues/detail?id=949056
Ideally this behavior would be specified in the spec, so that way it is easier for the browsers to coordinate their behavior.
I don't know what the wording should be, but the behavior should be something like: "if an input character is outside of the BMP then the browser MUST NOT send multiple events (one event per surrogate half); instead it MUST send a single event (which contains the entire surrogate pair)".
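For illustration (a hypothetical helper, not spec text), a lone surrogate half, which is exactly what the split-event behavior hands to pages, is easy to detect:

```javascript
// Detects whether a string from an event (e.g. event.key or
// InputEvent.data) is a single unpaired UTF-16 surrogate half.
function isLoneSurrogate(s) {
  if (s.length !== 1) return false;
  const cu = s.charCodeAt(0);
  return cu >= 0xD800 && cu <= 0xDFFF; // high: D800-DBFF, low: DC00-DFFF
}

isLoneSurrogate("\uD83D");       // true  - first half of U+1F600
isLoneSurrogate("\uD83D\uDE00"); // false - complete surrogate pair
isLoneSurrogate("a");            // false - ordinary BMP character
```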
> Chrome and Firefox are simply buggy, they are sending invalid incorrect strings.

No, `.charCode` is not a `String`, it's an `unsigned long`. That makes things complicated. Appending it to a `String` requires calling `String.fromCharCode` instead of `+=`. If it were a `String`, just fixing Firefox and Chrome would have been fine, with no risk.
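To illustrate the point (a sketch with an arbitrarily chosen value, not code from the thread): because `charCode` is a number, naive `+=` concatenation appends its decimal digits rather than the character.

```javascript
// charCode is an unsigned long, so naive concatenation coerces the number
// to its decimal string instead of producing the character.
const charCode = 0xD83D;   // high surrogate delivered by one keypress
let wrong = "";
wrong += charCode;         // number coerced to a decimal string
let right = "";
right += String.fromCharCode(charCode); // the actual UTF-16 code unit

// wrong === "55357" (the decimal form of 0xD83D); right === "\uD83D"
```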
@masayuki-nakano I don't have a Windows machine right now, so I can't test it, but how does Edge handle `charCode` for non-BMP characters? When I last tested Edge version 42, Edge sent only 1 event, not 2.
I understand the web compat issues, but if Safari and Edge have already fixed the issue, then the web compat issue must not be that big of a deal, or we would have heard about it.
> Chrome and Firefox are simply buggy, they are sending invalid incorrect strings.
>
> No, `.charCode` is not a `String`, it's an `unsigned long`. That makes things complicated. Appending it to a `String` requires calling `String.fromCharCode` instead of `+=`. If it were a `String`, just fixing Firefox and Chrome would have been fine, with no risk.

Can you clarify what you mean? I might be misunderstanding. Turning a numeric UTF-16 code unit into a string requires `String.fromCharCode`, regardless of whether astral symbols / surrogate pairs are at play.
I believe Masayuki's point is that `String.fromCharCode()` takes a sequence of UTF-16 code-units, not UCS4 code-units (i.e. code-points). Changing the legacy `charCode` to return Unicode code-points rather than UTF-16 code-units would mean "char code" being used inconsistently across different contexts.
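A quick sketch of that code-unit vs code-point distinction (behavior per the ECMAScript spec; the values are chosen for illustration):

```javascript
// String.fromCharCode takes UTF-16 code units and truncates each argument
// to 16 bits (ToUint16); String.fromCodePoint takes full code points.
const cp = 0x1F600; // U+1F600, outside the BMP

String.fromCodePoint(cp);            // "\uD83D\uDE00" - the surrogate pair
String.fromCharCode(cp);             // "\uF600" - 0x1F600 & 0xFFFF, garbage
String.fromCharCode(0xD83D, 0xDE00); // "\uD83D\uDE00" - explicit code units
```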
With `input` events there is only the `data` field, conveying the input text as a text string, though with the fun caveat that (at present) it seems that some platforms (e.g. Windows) supply UTF-16 surrogate code-units in independent events, which is a little painful.
To reiterate my earlier point: defining the desired behaviour for the modern input events (e.g. `input`, `keydown`) in the presence of multi-code-unit input (e.g. UTF-16 surrogate pairs) seems the first place to focus. For the legacy events the right normative specification may need to differ (e.g. continuing to have `keypress` delivered once per UTF-16 code-unit), reflecting how things have historically worked, rather than how we'd ideally expect/want them to. :)

Re the event sequence from the Windows OSK: I don't have a device handy right now with which to verify the platform-level behaviour wrt `keydown`/`keyup` (or rather `WM_KEYDOWN`/`WM_KEYUP`), but AFAIK the two `keypress` events for each non-BMP character typed are expected.
> With `input` events there is only the `data` field, conveying the input text as a text string, though with the fun caveat that (at present) it seems that some platforms (e.g. Windows) supply UTF-16 surrogate code-units in independent events, which is a little painful.
That is not a Windows platform issue, because Internet Explorer and Edge do not have that issue.
It is deviant behavior in Chrome / Firefox only (which happens to affect only Windows).
We may be talking about different things, or different versions of Windows, then?
Using the OSK on a Windows 10 device, Edge shows the same behaviour as Chrome (which is unsurprising, since both are based on Chromium), with two `keydown`/`keypress`/`keyup` sequences, one for each code-unit of the surrogate pair used to encode the emote.
@drwez You specifically mentioned the `input` event. When I tested Edge 42, it only sent 1 event, whereas Firefox / Chrome sent 2 events.
It seems Edge 42 was before the switch to Chromium. So it is not a Windows issue, because the EdgeHTML engine did not have that issue. It is specific to the Blink / Gecko engines. And that means it is fixable, it is not an OS limitation, it is specific to particular browser engines.
That also means that for several years Edge on Windows did not have this bug, but Firefox / Chrome did have this bug. Which means websites already needed to take into account the (correct) behavior of Edge and Safari, so the compat issues should be minimal.
Ah, OK. `input` is a distinct event from the `keypress` event that this issue discusses; I mentioned it in comparison to the (legacy) `keypress` event, since it suffers similarly from reflecting the underlying platform behaviour too closely, at present.
As you point out, though, the current behaviour of Chrome (and presumably Edge) for Unicode code-points that require surrogate pairs to express in UTF-16 is incorrect with respect to the `InputEvent.data` wording in the current UI Events spec. The Windows platform conveys non-BMP characters via a pair of `WM_CHAR` events, each holding one of the UTF-16 surrogate pair code-units, and Chromium is presumably just routing those directly to `input` events, whereas the old Edge engine (and others) did some additional processing to only surface complete code-points. I've filed a bug against Chromium for that (crbug.com/1450498).
It seems we are going back and forth here. Let me summarize the current situation based on my work and all the evidence currently available:
This implies Chromium-based browsers and Firefox need an intermediate layer that joins the surrogate pairs to emit a proper keyboard sequence on Windows.
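Such an intermediate layer could look roughly like this (a hypothetical sketch with invented names, not actual browser code): buffer a high surrogate until the matching low surrogate arrives, then emit a single event for the full code point.

```javascript
// Hypothetical surrogate-joining layer: platform delivers one UTF-16 code
// unit at a time (as Windows does via WM_CHAR); we emit whole characters.
function makeSurrogateJoiner(emit) {
  let pendingHigh = null;
  return function onPlatformChar(codeUnit) {
    if (codeUnit >= 0xD800 && codeUnit <= 0xDBFF) {
      pendingHigh = codeUnit;            // high half: wait for the low half
      return;
    }
    if (pendingHigh !== null && codeUnit >= 0xDC00 && codeUnit <= 0xDFFF) {
      emit(String.fromCharCode(pendingHigh, codeUnit)); // whole pair
      pendingHigh = null;
      return;
    }
    pendingHigh = null;
    emit(String.fromCharCode(codeUnit)); // ordinary BMP character
  };
}

const out = [];
const feed = makeSurrogateJoiner((s) => out.push(s));
feed(0x61);   // "a" emitted immediately
feed(0xD83D); // buffered
feed(0xDE00); // joined with the buffered half
// out is ["a", "\u{1F600}"]
```

A real implementation would also need a policy for a high surrogate that is never followed by a low one (e.g. drop it, or pass it through); this sketch silently discards it.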
@Pauan @drwez There are two perspectives here. One can blame Windows for dispatching two events or one can blame the bad browsers for not handling two events on Windows. Since it's not reasonable to expect Windows to change behavior (obviously), we can attribute the blame to the browsers.
For everyone in this issue, let's try to conclude with a solution:
I believe the spec is well-defined, in that surrogate pairs must be fully concatenated before emitting an event. Although this is not specifically stated, it's a given considering past behavior in Edge and I.E. + current behavior in Safari. WDYT?
Re:
> It seems we are going back and forth here. Let me summarize the current situation based on my work and all the evidence currently available: [snip]
This description conflates the `keypress` and `input` events, which are two different events - one is legacy, one is explicitly specified.
Re:
> I believe the spec is well-defined, in that surrogate pairs must be fully concatenated before emitting an event. Although this is not specifically stated, it's a given considering past behavior in Edge and I.E. + current behavior in Safari. WDYT?
No, that's not a given at all I'm afraid. If things were appropriately defined in the spec then this spec issue would not exist :)
The `keypress` event and `charCode` documentation describes legacy pre-spec events; those events have historically delivered UTF-16 code-units, and the `charCode` terminology elsewhere refers to UTF-16 code-units - it's not clear that it would make sense to change that now.
The `input` event is explicitly specified to return strings of characters (i.e. code points) - so it is specified differently from `keypress`. Chromium (and newer Edge builds) does not implement things that way consistently - I've filed crbug.com/1450498 for the Chromium issue under Windows. Again, though, that's an implementation bug, not a spec issue.
> No, that's not a given at all I'm afraid.
I think there is miscommunication. It is explicitly stated that the key code is given as the Unicode code point (or 0). See https://www.w3.org/TR/uievents/#determine-keypress-keyCode. When I wrote

> in that surrogate pairs must be fully concatenated before emitting an event

I meant that this specific statement wasn't stated, but it is a given, since a broken surrogate pair is not a Unicode code point.
> If things were appropriately defined in the spec then this spec issue would not exist :)

The reason this issue exists is not that the spec is ill-defined, but that there is misalignment between implementations and the spec. Again, there are two perspectives here:

I'm stating that we should accept (1) and reinforce the current wording of the spec with specifics on surrogate pair handling.
> I believe Masayuki's point is that `String.fromCharCode()` takes a sequence of UTF-16 code-units, not UCS4 code-units (i.e. code-points). Changing the legacy `charCode` to return Unicode code-points rather than UTF-16 code-units would mean "char code" being used inconsistently across different contexts.
Yes, that's my point. If `.charCode` may contain a non-BMP character's code point, web apps need to use `String.fromCodePoint` instead. However, `.charCode` exists for older web apps. Therefore, I assume that changing the meaning would break unmaintained web apps that use `.fromCharCode`.
If web apps want to access a (maybe) valid Unicode character, `.key` has been available for 6 years. Therefore, we should keep the legacy API behavior as-is to avoid breaking web apps in the wild.
In my understanding, `.key` and `.code` are intended to replace `.charCode` and `.keyCode` while keeping backward compatibility. Therefore, changing the legacy ones' behavior would just duplicate the same functionality.
And in Firefox's case, the behavior originates in the code path that handles dead keys on Windows. If you type a dead key and then `KeyQ` (assuming it's an invalid combination), then a punctuation character corresponding to the dead key and the character for `KeyQ` are sent as `WM_CHAR`s for the last `WM_KEYDOWN`. A non-BMP character key press works the same way (except for the preceding dead key down/up sequence). So the behavior exists for historical reasons. I don't know about Chrome; they might just emulate the same behavior as IE and Firefox.
@masayuki-nakano I think our browser logic is somewhat similar. It should be, since Firefox and Chrome output the same sequence on Windows. On macOS, Chrome is just broken w.r.t. Unicode `keypress`es.
> Therefore, we should keep the legacy API behavior as-is to avoid breaking web apps in the wild.

Since each browser already has different behavior, all web apps using `keypress` are already broken for some OS/browser pair. The point of this issue is not to modify the legacy API drastically to match some expected behavior. Since it's legacy, I suggest we move forward with the original solution I proposed: if we align with the OS behavior, the modifications we have to make on our end (Firefox and Chromium) will be minimal.
For `keydown`, `keyup`, and `input`, we obviously stick strictly to the spec: only Unicode code points, no broken surrogate pairs.
Alright, so after some internal discussion with @drwez, we've designed the following solution:

1. `keypress` SHOULD be UTF-16. This will allow Safari's behavior and allow Firefox and Chromium to maintain current behavior.
2. `keydown`, `keyup`, and `input` shall remain the same.

@drwez @masayuki-nakano WDYT?
Focusing on the scope of this bug (i.e. just the spec, not the bugs in the various implementations), it sounds like there are one or two AIs:

1. Revise the wording around `keypress`:
   - Implementations may follow the `keypress`-per-surrogate model, in which case `charCode` MUST hold one of the two UTF-16 surrogate code-units.
   - Implementations may dispatch a single `keypress`, in which case `charCode` MUST be set to the Unicode code-point of the generated character. We might also provide a snippet of the rudimentary logic required to cope with both behaviours.
   - Implementations may choose not to dispatch `keypress` at all (given that it is a legacy event).
2. Add explicit wording to `input.data`'s specification regarding whether implementations MUST, SHOULD, or needn't ensure to deliver non-BMP characters whole, or whether events with `input.data` containing individual surrogates are acceptable.
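The "rudimentary logic" mentioned above might be sketched like this (a hypothetical example with invented names, not proposed spec text): a `keypress` consumer that copes with both `charCode` models, one event per UTF-16 surrogate or one event carrying the whole code point.

```javascript
// Hypothetical consumer-side normalizer coping with both keypress models:
// (a) two events, each with one surrogate code unit in charCode, or
// (b) one event with the full Unicode code point in charCode.
function makeKeypressNormalizer(onCharacter) {
  let pendingHigh = null;
  return function handleKeypress(event) {
    const code = event.charCode;
    if (code >= 0xD800 && code <= 0xDBFF) {        // high surrogate half
      pendingHigh = code;
      return;
    }
    if (code >= 0xDC00 && code <= 0xDFFF && pendingHigh !== null) {
      onCharacter(String.fromCharCode(pendingHigh, code)); // model (a)
      pendingHigh = null;
      return;
    }
    pendingHigh = null;
    onCharacter(String.fromCodePoint(code));       // BMP, or model (b)
  };
}
```

Both dispatch styles then yield the same character: feeding `{charCode: 0xD83D}` followed by `{charCode: 0xDE00}` produces the same string as feeding a single `{charCode: 0x1F600}`.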
Depending on the agreement for #2, it looks like Firefox and Chromium would then need their `input` behaviour fixed.
Separately there is the question of whether `keypress` events should have non-trivial values set for the modern fields we specified for use in `keydown` and `keyup` events (notably `key` or `code`), so I've filed https://github.com/w3c/uievents/issues/349 for that to be discussed.
> It is explicitly stated that the key code is given as the unicode code point (or 0). See https://www.w3.org/TR/uievents/#determine-keypress-keyCode.
Note that that entire section is non-normative. We do not intend to normatively specify `keypress` or the deprecated `keyCode` and `keyChar` attributes, although we can certainly add implementation notes.
> I meant that this specific statement wasn't stated, but this statement is a given since a broken surrogate pair is not a unicode code point.
From unicode.org:
> Surrogates are code points from two special ranges of Unicode values, reserved for use as the leading, and trailing values of paired code units in UTF-16.
So sending a single surrogate code point is technically valid according to the current text of the spec. Requiring a full Unicode character assembled from the two surrogates would require the spec to be re-worded.
> Add explicit wording to input.data's specification regarding whether implementations MUST, or SHOULD, or needn't, ensure to deliver non-BMP characters whole, or whether events with input.data containing individual surrogates are acceptable.
The spec is actually clear on this. The `data` attribute is a DOMString, which usually permits unpaired surrogates, but the text in the spec states it should only contain Unicode characters (so maybe the attribute should instead be defined as a USVString). Based on this, Firefox is not correct to include unpaired surrogates.
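For illustration (a hypothetical helper, not part of any spec), a DOMString can be checked for USVString validity, i.e. no unpaired surrogates, like this; modern engines also expose `String.prototype.isWellFormed()` for the same check.

```javascript
// Returns true if the string is well-formed UTF-16 (a valid USVString),
// i.e. every high surrogate is immediately followed by a low surrogate
// and no low surrogate appears on its own.
function isWellFormedUTF16(s) {
  for (let i = 0; i < s.length; i++) {
    const cu = s.charCodeAt(i);
    if (cu >= 0xD800 && cu <= 0xDBFF) {             // high surrogate
      const next = s.charCodeAt(i + 1);             // NaN if out of range
      if (!(next >= 0xDC00 && next <= 0xDFFF)) return false;
      i++;                                          // skip the paired low half
    } else if (cu >= 0xDC00 && cu <= 0xDFFF) {
      return false;                                 // stray low surrogate
    }
  }
  return true;
}

isWellFormedUTF16("\uD83D\uDE00"); // true  - complete pair
isWellFormedUTF16("\uD83D");       // false - what split input events deliver
```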
From my perspective, the primary problem here is when 2 separate event sequences are sent when the user enters a single character. I think this is unexpected and undesirable. In the Firefox example, I get the sense that the main reason for sending multiple `input` (and other) events is so that it can set the `keyCode` attribute correctly for each surrogate half.
In the examples above, I think that Safari and Chrome are both doing appropriate things (except that Chrome is not setting the `key` attribute of the `keydown`/`keyup` properly).
Here are my high-level thoughts on this:

- Only one set of input events (`beforeinput`, `input`) should be sent in response to the user selecting one character.
- The `data` attribute might be better defined as a `USVString`, but we are explicit in the text, so I'm not sure if this is worthwhile.

To fix things, I believe the key changes (Firefox/Chrome) needed are:

- No unpaired surrogates in the `data` field (to match the current spec).
- Send `beforeinput` and `input` events only once for emoji (and other surrogates).

To support these fixes, we might need minor spec updates based on how Firefox/Chromium choose to approach this. For example, we could consider any of the following:
- Update `keyChar` to allow Unicode characters (instead of just code points).
- Note that the `keyChar` attribute might not handle surrogates properly (only having the first or last half, for example).
- Note that multiple `keypress` events can happen for surrogates.

Note that anything we say in the spec regarding `keypress` and `keyChar` will be informational (i.e. non-normative). I don't have strong opinions here about these different approaches.
Thanks @garykac for your input. It definitely clarifies/reinforces some of the thoughts we discussed in this issue. I've created another issue to discuss the `USVString` question for the `input` event. It also includes the composition events: https://github.com/w3c/uievents/issues/352
As you mentioned, Firefox is the only implementation that doesn't follow that issue at the moment.
Regarding `keypress`, I think we still need to stick with the AIs provided by @drwez.
(Sorry for the delayed reply; I lost notifications while I had COVID-19 earlier this month.)
> Based on this, Firefox is not correct to include unmatched surrogates.
Oh, yeah, it's just a bug.
Those handlers should refer to the `KeyboardEvent.key` value instead. I'll fix it. (Oh, I realized that we fail to set the `.key` value for the first `keypress` of a surrogate pair. I'll fix that too.)
From my experience, if it's standardized, only one behavior should be defined. E.g., UI Events has a non-normative explanation of the `keyCode` and `charCode` values of `keypress` that defines 2 models, but Firefox has now moved from the split model to the conflated model because of compatibility with the other browsers. Therefore, defining minor browsers' behavior may just confuse developers, and all browsers should behave the same on every OS, if possible.
> Note that if you enter e.g. an emote using the Windows On-Screen Keyboard then that will be expressed as two distinct keydown/keypress/keyup sequences, with the keypress part of each describing one of the two UTF-16 surrogates.
This is not the case in Web engines originating from the platform vendor. Here are screenshots from IE and EdgeHTML-based Edge running on Windows 10 2004 showing the page https://hsivonen.com/test/moz/input.html (note that the event log shows the most recent event first) with the following actions taken with focus in the input field:
IE: https://hsivonen.fi/screen/ie-ascii.png https://hsivonen.fi/screen/ie-greek.png https://hsivonen.fi/screen/ie-adlam.png https://hsivonen.fi/screen/ie-emoji.png
EdgeHTML-based Edge: https://hsivonen.fi/screen/edgehtml-ascii.png https://hsivonen.fi/screen/edgehtml-greek.png https://hsivonen.fi/screen/edgehtml-adlam.png https://hsivonen.fi/screen/edgehtml-emoji.png
Notably, in all cases the `charCode` integer is bogus for non-BMP characters.

@hsivonen My description was in relation to the events received from the platform, not the way that those events are interpreted by the user agent, which I think we'd already discussed earlier as differing. :)
The `keypress` behaviour shown for IE and EdgeHTML doesn't really make sense, since the `charCode` field has a bogus value. Since `keypress` is a legacy event, having callers expected to use the `key` field to get at the real meaning, rather than simply using the standard `input` event, seems unhelpful. That the two implementations differ in their choice of `charCode` value suggests that the behaviours were artefacts of an implementation choice, rather than a conscious decision.
> My description was in relation to the events received from the platform, not the way that those events are interpreted by the user agent
Is it known that IE and EdgeHTML use the same system API surface as Gecko and Blink? Notably, https://learn.microsoft.com/en-us/windows/win32/inputdev/wm-unichar seems to exist.
> The `keypress` behaviour shown for IE and EdgeHTML doesn't really make sense, since the `charCode` field has a bogus value
Indeed the `charCode` part doesn't make sense. However, the `key` field is consistent with Safari: https://hsivonen.fi/screen/safari-adlam.png . This is a pretty strong indication that it's Web-compatible to emit one sequence of keyboard events per Unicode Scalar Value and to represent the Unicode Scalar Value as two UTF-16 code units in the `key` field.
> That the two implementations differ in their choice of `charCode` value suggests that the behaviours were artefacts of an implementation choice, rather than a conscious decision.
Yes, but the bogus values suggest that it's not that likely for the Web to be relying on `charCode`, which means it's quite possible that it would be feasible for other engines to align to Safari's behavior, which (absent Web compat constraints to the contrary) is clearly the best behavior (no unpaired surrogates, and the `charCode` integer shows the same scalar value as the `key` string).
The Chrome Mac behavior (https://hsivonen.fi/screen/chrome-mac-adlam.png) also suggests that it should be Web-compatible to align to the Safari behavior.
> From my perspective, the primary problem here is when 2 separate event sequences are sent when the user enters a single character.
I think the primary problem with splitting non-BMP characters across events is that (as far as I know) this is the only case where the environment that JS/Wasm runs in introduces unpaired surrogates. In every other case, environment-supplied DOMStrings are actually well-formed UTF-16 and the only way for a site-supplied program to get an unpaired surrogate in a string returned by a browser API is to first offer an unpaired surrogate as input to a browser API.
Therefore, these events are the only place in the platform that breaks the mappability of DOMString to the native string type of compiled-to-Wasm languages whose native string type's value space is a sequence of Unicode Scalar Values. For practical purposes today, this means Rust, but in principle it also means Swift (which, as I understand it, isn't a common compile-to-Wasm language today).
That a multi-scalar-value emoji that is a single user-perceived character and a single press of a Windows 10 touch keyboard "key" gets spread across multiple events is not a problem from the perspective of mappability to Rust (or Swift) strings, since the `key` field of each event is well-formed UTF-16.
Mac: https://hsivonen.fi/screen/safari-adlam.png https://hsivonen.fi/screen/firefox-mac-adlam.png https://hsivonen.fi/screen/chrome-mac-adlam.png
Firefox on Windows: https://hsivonen.fi/screen/firefox-windows-greek.png https://hsivonen.fi/screen/firefox-windows-adlam.png https://hsivonen.fi/screen/firefox-windows-emoji.png https://hsivonen.fi/screen/firefox-windows-facepalm.png
Chrome on Windows: https://hsivonen.fi/screen/chrome-windows-ascii.png https://hsivonen.fi/screen/chrome-windows-greek.png https://hsivonen.fi/screen/chrome-windows-adlam.png https://hsivonen.fi/screen/chrome-windows-emoji.png https://hsivonen.fi/screen/chrome-windows-facepalm.png
Notably, Chrome on Windows treats Adlam, which is an actual keyboard layout, as an IME even though it treats the emoji touch keyboard as a keyboard!
Considering that Chrome on Windows doesn't even appear to treat non-BMP keyboard layouts as keyboard layouts (even though IE, EdgeHTML, and Firefox treat them as keyboard layouts), I have a really hard time believing that the Web Platform couldn't converge on the combination of the Safari and Windows 10 touch keyboard behaviors:

- The `key`/`data` and `charCode` fields of the events in each such sequence look like they do in Safari on Mac.

> My description was in relation to the events received from the platform, not the way that those events are interpreted by the user agent
>
> Is it known that IE and EdgeHTML use the same system API surface as Gecko and Blink? Notably, https://learn.microsoft.com/en-us/windows/win32/inputdev/wm-unichar seems to exist.
As far as I've tested, the Emoji palette in the onscreen keyboard of Win10/11 sends 2 sets of `VK_PACKET` keydown and keyup. Translating the first `WM_KEYDOWN` produces a `WM_CHAR` for the high surrogate, and the second `WM_KEYDOWN` produces a `WM_CHAR` for the low surrogate. Therefore, it seems that browsers need to wait for the next `WM_KEYDOWN` when the first one is detected and stop dispatching `keydown` and `keyup` for the first one.
One problem here is that browsers need to keep storing the last surrogate pair if the `.key` of `keyup` needs to be set to the surrogate pair. I don't know whether there is an API to get the last Unicode code point which was introduced by the preceding `WM_KEYDOWN`, but I guess there is no such API. (A similar issue occurs for the `.key` of `keyup` in a dead key sequence.)
> The Chrome Mac behavior (https://hsivonen.fi/screen/chrome-mac-adlam.png) also suggests that it should be Web-compatible to align to the Safari behavior.

One of the problems of this approach is that only editable applications can detect text input reliably. (There is no attribute in `KeyboardEvent` which lets web apps know whether it inputs text or not.) So, web apps need to guess from the modifier state if they handle only `keydown` events in non-editable elements. (Although Firefox already takes this approach on macOS and Linux for text input coming without keyboard events; see bug 1520983 and bug 1712269.)
> Notably, Chrome on Windows treats Adlam, which is an actual keyboard layout, as an IME even though it treats the emoji touch keyboard as a keyboard!

How does it work if the field is not editable, like the readonly mode of Keyboard Event Viewer?

And with a custom keyboard layout created with MSKLC, I see the usual sequence of keyboard events in Chrome for Windows. So, the Adlam keyboard layout could change its behavior based on the window class name of the focused window.
> One problem here is, browsers need to keep storing the last surrogate pair if .key of keyup needs to be set to the surrogate pair. I don't know whether there is an API to get last unicode point which was introduced by the preceding WM_KEYDOWN, but I guess there is no such API. (Similar issue occurs for .key of keyup in a dead key sequence.)
Browser implementations under Windows could certainly attempt to "collect" the first UTF-16 surrogate rather than propagating it, and then only emit an actual `keypress` if/when the second surrogate `WM_CHAR` is received - that would be conceptually similar to the dead-key handling logic.
> Is it known that IE and EdgeHTML use the same system API surface as Gecko and Blink? Notably, https://learn.microsoft.com/en-us/windows/win32/inputdev/wm-unichar seems to exist.
As per the documentation you linked, `WM_UNICHAR` is provided only as a convenience for applications to inject Unicode character input without having to decompose it into UTF-16 code-units. While the default message-handler will decompose it into a pair of `WM_CHAR` messages for applications that don't handle it explicitly, it's not a message that the system itself ever sends.
> The keypress behaviour shown for IE and EdgeHTML don't really make sense, since the charCode field has a bogus value
>
> Indeed the charCode part doesn't make sense. However, the key field is consistent with Safari: https://hsivonen.fi/screen/safari-adlam.png . This is a pretty strong indication that it's Web-compatible to emit one sequence of keyboard events per Unicode Scalar Value and to represent the Unicode Scalar Value as two UTF-16 code units in the key field.
Sadly, not really - non-BMP keyboard input is still incredibly rare, so it seems plausible that folks just haven't yet noticed that it's broken in their implementations.
> That the two implementations differ in their choice of charCode value suggests that the behaviours were artefacts of an implementation choice, rather than a conscious decision.
>
> Yes, but the bogus values suggest that it's not that likely for the Web to be relying on charCode, which means it's quite possible that it would be feasible for other engines to align to Safari's behavior, which (absent Web compat constraints to the contrary) is clearly the best behavior (no unpaired surrogates, charCode integer shows the same scalar value as the key string).
See above; non-BMP is still so rare that I suspect we're just not (yet) seeing folks impacted by the brokenness of `charCode` in some implementations.
> The Chrome Mac behavior (https://hsivonen.fi/screen/chrome-mac-adlam.png) also suggests that it should be Web-compatible to align to the Safari behavior.
Chrome Mac isn't emitting `keypress` at all in that example, so I don't think it's relevant to the question?
> From my perspective, the primary problem here is when 2 separate event sequences are sent when the user enters a single character.
>
> I think the primary problem with splitting non-BMP characters across events is that (as far as I know) this is the only case where the environment that JS/Wasm runs in introduces unpaired surrogates. In every other case, environment-supplied DOMStrings are actually well-formed UTF-16 and the only way for a site-supplied program to get an unpaired surrogate in a string returned by a browser API is to first offer an unpaired surrogate as input to a browser API.
I think Gary was referring to the fact that Firefox emits two separate `input` events for the two surrogates (as does Chrome on Windows).

Firefox and Chrome on Windows are consistent with the historical behaviour of `keypress` in this regard - the main issue is that they then continue on to emit two distinct `input` events, which goes against the spec but happens to "work", for the most part. I think we're all in agreement that the browsers should fix that. :)
Since the spec for keypress is not a specification but rather historical documentation, we're constrained, I think, to documenting the set of behaviours that content might need to contend with, which currently includes:

- Two keypress events, each holding one surrogate code-unit in charCode.
- A single keypress holding a whole Unicode code-point in charCode.
- No keypress at all.
- A single keypress holding only the first surrogate of the pair in charCode.

but clearly some of these behaviours are more reasonable/helpful than others. :)

Notably, Chrome on Windows treats Adlam, which is an actual keyboard layout, as an IME even though it treats the emoji touch keyboard as a keyboard!
That's an interesting observation! Both behaviours seem technically valid, though the Firefox behaviour seems more useful. I wonder what the difference there is.
Considering that Chrome on Windows doesn't even appear to treat non-BMP keyboard layouts as keyboard layouts (even though IE, EdgeHTML, and Firefox treat them as keyboard layouts), I have a really hard time believing that the Web Platform couldn't converge on the combination of Safari and Windows 10 touch keyboard behaviors:
The Web Platform has converged on behaviours for keydown
, input
and keyup
(though some implementations are buggy particularly with regard to input
, as we've discussed).
keypress
is a legacy event maintained for compatibility with older sites & frameworks, though - as Gary said:
Note that that entire section is non-normative. We do not intend to normatively specify keypress or the deprecated keyCode and keyChar attributes, although we can certainly add implementation notes.
So the spec can document reasonable behaviour in the hope that new implementations will adopt it, and even that existing implementations will converge where feasible without breaking compatibility too much, but the situation differs from the normative specifications. As a concrete example, if Chromium were to migrate charCode
to hold the whole Unicode code-point then that would break sites that use String.fromCharCode()
to process the field; they'd need updating to use String.fromCodePoint()
to remain compatible.
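For illustration, here is a sketch of how the two constructors differ on a non-BMP value (using U+1D306 from the original report):

```javascript
// Sketch: String.fromCharCode interprets each argument as a UTF-16 code unit
// (taken modulo 2^16), so a whole code point above 0xFFFF gets truncated.
// String.fromCodePoint accepts the full code point and emits the surrogate pair.
const cp = 0x1D306; // 𝌆 TETRAGRAM FOR CENTRE

const viaCharCode = String.fromCharCode(cp);   // truncated to U+D306, not 𝌆
const viaCodePoint = String.fromCodePoint(cp); // "\uD834\uDF06", i.e. 𝌆
```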
@drwez Sadly, not really - non-BMP keyboard input is still incredibly rare, so it seems plausible that it's not a case that folks are noticing is broken with their implementations yet.
I don't think that's true. This issue was originally found because of emojis. Emojis have become incredibly commonplace and are used extensively by everybody.
I see people using emojis all the time on websites, e.g. YouTube comment section, Facebook, Twitter. 1.5 Billion tweets use emojis.
So I think if this was a major compat issue for Safari we would have heard about it. Just think about how popular the iPhone is, and how often people use emojis.
This bug is specifically about the legacy keypress
event, and by extension the legacy charCode
attribute on that event - to encounter the keypress
brokenness previously described the user would have to be (1) typing emoji or other non-BMP characters (2) using a browser with broken charCode
for non-BMP and (3) into a web-site that is actively processing keypress.charCode
to receive characters.
The bug you link to is with "input" events being generated incorrectly and then handled unusually (it sounds like the single-surrogate DOMStrings are being treated as a complete Unicode code-point, somewhere in rust-dominator) and appears to be on Windows, not macOS. It is the case that input
as currently defined should only ever report complete Unicode characters in data
and fixing that should be safe from a backward-compatibility perspective.
(1) typing emoji or other non-BMP characters (2) using a browser with broken charCode for non-BMP and (3) into a web-site that is actively processing keypress.charCode to receive characters.
Yes, that applies to a lot of situations. It is very common for websites to use event listeners to monitor comment textboxes. For example, Twitter monitors the textbox so it can update the "maximum characters allowed".
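As an aside, a character counter of the kind described here only behaves sensibly for non-BMP input if it counts code points rather than UTF-16 code units; a minimal sketch:

```javascript
// Sketch: count Unicode code points rather than UTF-16 code units, so that
// 𝌆 or an emoji counts as one "character" instead of two.
function codePointLength(s) {
  let n = 0;
  for (const _ of s) n++; // for..of iterates strings by code point
  return n;
}
```

For example, "𝌆".length is 2, while codePointLength("𝌆") is 1.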
it sounds like the single-surrogate DOMStrings are being treated as a complete Unicode code-point, somewhere in rust-dominator
Incorrect, it is not a dominator or Rust or Wasm bug, it is 100% a browser bug. This was already well established. That bug report is what led hsivonen to file bug reports against the browsers, which then led to this spec bug.
I've been involved in this entire situation from the very beginning, I am well aware of what is going on.
and appears to be on Windows, not macOS
Yes, which is exactly what this bug is about: Chrome and Firefox on Windows are incorrectly generating 2 events when they should generate 1 event. Safari correctly generates 1 event.
So the concern is that if Chrome and Firefox fix their behavior, it could cause compat issues. But because Safari has always had the correct behavior, and using emojis on Safari is very popular, and Safari hasn't had any compat issues, that strongly suggests that it won't cause compat issues for Chrome / Firefox.
(1) typing emoji or other non-BMP characters (2) using a browser with broken charCode for non-BMP and (3) into a web-site that is actively processing keypress.charCode to receive characters.
Yes, that applies to a lot of situations. It is very common for websites to use event listeners to monitor comment textboxes. For example, Twitter monitors the textbox so it can update the "maximum characters allowed".
That's true, but that can (and often is) done using events & fields other than keypress
and charCode
.
it sounds like the single-surrogate DOMStrings are being treated as a complete Unicode code-point, somewhere in rust-dominator
Incorrect, it is not a dominator bug, it is 100% a browser bug. This was already well established. That bug report is what led hsivonen to file bug reports against the browsers, which then led to this spec bug.
Yes, I don't think there is any debate that some browsers are currently implementing input
incorrectly - resolving that for Chromium is tracked at crbug.com/1450498.
That's a separate issue from keypress
, though.
I am well aware of what is going on.
Likewise. :)
and appears to be on Windows, not macOS
Yes, which is exactly what this bug is about: Chrome and Firefox on Windows are incorrectly generating 2 events when they should generate 1 event. Safari correctly generates 1 event.
Again, this bug is specifically about the legacy keypress
event, and the charCode
field, for which the web platform spec is non-normative. Historically two keypress
events have, in the past, in various implementations, been emitted for non-BMP characters - so it can't be said that emitting only one is more (or less) correct. :)
Again, this bug is specifically about the legacy keypress event, and the charCode field, for which the web platform spec is non-normative. Historically two keypress events have, in the past, in various implementations, been emitted for non-BMP characters
Historically it hasn't always been two keypress events; it depends on the browser.
The input
and keypress
bugs are connected, they're not isolated.
charCode
does complicate things a bit, but since Safari has always produced 1 event, and IE / EdgeHTML also produce 1 event, that makes things easier.
When browsers disagree, that makes it easier for the browsers to choose the correct behavior, because there is less concern about compat issues.
That has happened many times in the past, where browsers disagreed on the behavior, and so it was easy to align all of the browsers to the correct behavior.
so it can't be said that emitting only one is more (or less) correct. :)
No, the correct behavior is obviously to have 1 event. The only reason for having 2 events is for historical compat reasons.
That's why we're discussing the probability of compat issues. If the probability is low, then perhaps the browsers can just fix the bug. That has happened before.
You claim the probability of compat issues is high, because non-BMP characters are rarely used. But as I said in my earlier post, that's not true, because emojis are non-BMP and they're commonly used.
The input and keypress issues are related, but different:

- input is defined by this spec, and its specified behaviour already matches the one-event model that will address the bug you linked - so no spec changes are required, only fixes by the browser vendors.
- keypress has a non-normative description in the spec, and is explicitly provided for "historical compat"ibility, with content that pre-dates the input specification.
Again, the fact that emoji are commonly-used doesn't necessarily mean that they are commonly-used in conjunction with web content that happens to also use the charCode
field of keypress
events to receive them (instead of input
/data
) - and even in the implementations with broken keypress.charCode
or input.data
the implementations do end up with non-BMP characters correctly appearing in standard text fields.
Given that the input
spec already does what you describe, and that it falls on vendors to apply fixes for that, do you object to having the existing keypress
behaviours better-described by the non-normative portion of the spec? If so then could you provide a specific alternative proposal?
@drwez do you object to having the existing keypress behaviours better-described by the non-normative portion of the spec? If so then could you provide a specific alternative proposal?
Currently the spec doesn't define the behavior of charCode
at all, and there are major inconsistencies with charCode
in the different browsers. The spec even explicitly says:
In practice, keyCode and charCode are inconsistent across platforms and even the same implementation on different operating systems or using different localizations. This specification does not define values for either keyCode or charCode, or behavior for charCode.
Although it's a legacy API, it's still commonly used, so it's still important for its behavior to be consistent among browsers.
Ideally the spec should be changed so that charCode
is more tightly specified (perhaps aligning with Safari), and that charCode
should never contain surrogate pairs.
If those spec changes cannot be made (for compat reasons), then we just have to accept that.
So the big question is: how likely are there to be compat issues if Chrome / Firefox align to Safari's behavior? That will decide what sort of spec changes (if any) need to be made.
@drwez do you object to having the existing keypress behaviours better-described by the non-normative portion of the spec? If so then could you provide a specific alternative proposal?
Currently the spec doesn't define the behavior of charCode at all
Are you sure you're looking at the latest draft? https://www.w3.org/TR/uievents/#dom-keyboardevent-charcode has a(n admittedly self-contradictory[1]) description of the common behaviour.
Specifically note the expectation that a charCode
is a charCode
in the DOMString
sense of the term, such that it can be passed to String.fromCharCode()
for example.
[1] Which is in part what led to this spec bug :)
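That is, a charCode in the DOMString sense is a UTF-16 code unit, which is why a non-BMP character needs two of them under this model; a sketch:

```javascript
// Sketch: charCode as a UTF-16 code unit round-trips through
// charCodeAt / String.fromCharCode for BMP characters...
const bmp = "\u00E9"; // é
const unit = bmp.charCodeAt(0); // 0x00E9
const roundTripped = String.fromCharCode(unit);

// ...while a non-BMP character requires two code units under this model:
const pair = String.fromCharCode(0xD834, 0xDF06); // 𝌆 as a surrogate pair
```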
Although it's a legacy API, it's still commonly used, so it's still important for its behavior to be consistent among browsers.
While that would be ideal, what's most important is that its behaviour is consistent with what content has previously been led/forced to accommodate. Historically the keypress
event, and the content of charCode
has not been consistent across platforms, even for the same cross-platform browser. It has been common practice for content to accommodate the various different behaviours by checking the browser, platform and/or versions in the User-Agent string.
There has been a push to use e.g. the presence or absence of fields to detect what's needed (e.g. this was the case historically with which
vs keyCode
etc) but there have also been behaviours that are harder to accommodate that way (e.g. keyCode
used Windows-style VKEYs under IE, and WebKit-based browsers on all platforms, but a different system under Firefox on Linux, IIRC).
Ideally the spec should be changed so that charCode is more tightly specified (perhaps aligning with Safari), and that charCode should never contain surrogate pairs. If those spec changes cannot be made (for compat reasons), then we just have to accept that.
Right; the spec cannot mandate any particular behaviour, in general, since this is a legacy compatibility event.
The spec could recommend a behaviour, if an implementation is free of compatibility concerns.
So the big question is: how likely are there to be compat issues if Chrome / Firefox align to Safari's behavior? That will decide what sort of spec changes (if any) need to be made.
Right; enumerating the four main implementations we have:

1. Don't emit keypress at all for non-BMP characters, only input.
   - Requires content be updated to use input anyway.
   - Breaks content expecting two keypress events w/ charCode holding the code-units, as under Chrome on Windows, and Firefox across all(?) platforms.
   - Breaks content expecting a single keypress with charCode containing a character code-point, as is possible with Safari, IIRC.
2. Emit two keypress events, with charCode set to each of the surrogates.
   - Breaks content expecting whole-code-point charCode browsers, like Safari.
   - Complicates the interaction with input (e.g. what happens if one or other keypress is cancelled?).
3. Emit a single keypress with charCode set to a whole character.
   - Breaks content expecting that charCode can be e.g. passed to String.fromCharCode().
4. Emit a single keypress with charCode set to one of the surrogates.
   - The resulting charCode is not useful, and content that uses it will not work correctly, including stuff that previously would have worked under Chrome/Windows.
   - Content would need to use the key field to get the actual character data, which would be strange, since if content is going to be updated, it should be updated to use input.

To make another specific proposal, I'd suggest the spec:

[0. Continue to try to get input fixed in the implementations!]
1. … fromCharCode()-using content to work with both options 2 and 3.
See https://github.com/w3c/webdriver/issues/1741: browsers don’t agree on keypress events for keys that map to non-BMP Unicode symbols (i.e. code points beyond U+FFFF).

You can reproduce this on https://w3c.github.io/uievents/tools/key-event-viewer.html using a custom keyboard layout. I’m using https://github.com/mathiasbynens/custom.keylayout/tree/main/qwerty which lets me press a key to type 𝌆 (U+1D306), which consists of the surrogate halves U+D834 U+DF06.

- In Safari, a single keypress event is emitted, with charCode/keyCode/which set to the full Unicode code point 0x1D306. (This is the behavior I’d expect as a user.)
- In Firefox, two keypress events are emitted, one for each surrogate half (0xD834 and 0xDF06).
- In Chrome, no keypress event is emitted.

Screenshot showing (from top to bottom) Safari, Firefox, and Chrome:
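The surrogate halves quoted above follow mechanically from the UTF-16 encoding; as a sketch:

```javascript
// Sketch: compute the UTF-16 surrogate halves of a supplementary-plane
// code point, e.g. U+1D306 -> 0xD834, 0xDF06.
function toSurrogates(codePoint) {
  const v = codePoint - 0x10000;    // 20-bit offset into the supplementary planes
  const high = 0xD800 + (v >> 10);  // top 10 bits
  const low = 0xDC00 + (v & 0x3FF); // bottom 10 bits
  return [high, low];
}
```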