w3c / uievents-key

UI Events KeyboardEvents key Values
https://w3c.github.io/uievents-key/

Supplementary characters in ES6 can use code point values #31

Closed r12a closed 7 years ago

r12a commented 7 years ago

https://w3c.github.io/uievents-key/#style-conventions 1.1. Stylistic Conventions

Unicode character encodings are shown as: \u003d.

https://w3c.github.io/uievents-key/#key-value-tables

  1. Keyboard Event key Value Tables, 2nd Note

There are special internationalization considerations for ECMAScript escaped characters. CharMod conformance [CharMod] expects the use of code points rather than surrogate pairs in escapes. ECMAScript escaped characters use surrogate pairs for characters outside the Basic Multilingual Plane (\uD84E\uDDC2 for "𣧂", a Chinese character meaning "untidy"), rather than C-style fixed-length characters (\U000239c2 for "𣧂") or delimited escapes such as Numeric Character References ("&#x239c2;"). Characters escaped in this manner:

  • are based on UTF-16 encoding, in that they use surrogate pairs for values outside the Basic Multilingual Plane
  • are expressed using surrogate pairs, which makes it difficult for a human to look up the value, and might require unnecessary overhead for machine processing — this can also cause problems with software written in the incorrect belief that Unicode is a 16-bit character set
  • are problematic for characters on supplementary planes (emoji, or Chinese characters on plane 2), some of which are expected to be input using a keyboard
  • are not suitable for Java or C, which use different escaping mechanisms (could be solved with a normalizing method)

These are good points.

Another point would be that this annotation form ties the document to a specific implementation approach that will become redundant over time. ES6 already supports a code-point-based escape format, e.g. \u{12345}.
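As a rough illustration of the two escape forms, the sketch below writes the character quoted from the spec (U+239C2) both ways and compares the results. The variable names are made up for this example, and it assumes an ES6-capable engine.

```javascript
// Both literals denote the same string: U+239C2, written two ways.
const viaSurrogates = "\uD84E\uDDC2"; // pre-ES6 form: two UTF-16 code unit escapes
const viaCodePoint = "\u{239C2}";     // ES6 form: one code point escape

// The two escapes produce identical strings.
const sameString = viaSurrogates === viaCodePoint;

// length counts UTF-16 code units, so this single character reports 2.
const codeUnits = viaCodePoint.length;

// codePointAt reads the whole surrogate pair and returns the code point.
const codePointHex = viaCodePoint.codePointAt(0).toString(16).toUpperCase();

// String iteration walks code points, not code units.
const characters = Array.from(viaCodePoint).length;

console.log(sameString, codeUnits, codePointHex, characters);
// prints: true 2 239C2 1
```

The code point form keeps the value that a reader would look up in the Unicode charts visible in the source, which is the readability concern raised in the quoted bullets.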

(Btw, shouldn't this explanatory text be in the main UI Events spec? Maybe I just missed it.)

asmusf commented 7 years ago

Become "redundant"? Or do you mean "obsolete"? If you truly mean redundant, that is, become an extraneous duplicate, then I may be missing something and would love to have you explain more.

garykac commented 7 years ago

This whole section has been reduced since we use U+ notation everywhere in the specs. This is now a note that implementation languages (like ES6, C) have different encoding schemes and the U+ value will need to be converted as appropriate.
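A minimal sketch of what such a conversion might look like, assuming the input is a string in U+ notation. All function names here (parseUPlus, toEs6Escape, toEs5Escape, toCEscape) are hypothetical helpers for illustration and are not defined by the spec.

```javascript
// Parse "U+239C2" into a number. Hypothetical helper, not part of any spec.
function parseUPlus(notation) {
  const match = /^U\+([0-9A-Fa-f]{4,6})$/.exec(notation);
  if (!match) throw new RangeError(`Not U+ notation: ${notation}`);
  const codePoint = parseInt(match[1], 16);
  if (codePoint > 0x10FFFF) throw new RangeError(`Out of Unicode range: ${notation}`);
  return codePoint;
}

// Uppercase hex, left-padded with zeros to the given width.
const hex = (value, width) => value.toString(16).toUpperCase().padStart(width, "0");

// ES6 code point escape: \u{...}
function toEs6Escape(notation) {
  return `\\u{${hex(parseUPlus(notation), 1)}}`;
}

// Pre-ES6 escape: one \uXXXX per UTF-16 code unit, so a surrogate pair
// for anything outside the Basic Multilingual Plane.
function toEs5Escape(notation) {
  const codePoint = parseUPlus(notation);
  if (codePoint <= 0xFFFF) return `\\u${hex(codePoint, 4)}`;
  const offset = codePoint - 0x10000;
  const high = 0xD800 + (offset >> 10);   // top 10 bits
  const low = 0xDC00 + (offset & 0x3FF);  // bottom 10 bits
  return `\\u${hex(high, 4)}\\u${hex(low, 4)}`;
}

// C-style escape: \uXXXX inside the BMP, fixed-length \UXXXXXXXX above it.
function toCEscape(notation) {
  const codePoint = parseUPlus(notation);
  return codePoint <= 0xFFFF ? `\\u${hex(codePoint, 4)}` : `\\U${hex(codePoint, 8)}`;
}

console.log(toEs6Escape("U+239C2")); // prints: \u{239C2}
console.log(toEs5Escape("U+239C2")); // prints: \uD84E\uDDC2
console.log(toCEscape("U+239C2"));   // prints: \U000239C2
```

The sketch does not reject surrogate code points (U+D800 to U+DFFF) or apply the extra restrictions C places on universal character names, so real tooling would need more validation.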