w3c / html

Deliverables of the HTML Working Group until October 2018
https://w3c.github.io/html/

Remove the use of printable character from accessKey attribute. #485

Closed stevenatkin closed 7 years ago

stevenatkin commented 8 years ago

5.5.2 The accessKey attribute http://www.w3.org/TR/html51/editing.html#the-accesskey-attribute

The accessKey attribute is currently defined to be a single Unicode code point that is a single printable character. We propose relaxing the restriction that limits accessKey to a single Unicode code point, because a single keystroke can produce multiple code points. For example, on the Hindi INSCRIPT keyboard layout on Windows, pressing the TRA key (on number 6) generates the following Unicode code point sequence: U+0924 (TA) + U+094D (Virama) + U+0930 (RA).
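To make the TRA example concrete, here is a minimal Python sketch (not part of the original report) that counts the code points in the sequence quoted above. The variable names are illustrative, and the only dependency is Python's standard `unicodedata` module.

```python
import unicodedata

# Sequence the report says the INSCRIPT TRA key produces: TA + VIRAMA + RA.
tra = "\u0924\u094D\u0930"

# Python strings are sequences of code points, so iterating yields each one.
code_points = [f"U+{ord(ch):04X}" for ch in tra]
print(code_points)

# Normalization does not collapse the sequence: Unicode has no precomposed
# Devanagari conjuncts, so the NFC form is still three code points.
nfc = unicodedata.normalize("NFC", tra)
print(len(tra), len(nfc))
```

So a "single code point" rule cannot be satisfied for this key, whichever normalization form the author uses.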

We suggest the following wording:

If specified, the value must consist of a string representing an available keystroke. For most languages, this will be a single printable Unicode code point.

chaals commented 8 years ago

Hi @stevenatkin,

Can you make a test case or two and say what happens in real browsers? In my existing tests with Russian characters that can be generated with a single key, and Japanese characters that can't, both are apparently accepted as an accesskey, but there is no mechanism to activate the latter.

There is some accesskey stuff in incubation - it's an important feature but unfortunately browsers have mostly still been trying to catch up to 1999. So in the short term I suspect the only stuff that will get into HTML is what already works - but I'll happily also work on the things that should be improved.

stevenatkin commented 8 years ago

Let me see if I can put some tests together. I will try the Hindi example in a few browsers to see what happens.

andjc commented 8 years ago

@stevenatkin @chaals With complex script input for languages like Hindi, multiple input strategies have developed over time. It is quite feasible to create an accesskey value that will work with one keyboard but not another. For example, in Hindi a grapheme may be output by a single key, or a consonant key may automatically add a Virama.

Additionally, certain input systems for some large syllabaries make extensive use of deadkeys and in some cases deadkey chaining, so the actual number of printable keys is quite limited.

r12a commented 8 years ago

@chaals I think a fundamental question here is: why does the spec currently require a single character? I didn't see any explanation for that restriction from a technical point of view. Sure, it may make sense from a usability point of view in most cases, but there are other aspects to usability choices in accelerator keys that are not incorporated in the spec.

chaals commented 8 years ago

> why does the spec currently require a single character?

Because unfortunately Blink, WebKit and Gecko don't assign a shortcut at all unless there is exactly one character in the attribute value.
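As a rough model of that behaviour, the rule can be sketched in Python. This is a hypothetical illustration, not browser code: the function name `assigns_shortcut` is invented here, and it assumes "one character" means one Unicode code point (an engine might count differently).

```python
def assigns_shortcut(accesskey_value: str) -> bool:
    """Hypothetical model: a shortcut is assigned only when the attribute
    value is exactly one code point (assumption, see lead-in)."""
    return len(accesskey_value) == 1


# A single Latin letter qualifies under this rule.
print(assigns_shortcut("n"))

# SHA + VIRAMA + RA is three code points, so under this rule no shortcut
# would be assigned, even where a keyboard types it with a single keypress.
print(assigns_shortcut("\u0936\u094D\u0930"))
```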

There is a draft update for accesskey in incubation that is meant to provide a more generally helpful way to add shortcuts, including requesting gestures, voice commands, and named keys or combinations ("request" because it allows the user agent to provide an alternative so as not to interfere with the user's default setup). I need to get back to work on it - but comments and issues welcome as a motivator to do that.

chaals commented 7 years ago

@stevenatkin did you get to put some tests together? I don't know any Hindi, so I am not a likely candidate :(

chaals commented 7 years ago

ping @shwetank

chaals commented 7 years ago

I'd like to close this issue, either doing nothing, or getting some evidence that we need to change to match reality.

If someone (@stevenatkin @r12a ?) can show a test for e.g. Hindi that makes an accesskey work for a grapheme that is a single keypress but not a single Unicode code point, I'll happily change the spec to match reality. I simply don't know enough about Hindi to produce a test.

I wrote a test specifying 'ñ' as an accesskey using two Unicode code points: accesskey="ñ" (n followed by U+0303 COMBINING TILDE), and comparing that to the precomposed accesskey="ñ" (U+00F1) and accesskey="n". The version using the combining character fails on all four browsers I tested: Yandex (Blink), Safari, Firefox and Edge.
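For readers unfamiliar with the two encodings of 'ñ', the following Python sketch (an illustration, not the test that was run in the browsers) shows why the two attribute values differ even though they render identically.

```python
import unicodedata

precomposed = "\u00F1"   # ñ as a single code point
decomposed = "n\u0303"   # n followed by COMBINING TILDE

# The two strings look the same but have different lengths and compare unequal.
print(len(precomposed), len(decomposed))
print(precomposed == decomposed)

# NFC normalization composes the pair back into the single code point.
print(unicodedata.normalize("NFC", decomposed) == precomposed)
```

An implementation that normalized the attribute value to NFC before counting characters would treat both forms alike; whether any browser does this is not established in this thread.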

Feel free to tweak the values - but please provide instructions for the keyboards to install for testing...

(This illustrates why "available key" is a concept that doesn't match reality, so should be "something the author thinks might be available" - and one reason why the proposal I mentioned above doesn't give more status to the attribute's value than a hint which can be ignored in favour of some other activation method).

chaals commented 7 years ago

@andjc it's more or less certain that any given character won't be available on some user's keyboard, for a variety of reasons. The most common for Latin characters is that a screenreader is already using the relevant combination - for example with VoiceOver it rules out most of the available keys. That's a known limitation of accesskey, but at least it is generally implemented in such a way that the accesskey fails to do anything rather than unexpectedly hijacking something else.

stevenatkin commented 7 years ago

Team,

Sorry for the late reply. I will contact my team in India to see if I can get an example.



steveatkin commented 7 years ago

My team in India has provided this information that may help:

Here are some links from which one can see the variety of keyboards for Hindi.

http://indiatyping.com/index.php/download/hindi-keyboard-layout

https://www.microsoft.com/resources/msdn/goglobal/keyboards/kbdinhin.html

One of these corresponds to the Inscript keyboard (similar to the MS Windows keyboard layout for Hindi in the second link). On it, SHIFT+7 and SHIFT+8 carry the two characters KSHA and SHRA, which are single keys but generate combined sequences.

The Typewriter keyboards also seem to be in use as can be seen with the Remington layout descriptions in link 1. Each single keystroke can create partially formed characters and the input method editor has to generate the appropriate sequence after the sequence of keystrokes is completed.

r12a commented 7 years ago

in case it helps

SHRA is श्र
श U+0936 DEVANAGARI LETTER SHA
् U+094D DEVANAGARI SIGN VIRAMA
र U+0930 DEVANAGARI LETTER RA

KSHA is क्ष
क U+0915 DEVANAGARI LETTER KA
् U+094D DEVANAGARI SIGN VIRAMA
ष U+0937 DEVANAGARI LETTER SSA
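The same breakdown can be reproduced programmatically. This is a small Python sketch using the standard `unicodedata` module; the `clusters` dictionary and `describe` helper are names invented for illustration.

```python
import unicodedata

# The two conjuncts discussed above, written as explicit code point sequences.
clusters = {
    "SHRA": "\u0936\u094D\u0930",
    "KSHA": "\u0915\u094D\u0937",
}


def describe(text):
    """Return (code point, Unicode character name) pairs for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


for label, text in clusters.items():
    print(label, describe(text))
```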

chaals commented 7 years ago

@r12a It does indeed - thank you. @steveatkin ditto :)

I should be able to make a test case, and test it. Hopefully I find time for that within a week.

chaals commented 7 years ago

Umm. Within a week was derailed by travel issues. A test case for the multiple-code-point case should be easy for me to produce, but testing is a bit too hard to set up when I am offline.

As for combining / partially-formed characters, am I understanding correctly that these can be rendered, and left uncompleted, so they could actually be typed or pasted into an accesskey attribute value as a single character?

chaals commented 7 years ago

I have a testcase for shra (श्र), ka (क) and virama by itself as accesskeys.

And I have found at least one browser that supports each of these on Mac - more results would be helpful, but I don't yet have an inscript keyboard setup.

chaals commented 7 years ago

I'd like to do some more testing on Chinese/Japanese characters in particular, but I think we're learning something here…

I'm thinking that the spec should say "a single printable character" as a normative requirement - because otherwise every browser except Firefox seems to ignore the accesskey altogether ;( (That strikes me as dumb behaviour, so I'll file bugs against them.)

It should then note informatively that characters which cannot be generated by a single keystroke are unlikely to work, and there is never a guarantee that anything will work since the shortcut may be overridden by e.g. a system shortcut.

And a note for implementations that there should be a means of informing the user that a shortcut is assigned, and if necessary reassigning it to something available, but reminding authors that unfortunately this isn't currently the case.

steveatkin commented 7 years ago

I agree adding an informative note would be helpful.

r12a commented 7 years ago

My understanding is that nearly all Japanese people use a Latin keyboard as the starting point for the IME, and that therefore their shortcut keys are mapped to Latin characters (which incidentally can make it easy for westerners to use Japanese UIs even if they can't read the characters).

Chinese IMEs may use pinyin, but may also use things like changjie (shape-based) keys and bopomofo keys. I'm not sure how shortcuts are handled in those cases.

steveatkin commented 7 years ago

Here are some more thoughts from some people on my team:

All the keyboard layout definitions I am aware of are careful NOT to assign any graphic characters to CTL+anygraphickey or ALT+anygraphickey. Only ALTGraphic (the right side ALT key)+anygraphickey is used for accessing the third level, and ALTGr+Shift+anygraphickey for the fourth level (some MS keyboard layouts have these, even though ISO standard layouts stop at level 3).

Level 1 - is unshifted, Level 2 - is shifted and Level 3 - is AltGr-ed in ISO definitions -- per LAYER and you switch LAYERS (Alt+LeftShift/Alt+RightShift, for two layer keyboard layouts) to get an alternative group (like our bidi English and Hebrew, or English/Greek layouts etc.).

chaals commented 7 years ago

@steveatkin on Mac, alt+graphicKey is a really common pattern :(

Which is one reason why accesskeys are weird there, given that ctrl-alt-[accesskey] is the pretty much universal activation mechanism. Sadly, it is also the critical combination for VoiceOver so most accesskeys are not available to screenreader users.

chaals commented 7 years ago

Oops, there are still some issues as per https://github.com/w3c/html/commit/565bd4afbae475081164854a55d5b62599e4b24d

chaals commented 7 years ago

Tests for using repeated character references with different case for Latin and Greek characters suggest that browsers are inconsistent about how they handle case-sensitivity across alphabets, as well as how they handle repeated accesskey values.

See also accesskey tests I made and ran earlier ...

r12a commented 7 years ago

fwiw, there is a set of tests available at

https://www.w3.org/International/tests/repo/results/accesskey

They show wide variations in the way shortcut keys are handled across browsers.

Firefox on a Mac worked ok for basic ASCII and Devanagari, but not for a Latin-1 character outside ASCII, nor for Greek. It also didn't work for any test that required the use of the shift key to access a particular character. It only supported case-insensitive matching for ASCII characters.

Chrome passed all the tests on the Mac, including tests for keys that produce more than one character. Use of the shift key to access a character failed on Windows 10 for ASCII and Greek, but worked for Devanagari.

Safari was like Chrome except that, like Firefox, it only supported case-insensitive matching for ASCII characters.

Edge was a very mixed bag, but one standout observation is that Devanagari wasn't supported at all (although Greek was partially supported).
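The difference between ASCII-only and alphabet-independent case-insensitive matching reported above can be illustrated with a short Python sketch. Both functions are hypothetical models written for this illustration; they are not taken from any browser, and the function names are assumptions.

```python
def ascii_only_match(key: str, typed: str) -> bool:
    """Model of ASCII-only matching: case is ignored only when both
    characters are ASCII; anything else must match exactly."""
    if key.isascii() and typed.isascii():
        return key.lower() == typed.lower()
    return key == typed


def unicode_match(key: str, typed: str) -> bool:
    """Model of matching that applies Unicode case folding to any alphabet."""
    return key.casefold() == typed.casefold()


# Latin A/a matches under both models.
print(ascii_only_match("A", "a"), unicode_match("A", "a"))

# Greek capital and small sigma match only under Unicode case folding.
print(ascii_only_match("\u03A3", "\u03C3"), unicode_match("\u03A3", "\u03C3"))
```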