rvaiya / keyd

A key remapping daemon for linux.
MIT License

RFC: Improving the current layout mechanism #324

Open herrsimon opened 2 years ago

herrsimon commented 2 years ago

Over the last few weeks, I've been thinking about the recently introduced layout mechanism. Even though it does not affect me personally as I don't switch layouts, it still bothers me that the current solution has significant shortcomings. I'm moving the discussion from my earlier pull request here, in the hope that others will chime in as well.

Is a native layout mechanism really needed?

First of all, I'm sceptical that a native layout mechanism (a way to set and dynamically switch layouts) is really needed. I have the impression that people using multiple layouts only do this during some transitional period (for example when learning or trying out a new layout) and eventually settle on a single layout. I live and work in a very international environment, and without exception, all people I know either use their standard local layout (with the corresponding keyboard) or “international QWERTY” (where letters are modified with accents etc. via altgr). Still, my personal bubble may of course not be representative at all in this matter.

Nevertheless, it might be the best solution to remove all dedicated layout support again and simply have the shipped layout files populate main, shift and altgr directly. For convenience, one could offer a version with dedicated modifier layers as well (de_shift, de_altgr etc., as was done before) and add a few words to the documentation on how to manually achieve layer toggling via toggle. In the end, it would not be that cumbersome, and given that most keyd users are “power users” anyway, they would certainly not be overwhelmed by it. Of course the clear() action would then be less useful, but it could be complemented by toggle_exclusive(<layer>) (if <layer> is active, deactivate it, otherwise activate it and deactivate all other layers except main) or something similar.
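To make the suggested toggle_exclusive(<layer>) semantics concrete, here is a minimal Python sketch. It is not keyd code: the set-of-names representation and the assumption that main is always active are illustrative only.

```python
def toggle_exclusive(active, layer):
    """Return the new set of active layer names after toggle_exclusive(layer).

    Assumption for this sketch: "main" is always active and is never
    deactivated, as in the description above.
    """
    if layer in active:
        # The layer is currently on: switch only this layer off.
        return set(active) - {layer}
    # The layer is currently off: switch it on and drop every other
    # layer except main.
    return {"main", layer}
```

For example, with main and a hypothetical nav layer active, toggle_exclusive(active, "de") leaves only main and de active, while calling it when de is already active removes just de.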

Crucial functionality of a native layout mechanism

In case it is decided that keyd should have native layout support, to the end user it should feel and be documented as if a secondary keycode table is introduced (with entries allowed to contain UTF8), which is consulted only when no explicit binding is present (when one is present, the standard table takes precedence, so that tab = macro(hello world) always yields “hello world”, a = z always yields “z” etc.). This secondary keycode table is a copy of the standard one, with overrides defined through the layer declarations [de:layout], [de_shift] etc. One can use fallthrough (see below) to force explicit bindings to consult the secondary table (among other things).
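The secondary keycode table analogy can be sketched in a few lines of Python. This is only an illustration of the intended precedence, not keyd's implementation; the table contents are a tiny excerpt and the function and variable names are made up for the example.

```python
# Standard table: a tiny excerpt of the default (QWERTY) bindings.
STANDARD = {"a": "a", "z": "z", ";": ";", "tab": "tab"}

# Overrides a layout would contribute, e.g. via [de:layout] (excerpt only).
DE_OVERRIDES = {";": "ö"}


def resolve(key, explicit_bindings, layout_overrides):
    """Resolve a key press under the 'secondary table' model."""
    if key in explicit_bindings:
        # An explicit user binding always wins and is interpreted
        # against the standard table, never the layout.
        return explicit_bindings[key]
    # No explicit binding: consult the secondary table, which is a copy
    # of the standard table with the layout's overrides applied.
    secondary = {**STANDARD, **layout_overrides}
    return secondary.get(key, key)
```

With the explicit binding a = z, resolve("a", {"a": "z"}, DE_OVERRIDES) yields z regardless of the layout, while the unbound ; key yields ö under the German overrides and ; without them.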

More explicitly, the following three crucial features should be present in my opinion:

A) Explicit user bindings should always take precedence

When a user defines

[main]
a = b
alt = oneshot(alt)

then the a key should always behave as the b key (with symbols according to standard QWERTY layout), so that pressing a yields b and holding shift while pressing a yields B, irrespective of any layout. Furthermore, as QWERTY has no altgr, pressing the a key while the altgr modifier is active should yield A-b, again irrespective of any active layout (which might define a special altgr-symbol for the b key).

Likewise, pressing rightalt should always activate the alt layer, irrespective of any layout (that might in particular have an altgr layer defined).

B) The shift and altgr symbols of keys should be modified transparently

Whenever a key without an explicit user binding is pressed, the symbol defined by the layout should be output, taking into account any active shift and/or altgr modifiers. For example, with an active German (de) layout and the config

include layouts/de

[global]
default_layout = de

[main]
shift = oneshot(shift)
tab = layer(my_layer)

[my_layer:S-G]
3 = f3

[altgr+shift]
3 = f3

C) There should be a way to explicitly output the layout symbol of a key

Many of keyd's actions can naturally be used to equip a layout-dependent key (such as a letter or number key) with additional functions. Examples are 3 = timeout(3, 400, f3) or ; = overload2(meta, ;, 200). Again taking an active German layout, the user typically wants the 3 in the first argument of the timeout to behave according to the layout (so that its shifted symbol is § and the symbol when both the shift and altgr modifiers are applied is £). Likewise, when tapping the ; key within 200 ms, a user might want the umlaut ö instead of the symbol ; to be output. However, as soon as the German layout is deactivated, the outputs should again follow the standard QWERTY layout.

Problem with the current layout mechanism

Unfortunately, the current layout mechanism does not offer any of the three features above and my pull request would only offer the first. For details, see the corresponding discussion.

Implementation proposal

As is already done, a layout, say de, consists of a de layer of type layout (declared by [de:layout]) and can optionally also define layers for any of the modifiers and their combinations ([de_shift], [de_altgr], [de_altgr+de_shift]). However, those optional layers are not referenced in the main layout layer, i.e. [de:layout] does not contain statements like shift = layer(de_shift). Instead, the only necessary command of this type would be rightalt = layer(altgr).

Now the three features above can be ensured by doing the following.

  1. The main layer is by default completely empty (so that a layout layer's binding of rightalt = layer(altgr) is honored even though it lies below main). Default bindings for all physical keys (alt = layer(alt), shift = layer(shift), a = a etc.) instead go to a hardcoded QWERTY layer, which is always active and lies at the very bottom of the stack.

  2. All layer sets corresponding to a layout are put on top of the QWERTY layer but below the main layer, ensuring feature A. Within one set, the standard layer is at the bottom and any shift, altgr and shift_altgr layers are stacked on top (in this order). The layers of an active layout are then only consulted according to the currently active modifiers. Taking the de layout again, if a shift modifier is active (i.e. contained in the modifier set of any active layer above main) and a lookup falls through main, then the de_shift and de layers are consulted in this order. If both an altgr and a shift modifier are active, then de_shift_altgr, de_altgr, de_shift and de are consulted in this order, and so on. This ensures feature B.

  3. A fallthrough(<key>) action is introduced, which looks for a definition of <key> further down the layer stack, to offer feature C. Then, binding ; = overload(meta, fallthrough(;), 200) would behave exactly as the user expects: Tapping the ; key inserts ö if the German layout is active but ; if no layout (hence QWERTY) is active. Also, tapping the ; key while holding shift would insert Ö or :, again depending on the active layout. If layout-independent behaviour is desired, one can continue to use ; = overload(meta, ;, 200).
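The lookup order described in the three steps above could be sketched roughly as follows. This is a Python illustration under simplifying assumptions, not keyd code: the Fallthrough marker, the lookup and consult helpers, and the table excerpts are hypothetical. Only the lookup part is modelled (no overload or timeout handling), and explicit bindings are returned as-is without re-applying modifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fallthrough:
    """Marker standing in for the proposed fallthrough(<key>) action."""
    key: str


NONE = frozenset()
SHIFT = frozenset({"shift"})
ALTGR = frozenset({"altgr"})
SHIFT_ALTGR = frozenset({"shift", "altgr"})

# Consultation order within one layer set, most specific first (step 2).
ORDER = [SHIFT_ALTGR, ALTGR, SHIFT, NONE]

# Hardcoded bottom layer (step 1); tiny excerpt of US QWERTY.
QWERTY = {NONE: {";": ";", "3": "3"}, SHIFT: {";": ":", "3": "#"}}

# Layer set of the de layout (excerpt; symbols as given in this post).
DE = {NONE: {";": "ö"}, SHIFT: {";": "Ö", "3": "§"}, SHIFT_ALTGR: {"3": "£"}}


def consult(layer_set, key, mods):
    """Consult only those layers whose modifiers are all currently active."""
    for layer_mods in ORDER:
        if layer_mods <= mods and key in layer_set.get(layer_mods, {}):
            return layer_set[layer_mods][key]
    return None


def lookup(key, mods, main, layout=None):
    binding = main.get(key)
    if binding is not None and not isinstance(binding, Fallthrough):
        return binding  # explicit binding in main wins (feature A)
    # No binding, or fallthrough(<key>): continue below main (features B, C).
    target = binding.key if isinstance(binding, Fallthrough) else key
    for layer_set in ([layout] if layout else []) + [QWERTY]:
        hit = consult(layer_set, target, mods)
        if hit is not None:
            return hit
    return target
```

With main = {";": Fallthrough(";"), "a": "b"}, tapping ; resolves to ö (or Ö with shift) when DE is passed as the active layout and to ; (or : with shift) when it is not, while a resolves to b in every case.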

Let me repeat that the end user only needs to know about fallthrough; the rest can be presented with the secondary keycode table analogy.

Flexibility of the proposal

There are several specific situations that came up while thinking about layouts. Here they are:

rvaiya commented 2 years ago

Is a native layout mechanism really needed?...

I think a lot of the problems you identify are a product of terminological differences and conceptual misapprehensions (probably my fault).

The term 'layout' in keyd is just used to describe something approximating a 'base layer' which has certain desirable properties. It doesn't necessarily need to correspond to a letter layout (though it is indeed the correct place to define such things).

For instance, it may be desirable to have one layout for programming and another for writing prose:

[programming:layout]

capslock = overload(control, esc)

[writing:layout]

capslock = backspace

[control+shift]
1 = setlayout(writing)
2 = setlayout(programming)

The user can toggle between these states without worrying about messing up the layer stack. The overlap with letter layouts is incidental (though not accidental).
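As a rough mental model of why switching layouts does not disturb the layer stack, consider this small Python sketch. It is a simplification under assumptions, not keyd's actual data structures: the LayerState class and its fields are hypothetical, and the behaviour of clear shown here (dropping layers while keeping the layout) is assumed for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LayerState:
    # The layout acts as a swappable base layer, tracked separately
    # from the ordinary layers stacked on top of it.
    layout: str
    layers: list = field(default_factory=list)

    def setlayout(self, name):
        # Swap the base layer only; active layers are left untouched.
        self.layout = name

    def clear(self):
        # Assumed behaviour: drop active layers but keep the layout.
        self.layers.clear()
```

Here, calling setlayout("writing") on a state with layout programming and an active control layer changes only the layout field.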

Nevertheless, it might be the best solution to remove all dedicated layout support again and simply have the shipped layout files populate main, shift and altgr directly.

I still think a dedicated layout option is useful. It facilitates layout aware actions (like clear) and allows users who wish to define multiple swappable base layers to do so. There is also the question of backward compatibility at this point.

In case it is decided that keyd should have native layout support, to the end user it should feel and be documented as if a secondary keycode table is introduced (with entries allowed to contain UTF8)

I'm not sure what this would entail (reading the rest of your post it seems you mean something like a 'shift table'). Keycodes are distinct from input characters, and ultimately the user has to be aware of this. Once you start wading into the realm of character input you have to contend with things like IMEs and different display server input schemes. This sort of thing is best left to higher levels of the input stack. The goal of keyd is to provide a lower level mechanism to remap individual keycodes (like QMK) which can subsequently be manipulated by higher level OS specific mechanisms.

I can see why the existence of unicode support might cause some confusion, but the mechanism by which this is achieved is documented in the man page and isn't just an implementation detail (the user is expected to consciously be aware of it).

This secondary keycode table is a copy of the standard one, with overrides

If I understand you correctly, a better term for this might be the 'shift(/altgr/symbol?) table'.

A) Explicit user bindings should always take precedence When a user defines

Agreed.

B) The shift and altgr symbols of keys should be modified transparently

This makes some sense, but ultimately it introduces too much magic. It would also technically be a breaking change, but I doubt anyone presently relies on S-<a> producing S-<compose sequence> (though it may involve some surprises in the case of rightalt/alt-gr). The main reason I have avoided it thus far is that I am instinctively averse to side-effects and it would likely require a giant lookup table for capitalization (though the ascii trick of flipping bit 5 (0x20) seems to work for at least some latin unicode points).
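To illustrate the capitalization point, here is a small Python sketch (not keyd code; the helper names are made up) that compares the bit-flip trick, XOR with 0x20, against Python's own Unicode-aware upper(). It holds for ASCII letters and for many Latin-1 letters, but not for code points whose uppercase form lives elsewhere.

```python
def flip_case_bit(ch: str) -> str:
    """Toggle the 0x20 bit of a single code point (the ASCII case trick)."""
    return chr(ord(ch) ^ 0x20)


def trick_matches_upper(ch: str) -> bool:
    """True if the bit flip agrees with Unicode uppercasing for ch."""
    return flip_case_bit(ch) == ch.upper()


# Works: ASCII and several Latin-1 letters.
works = [c for c in ["a", "\u00e4", "\u00f6", "\u00e9"] if trick_matches_upper(c)]

# Fails: U+00FF (uppercase is U+0178), U+00DF (uppercases to "SS"),
# and U+0101 (uppercase is the adjacent code point U+0100).
fails = [c for c in ["\u00ff", "\u00df", "\u0101"] if not trick_matches_upper(c)]
```

So a general solution would still need per-character case data beyond the range where the trick happens to hold.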

There is also the question of how you handle modifier layers (addressed below) which adds a bunch of complexity (what should S-a/C-S-a/C-a do?) and necessitates even more magic. More importantly what does C-a mean if a=ä? Applications deal in keysyms, not keycodes. What should keyd emit? The display server emits a keysym with an associated modifier mask, but keyd is using compose sequences to generate the input character.

C) There should be a way to explicitly output the layout symbol of a key...

I'm not convinced this warrants the additional complexity. Overloading letter keys is the only use case, and the frequency of the letter likely affects the user's decision to overload it. If the user really wants to overload two different letters on two different layouts, that can be trivially achieved by defining a mapping in each layout.

Unfortunately, the current layout mechanism does not offer any of the three features above...

This is by design. Layouts in keyd should be thought of as a special kind of layer with slightly different semantics, rather than as a tool for replacing the display server's keymap.

Ultimately I think you are conflating the existence of several distinct features (unicode support, 'layout'/modifier layers) with a concerted attempt to replace display server keymaps. Each of these features serves a purpose in isolation and may be used in combination to simulate some (most?) of the effects of high level keymaps, but fundamentally they are all keycode oriented and constrained by the fact that the notion of a character is the province of the display server (which is why symbol input completely breaks in a TTY).