[bug] input handling needs some work

Consolatis commented 6 years ago

The current way the input handling works sometimes causes keycodes (KEY_FIND) or translated escape codes ([1;6H) to be pasted as usual text inside the currently active editor. Testcase:

TERM=vt220 suplemon
press the home button

This also happens for e.g. c-pageup in tmux if enabling xterm-keys and using TERM=screen. And basically every time a terminal sends something which ncurses does not expect based on the terminfo data of the configured TERM var.

Something to play around with and some further ideas to optimize the input handling in Suplemon; ncurses utf-8 input handler testcase for python2/3 and curses.getch/get_wch: https://gist.github.com/Consolatis/f0b47166efc286bd0bcd5d02698e203a

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< Edit as I modified the gist quite a bit: The basic idea is to use a single mapping table between the internal representation of a key and a function to be called. This table is only used if the input handling code thinks the given key is not targeted for the editor (ASCII / UTF-8). That table should be built automatically on startup based on a translation map of ctrl+alt+shift+home+... -> internal representation.

For example: the key m-c-v is represented to the user (in configs, help) as ctrl+alt+v. The user assigns a command ("undo") to "alt+ctrl+v" (or "ctrl+alt+v" or even "v+ ctrl +alt"). On startup (or key-config change) the mapping table is built to only include the internal representation of that key (which would be m-c-22) pointing directly to the function responsible for the command "undo".

During input handling the first check detects if it is a special (C0) key (ord(key) < 32; yes in this case). It then translates this key into the internal representation (m-c-22), looks that one up in the mapping table and calls the function responsible for "undo" and stops processing.

If it would be another key lets say "ä" or something like "T" the first check would return false. The next check would be to detect if it is usual input like UTF-8 or unicode and if yes just deliver it to the editor and stop processing.

If we are still not done the last check will do a curses.keyname() lookup and use the result as internal representation, look that one up in our mapping table and call the configured function or do nothing.

The internal representation is designed to be cheap and still flexible enough:

everything ord(key) < 32 will just be the number itself ("ctrl+v" => 22)
everything with meta(alt) modifier will be "m-" + key ("alt+shift+t" => "m-T")
everything else for which terminfo provides a name will be used ("backspace" => "KEY_BACKSPACE") If a terminfo entry does not provide modifiers itself (e.g. KEY_BACKSPACE) "alt+backspace" will be translated as "m-KEY_BACKSPACE".

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

To further elaborate on the gist, this is along the lines of how I think the internal mapping table (self.handler in the gist) could be built:

for key_config in merged(keymap.json, user-keymap.json):
    command = key_config['command']
    handler = command_handlers.get(command, None)
    if handler is None:
        log(..)
        continue
    for key in key_config['keys']:
        key = '+'.join(sorted(k.strip() for k in key.split('+')))
        key = suplemon_keys_to_internal_keys.get(key, None)
        if key is None:
            log(..)
            continue
        actual_input_mapping_table[key] = handler

Then the input processing part uses the logic as in the gist:

int(key) < 32 (TODO: check for C1 as well)
- check if key is __ESC__
- read one char with nodelay(1) (nonblocking)
  - if there was no char
  - translate key into ESC
  - if there was a char
  - check if char begins CSI or SS3 sequence and if yes
    - consume (and ignore) the whole sequence
    - stop
  - otherwise: char is (meta) escaped
    - translate key into m-[char]
- lookup key in actual_input_mapping_table and if found call handler
- stop
getch() and <= 255
- print into editor
- stop
get_wch() and key is string
- print into editor
- stop
keyname = curses.keyname(key)
- lookup keyname in actual_input_mapping_table and if found call handler
- stop

design TODO:

for getch() verify if everything between(32, 255) is indeed either printable or part of a utf-8 sequence
for get_wch() verify if everything which is string and int(key) >= 32 is indeed not a keyname

Edit 1: Terminfo entries for tmux and xterm seem to only include alt+function keys (like alt+home or alt+ctrl+home) but nothing like alt+a or alt+ctrl+a. All the terminal emulators I just tested send ESC+[key] while pressing alt+[key] which actually fits into the above flow (as it checks for ESC anyway) but it still makes me wonder why it isn't available inside terminfo (at least not inside the descriptions for the terminals I tested).

Edit 2: Handling of the meta/alt key needs further research: uxterm differs from tmux and libvte based terminals. uxterm actually sends strings for alt+[key] using get_wch() but curses.keyname() delivers M-p, M-P, M-^P and similar. That breaks the above design which is based on the assumption that get_wch() only returns strings for actual characters. Similarly uxterm sends integers <= 255 for alt+[key] using getch(), again breaking the assumption that everything <= 255 is actually printable.

Edit 3: https://invisible-island.net/xterm/ctlseqs/ctlseqs.html#h2-Alt-and-Meta-Keys Looks like uxterm is doing what is described there: it shifts the keys by adding 128. Thus, it can be argued that everything is working as intended but in this mode an application has basically no way to figure out if the alt/meta key has been pressed with a usual character or if the input is part of an UTF-8 sequence. libvte based terminals and tmux instead seem to use the other variant by sending ESC+key. That mode can be parsed inside the ESC check above so the whole design should actually work reliable.

Edit 4: More info on two newer variants to generically include modifiers with keys: https://emacs.stackexchange.com/questions/1020/problems-with-keybindings-when-using-terminal/13957#13957

From https://invisible-island.net/xterm/ctlseqs/ctlseqs.html:

      CSI > Ps; Ps m
          Set or reset resource-values used by xterm to decide whether
          to construct escape sequences holding information about the
          modifiers pressed with a given key.  The first parameter iden-
          tifies the resource to set/reset.  The second parameter is the
          value to assign to the resource.  If the second parameter is
          omitted, the resource is reset to its initial value.
            Ps = 0  -> modifyKeyboard.
            Ps = 1  -> modifyCursorKeys.
            Ps = 2  -> modifyFunctionKeys.
            Ps = 4  -> modifyOtherKeys.
          If no parameters are given, all resources are reset to their
          initial values.

        CSI ? Pm h
            [..]
            Ps = 1 0 3 6  -> Send ESC   when Meta modifies a key.  (This
          enables the metaSendsEscape resource).
            [..]

And from https://invisible-island.net/xterm/manpage/xterm.html#VT100-Widget-Resources:modifyOtherKeys (found via https://stackoverflow.com/questions/9750588/how-to-get-ctrl-shift-or-alt-with-getch-ncurses/39967143#39967143 ):

          modifyOtherKeys (class ModifyOtherKeys)
               Like modifyCursorKeys, tells xterm to construct an escape
               sequence for other keys (such as "2") when modified by
               Control-, Alt- or Meta-modifiers.  This feature does not apply
               to function keys and well-defined keys such as ESC or the
               control keys.  The default is "0":
               0    disables this feature.
               1    enables this feature for keys except for those with well-
                    known behavior, e.g., Tab, Backarrow and some special
                    control character cases, e.g., Control-Space to make a
                    NUL.
               2    enables this feature for keys including the exceptions
                    listed.

In newer (u)xterm there are two relevant resources which can be enabled by writing to the terminal:

modifyOtherKeys
- printf "\x1b[>4;1m" to enable some keys
- printf "\x1b[>4;2m" to enable all keys
- tells the terminal to send ESC[27;modifiers;codepoint~ for some keys, e.g. ctrl+shift+return sends ESC[27;6;13~.
metaSendsEscape
- printf "\x1b[?1036h" to enable
- tells the terminal to stop shifting chars on meta and instead escape them.

There might be other resources which Suplemon could set.

Edit 5: Implemented a parser for both variants of "modifyOtherKeys" and a version check (CSI > c) for everything which claims to be xterm to enable it (and metaSendsEscape), not pushed yet though.

This is actually a great resource for xterm and xterm impersonators (despite the lisp): https://git.savannah.gnu.org/cgit/emacs.git/tree/lisp/term/xterm.el Also found this condensed list: https://gist.github.com/ttdoda/4728768

Consolatis commented 6 years ago

I updated the gist and basically rewrote it from the bottom up. Literally. Start reading from the bottom. https://gist.github.com/Consolatis/f0b47166efc286bd0bcd5d02698e203a

I am somewhat satisfied with the key handling part of that gist which should be pretty solid. Next step is to split it into the debug application itself and another file which only contains the actual key handling which could then be implemented in Suplemon if @richrd agrees. The debug application could also turn out useful if it uses the exact same input handling stack as Suplemon as users could use it to help debugging input related issues on their systems.

richrd commented 6 years ago

Thanks for the extensive research! I didn't have time to go through every single point yet, but this is relevant to this issue: https://github.com/richrd/suplemon/blob/rewrite-dev/suplemon/backends/curses/curses_input.py There'll be a few things that change in the rewrite but the main thing is that Python 2 wont be supported, so we'll only use get_wch. I'll get back to this tomorrow!

richrd / suplemon

[bug] input handling needs some work #224