zmkfirmware / zmk

ZMK Firmware Repository
https://zmk.dev/
MIT License

Feature: Unicode #232

Open yannickjmt opened 3 years ago

yannickjmt commented 3 years ago

Goal: Allow input of Unicode characters directly from the keyboard. See https://beta.docs.qmk.fm/using-qmk/software-features/feature_unicode for reference.

As this is OS-dependent, it will require a current_OS parameter that could ideally be controlled from the keyboard (&os WIN to switch to Windows, &os LNX for Linux, ...).

macOS: Add Unicode Hex Input in System Preferences > Keyboard > Input Sources. LAlt(Unicode_hex_code) for output.

Linux: Already enabled by default. Ctrl(Shift(u+Unicode_hex_code))+space for output.

Windows: Requires the WinCompose program. RAlt(u+Unicode_hex_code)+enter for output.
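The three sequences above could be summarised as a small lookup. This is only an illustrative Python sketch, not ZMK code: the function name `unicode_taps` and the informal key names are hypothetical, and the sequences are taken as described in this comment.

```python
# Hypothetical illustration: expand a hex code into the per-OS key
# sequence described above. Key names are informal, not ZMK keycodes.

def unicode_taps(os_name: str, hex_code: str) -> list[str]:
    digits = list(hex_code.upper())
    if os_name == "macos":
        # Hold Left Alt while typing the hex digits (Unicode Hex Input source).
        return ["press LALT", *digits, "release LALT"]
    if os_name == "linux":
        # Ctrl+Shift+U, then the hex digits, then Space to commit.
        return ["LCTRL+LSHIFT+U", *digits, "SPACE"]
    if os_name == "windows":
        # WinCompose: Right Alt, U, the hex digits, then Enter.
        return ["RALT", "U", *digits, "ENTER"]
    raise ValueError(f"unknown OS: {os_name}")


print(unicode_taps("linux", "00e9"))
```

Only the lead-in and the terminator differ between systems, which is why a single current_OS setting would be enough to switch between them.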

A mapping of Unicode hex codes should be available in the keymap. The second code would be the shifted key. Example:

#define EAIG &uc 00E9 00C9 // é É
#define NTIL &uc 00F1 00D1 // ñ Ñ
#define PI &uc 03C0 // π

then add them in a layer with &kp EAIG
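The intended shift behaviour of the proposed two-code mapping could be sketched as follows. This is a hypothetical Python illustration of the selection logic only; `pick_character` is not a ZMK function, and falling back to the first code when no shifted code is given is an assumption.

```python
# Hypothetical illustration of the proposed "&uc <code> [<shifted code>]" idea.

def pick_character(codes: list[str], shift_held: bool) -> str:
    # Use the second code when shift is held and a shifted code was given;
    # otherwise fall back to the first (assumed behaviour for single codes).
    chosen = codes[1] if shift_held and len(codes) > 1 else codes[0]
    return chr(int(chosen, 16))


print(pick_character(["00E9", "00C9"], shift_held=True))
```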

Nice to have: allow the OS-dependent Unicode output sequences to be overridden by config. Allow output of chains of characters for things like (͡ ° ͜ʖ ͡ °), etc.

innovaker commented 3 years ago

Thanks @yannickjmt. We hadn't had enough time to properly spec this yet, so thank you for getting it kick started. I suggest you search the Discord history for unicode for any background discussions.

Your proposed approach is intriguing. A few initial questions if I may:

This probably needs to be considered in combination with #177 which is still open for ideas. To an extent, it probably also depends on a few core issues that are in progress (#21, #86, #153, #213).

I realise QMK already has an approach to this, but I'd encourage everyone to think outside the box for ZMK's approach and consider all options first.

For completeness we also had an eye on this discussion https://github.com/kiibohd/KiiConf/issues/30.

yannickjmt commented 3 years ago

There wasn't much in-depth discussion on Discord from what I saw. Yes, I pretty much took QMK's approach and simplified it with my limited understanding...

MvEerd commented 3 years ago

FWIW: you can currently achieve this in a similar way to QMK by using macros (PR pending). You'd have to change the 'Compose' key using a define, depending on which OS you want to do this on. See this example outputting an emoji on Linux using IBus: https://github.com/MvEerd/zmk-config/commit/d58bb04218d04a353a9a7ebea80407690c4eec9e

This could perhaps be leveraged into its own behavior which takes a (string of) Unicode character(s) and creates a macro with the Unicode codepoint.
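The first step of such a behavior, turning a string into the hex codes a macro would type, might look like this. It is a minimal Python sketch for illustration, with `codepoints` as a hypothetical helper name; it says nothing about how ZMK would actually implement it.

```python
# Hypothetical helper: list the hex code of every character in a string.

def codepoints(text: str) -> list[str]:
    # Python iterates over code points, so characters outside the
    # four-digit range simply produce longer hex strings.
    return [f"{ord(ch):04X}" for ch in text]


print(codepoints("ñπ"))
```

Note that an emoji such as 😀 yields a five-digit code (1F600), so any fixed four-digit syntax would need extending to cover it.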

chewxy commented 2 years ago

Hi, as someone who uses the unicode function in qmk and will be acquiring a keyboard running on zmk, I'd like to share some thoughts.

Macros are not great.

Here's why

I recently got back into APL programming. In APL, for example, ε is enlist and ⍷ is find. The normal keyboard way is simply e e <TAB> and E E <TAB> respectively. On my current ErgoDox, for example, I simply switch to the APL layer and type ε, or press shift and get ⍷.

With Macros, you'd need to have two layers, one for the standard APL layer, and the other for the shifted APL layer. And that's fraught with errors.

My previous configuration used macros, and I designated the shift key as a momentary layer switch to the shifted APL layer. But somehow that didn't work (either because of repeated changes since, or something else). The Unicode layer, with shift support, was excellent and worked first try.

You need one more "OS": Emacs

Emacs has its own way of entering Unicode characters (the default keybinding is C-x 8 RET), regardless of OS. If you try the Alt method, you end up creating 2000+ spaces or newlines in Emacs. See also: https://github.com/qmk/qmk_firmware/pull/16949/
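For comparison with the OS sequences, the Emacs variant could be sketched as below. This is a hypothetical Python illustration; it assumes the default binding, where C-x 8 RET prompts for a character and accepts a hex code followed by RET.

```python
# Hypothetical illustration: key steps for entering a hex code in Emacs.

def emacs_taps(hex_code: str) -> list[str]:
    # C-x 8 RET opens the prompt; the digits and a final RET complete it.
    return ["C-x", "8", "RET", *hex_code.upper(), "RET"]


print(emacs_taps("03c0"))
```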

urob commented 2 years ago

Having built-in Unicode support would be fantastic, and OP's syntax suggestion for &uc unicode shifted-unicode is great!

Regarding the previous comment: it is possible to implement shifted unicode versions in a single layer using mod-morph, but doing so is quite cumbersome and not very practical when trying to implement a whole layer of, say, Greek symbols.

For example, to implement the second example from the OP, one could use the following code:

macros {
  #define OS_LEAD  &kp RALT &kp U  // OS-specific sequence to initialize Unicode input
  #define OS_TRAIL &kp RET         // OS-specific sequence to terminate Unicode input

  ntil: ntil {  // n-tilde (U+00F1)
    wait-ms = <5>;
    tap-ms = <5>;
    compatible = "zmk,behavior-macro";
    label = "UC_NTIL";
    #binding-cells = <0>;
    bindings = <OS_LEAD &kp N0 &kp N0 &kp F &kp N1 OS_TRAIL>;
  };

  ntil_cap: ntil_cap {  // capital n-tilde (U+00D1)
    wait-ms = <5>;
    tap-ms = <5>;
    compatible = "zmk,behavior-macro";
    label = "UC_NTIL_CAP";
    #binding-cells = <0>;
    bindings = <OS_LEAD &kp N0 &kp N0 &kp D &kp N1 OS_TRAIL>;
  };
};

behaviors {
  uc_ntil: uc_ntil {  // tap yields n-tilde, shift-tap yields capital n-tilde
    compatible = "zmk,behavior-mod-morph";
    label = "UC_NTIL_MORPH";  // labels must be distinct from the macros above
    #binding-cells = <0>;
    bindings = <&ntil>, <&ntil_cap>;
    mods = <(MOD_LSFT|MOD_RSFT)>;
    masked_mods = <(MOD_LSFT|MOD_RSFT)>;  // requires PR #1114
  };
};

keymap {
   ...
  bindings = < &uc_ntil >;
};

smeikx commented 2 years ago

On macOS you can also access several characters with simple Alt combinations, depending on input source. In my case (German) I can type (for example) with Alt + . or « with Alt + q. For MacBooks this might be a better alternative to the Unicode Hex input source because it does not require adding an additional input source.

For clarification: Choosing the Unicode Hex input source also changes the keyboard mapping.

German input source (this matches the built-in keyboard): [image]

Unicode Hex input source: [image]

Having multiple input sources can be a bit of an inconvenience if you have to use the built-in keyboard occasionally.

I understand that using Alt combinations requires more than just changing the base sequence, I just wanted to throw this into the discussion.


I do think ZMK should provide Unicode input for the three major OSes out of the box, and for macOS it should probably rely on the Unicode Hex input source. But I also think there should be a way to at least add custom base sequences.

urob commented 2 years ago

For anyone interested, I ended up creating a workaround based on preprocessor macros that lives entirely in userspace. It's not perfect, but might help until a native solution is available.

This is the main file: https://github.com/urob/zmk-config/blob/master/config/unicode.dtsi. The file should be sourced near the top of the keymap file (see base.keymap in my configuration for an example).

EDIT: The unicode helpers are now part of my collection of helper macros, available here: https://github.com/urob/zmk-nodefree-config. See the documentation there for details on how to set them up.

Once sourced, unicode-keys can be created in two ways:

  1. single characters can be created with ZMK_UNICODE_SINGLE. For instance, to implement a π-character (03C0) one would call
     ZMK_UNICODE_SINGLE(pi,   N0, N3, C, N0)

    This will create a &pi shortcut that can be added to the keymap layout

  2. pairs of lower/upper-case characters can be created with ZMK_UNICODE_PAIR. For instance, to implement a ñ/Ñ-key (00F1 and 00D1), one would call
    ZMK_UNICODE_PAIR(n_tilde,   N0, N0, F, N1,   N0, N0, D, N1)

    This will create a &n_tilde shortcut that can be added to the keymap layout. It yields "ñ" when pressed without shift, and "Ñ" when shifted.


Final remark: Obviously, this is somewhat hacky and the syntax is more cumbersome than a native solution would be. The constraint here is that the preprocessor can't detect types and can't loop. Hence the need to use arguments such as N0, N3, C, N0 instead of 03C0. A native solution would hopefully be able to avoid this by "looping" over a single four-character string and prefixing digits with N when creating the macro. Otherwise, I think the implementation here could serve as a blueprint for a behavior. Unfortunately, I am not good enough at C to implement the behavior myself from scratch, but I would be happy to help out as much as I can if someone takes on the core work.
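Until a native behavior exists, the "looping" step can be done outside the preprocessor. The following is a minimal Python sketch, assuming the ZMK_UNICODE_SINGLE and ZMK_UNICODE_PAIR call shapes shown above; the helper names are hypothetical, and only four-digit codes are handled.

```python
# Hypothetical generator for the helper-macro calls shown above.

def hex_keys(char: str) -> list[str]:
    code = f"{ord(char):04X}"
    if len(code) != 4:
        raise ValueError("this sketch only handles four-digit codes")
    # Digits get the N prefix (N0..N9); hex letters A-F are used as-is.
    return [f"N{d}" if d.isdigit() else d for d in code]


def unicode_single(name: str, char: str) -> str:
    return f"ZMK_UNICODE_SINGLE({name}, {', '.join(hex_keys(char))})"


def unicode_pair(name: str, lower: str, upper: str) -> str:
    keys = hex_keys(lower) + hex_keys(upper)
    return f"ZMK_UNICODE_PAIR({name}, {', '.join(keys)})"


print(unicode_single("pi", "π"))
print(unicode_pair("n_tilde", "ñ", "Ñ"))
```

The printed lines can be pasted into the keymap file, which avoids hand-translating each code into the N0, N3, C, N0 form.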

razlani commented 1 year ago

Good work herein @urob - will defs be using this for spanish characters whilst a more permanent solution presents itself.

itpropro commented 4 months ago

Any updates on this? This would really help people who need international characters from their country's layout with an en-US base layout.

zapling commented 4 months ago

I'm not using ZMK currently, but to chime in on this: I think using a compose key inside your OS of choice is a better solution in most cases. I have relied on sending Unicode sequences from the keyboard in the past to get the Swedish characters åäö under a US layout, but had varying success. The result depended on how the application handled the Unicode input; some do it well, while others would not work at all.

The solution I have now is sending the compose key sequence for the character I want, which for me has been far more successful. For example, "o would be ö.

Here is an example of how I do it in QMK, which should give you an idea of how to port this over to your layout in ZMK: https://github.com/zapling/qmk-atreus62/blob/master/zapling/keymap.c#L58
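The compose-key idea can be sketched as a small table of key steps. This is a hypothetical Python illustration: the table covers only two characters, the "COMPOSE" key name is informal, and the sequences assume a typical compose table in which Compose, ", o yields ö.

```python
# Hypothetical illustration: compose-key steps instead of hex codes.
# The table is an assumption about the host's compose configuration.

COMPOSE_SEQUENCES = {
    "ö": ['"', "o"],
    "ä": ['"', "a"],
}


def compose_taps(char: str) -> list[str]:
    # Tap the compose key first, then the two keys of the sequence.
    return ["COMPOSE", *COMPOSE_SEQUENCES[char]]


print(compose_taps("ö"))
```

Because the host resolves the sequence, the keyboard never sends a hex code, which is why this tends to behave the same across applications.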

artromone commented 3 months ago

+1

gridrek commented 3 months ago

+1