samhocevar / wincompose

🔣 Compose Key for Windows
http://wincompose.info/

Force english layout in Unicode input #261

Open zored opened 5 years ago

zored commented 5 years ago

Hi!

I programmed my keyboard with QMK to send specific compose combinations for Unicode symbols. I have two language layouts, Russian and English. Is it possible to force WinCompose to use the English layout? Otherwise my keyboard sends the meaningless г1а602 instead of 😂 (which is RAlt, U1f602).
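To make the symptom concrete, here is a minimal sketch in C of how the same key presses read under the Russian layout. The helper names are hypothetical, the source file is assumed to be UTF-8, and only the lowercase letter pairs confirmed later in this thread (a/ф, b/и, c/с, d/в, e/у, f/а, u/г) are included; it is an illustration, not WinCompose or QMK code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the character a Russian layout produces for the key
 * that types the given lowercase letter on a US layout. Only the pairs
 * confirmed in this thread are listed; everything else returns NULL. */
static const char *russian_for_us_key(char c)
{
    switch (c) {
    case 'a': return "ф";
    case 'b': return "и";
    case 'c': return "с";
    case 'd': return "в";
    case 'e': return "у";
    case 'f': return "а";
    case 'u': return "г";
    default:  return NULL; /* digits and unlisted keys pass through unchanged */
    }
}

/* Writes into out what the receiving application would see when the keyboard
 * sends seq while the Russian layout is active.
 * Returns 0 on success, -1 if out is too small. */
int as_seen_on_russian_layout(const char *seq, char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0)
        return -1;
    for (; *seq != '\0'; ++seq) {
        const char *mapped = russian_for_us_key(*seq);
        char single[2] = { *seq, '\0' };
        const char *piece = mapped ? mapped : single;
        size_t len = strlen(piece);

        if (used + len + 1 > out_size)
            return -1;
        memcpy(out + used, piece, len);
        used += len;
    }
    out[used] = '\0';
    return 0;
}
```

Passing "u1f602" yields "г1а602", which is the string reported above: the digits survive and only the letters change.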

I tried to combine it with AutoHotkey, but then I can't tell when it is time to switch the language layout back.

Thank you for the useful tool! 👍

zored commented 5 years ago

Oh, I've just realised that I can map the Russian symbols in "User-defined sequences". A little duplication, but it works 😄

cubimon commented 5 years ago

I could think of a QMK-based solution. I guess you have a Russian and an English layer on your keyboard, and you also have English and Russian language settings in Windows. You could try to find the Russian characters that result in the same Unicode hex code point 😕? Then your keyboard could send them depending on your layer.

samhocevar commented 5 years ago

I can add automatic rules for Russian keyboards so it works out of the box. Can you confirm that typing aAbBcCdDeEfFuU on a Russian keyboard yields фФиИсСвВуУаАгГ?

zored commented 5 years ago

@samhocevar thank you! Yes, I got the same: фФиИсСвВуУаАгГ. 😊

@cubimon sounds interesting! Currently I use the OS language sources. But if I create a separate layer with Russian letters (based on WinCompose input 😊), it may turn into an even more agile solution. It's just a question of time for the initial configuration.

samhocevar commented 5 years ago

@zored thanks! Now I realise that on a Belarusian keyboard the sequence will be фФіІсСвВуУаАгГ and on a French keyboard it will be qQbBcCdDeEfFuU, so it seems pretty risky to rely on letters. I think I will propose a patch for QMK that only uses digits 0-9 instead.

zored commented 5 years ago

@samhocevar sounds great! I will try to hack it too. Here is a starting point. What is the WinCompose combination for digit 0-9 input?

samhocevar commented 5 years ago

@zored the problem is that QMK’s send_unicode_hex_string will have to be replaced and QMK will have to use decimal numbers instead. I will also have to modify WinCompose heavily. I’ll keep you informed.

zored commented 5 years ago

@samhocevar thanks! Is there a combo in WinCompose for integer Unicode input now? I will hack it into my implementation :)

samhocevar commented 5 years ago

@zored not yet; if you are in a hurry you can modify function GetGenericSequenceResult in Settings.cs and replace 16 with 10 to force it to use decimal!
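As a rough illustration of the digits-only idea, the sketch below formats a code point in decimal on the sending side and parses digits in a chosen radix on the receiving side. Both functions are hypothetical and are not part of QMK or WinCompose; the radix parameter only mirrors the "replace 16 with 10" suggestion above.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical sender side: format a code point using decimal digits only.
 * Returns 0 on success, -1 if out is too small. */
int format_decimal(uint32_t code_point, char *out, size_t out_size)
{
    int n = snprintf(out, out_size, "%lu", (unsigned long)code_point);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}

/* Hypothetical receiver side: parse the typed digits in the given radix
 * (16 for the current hex sequences, 10 for the decimal variant).
 * Returns 0 and stores the code point on success, -1 on malformed input. */
int parse_code_point(const char *digits, int radix, uint32_t *code_point)
{
    char *end = NULL;
    unsigned long value;

    if (digits[0] == '\0')
        return -1;
    value = strtoul(digits, &end, radix);
    if (*end != '\0' || value > 0x10FFFFUL)
        return -1;
    *code_point = (uint32_t)value;
    return 0;
}
```

For example, U+1F602 is written as 128514 in decimal, so the sequence contains none of the letter keys a to f that differ between the layouts discussed here.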

cubimon commented 5 years ago

I don't understand the issue/why decimal is required. In case decimal is required, wouldn't it be possible to use the hex keyboard input and convert it to decimal? So what function breaks the symbols?

samhocevar commented 5 years ago

@cubimon it’s just that QMK has no control over the keyboard layout the user is currently using, so it may send u1ab3 but WinCompose could receive г1фи3 instead. One idea to work around this was to only use digits, but the other, more robust one is to put WinCompose in a special mode where it only cares about the VirtualKey value, not the actual character.

cubimon commented 5 years ago

Doesn't QMK send ASCII characters 'a' to 'f' or '0' to '9'? And these should be changed to an equivalent Unicode glyph? I guess this is more of a Latin-1 or ISO 8859-1 issue (or whatever Windows is using). Probably we have to convert the Unicode hex to the other encoding?

samhocevar commented 5 years ago

(Edit: this comment was confusing scan codes and virtual keys, I rewrote it properly)

@cubimon QMK doesn’t know about ASCII, just HID keycodes. The Windows keyboard driver then transforms these keycodes into scancodes, and finally the Windows keyboard layout transforms these scancodes into virtual keys, which are mapped to actual characters.

So QMK cannot send ASCII character D, it can only send HID keycode KEY_D, which will be translated to scan code SC 0x20. Then this will be mapped to:

The problem: when WinCompose receives scan code 0x20, it may be QMK sending the letter D. But it could also be a Dvorak keyboard user pressing the letter E. There is no way to know for sure, which is why I would like a different method for QMK input that does not rely on the current keyboard layout.

cubimon commented 5 years ago

Then in Unicode input mode, the language-dependent character representation of a virtual key/HID keycode shouldn't be considered? The key sequence (virtual keys) should be converted to hex and back to a new series of virtual keys that represent the Unicode character? Or how is the Unicode character printed? I guess Composer.SendString is used for that/Ctrl+Shift+u?

samhocevar commented 5 years ago

(Edit: this comment was confusing scan codes and virtual keys, I rewrote it properly)

@cubimon it’s not a problem with what WinCompose prints, it’s a problem with what it receives from QMK or from the user.

On a Dvorak keyboard, if WinCompose receives scancode 0x20 from the user, it needs to interpret it as character E. But if QMK sends scancode 0x20 as part of a hex sequence, it needs to interpret it as character D! WinCompose has no way to know where the keypresses come from, they both come from the keyboard, so the current UC_WINC mode in QMK is broken on non-US layouts.

cubimon commented 5 years ago

Now you're starting to confuse me more 😕. I guess you mean VK_D and D instead of VK_A and A, like in your comments above. On a Dvorak keyboard the key at Dvorak's layout position E should also send VK_E; if it sent VK_A it wouldn't be a Dvorak keyboard, but rather a QWERTY one? I guess WinCompose doesn't see the real keycodes from the keyboard, but what Microsoft maps through some driver? That would be my only explanation, but it is even more confusing.

samhocevar commented 5 years ago

@cubimon my bad, I confused virtual keys and scancodes above. Let me rephrase it once more!

The scancodes do not depend on the keyboard layout and are based on the standard QWERTY keyboard (see this Stack Overflow answer for more details). So when the USB keyboard sends KEY_D, Windows will always receive scancode 0x20. It will then be mapped to VK_D or VK_E (or something else) depending on the currently active layout.

This means that QMK can never reliably send the letter D to an application. It can just emit the key D, which may be mapped to just about anything. And when WinCompose receives this key, it can’t differentiate between QMK sending KEY_D and a Dvorak user pressing their E key, because they’re physically the same.
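The ambiguity can be sketched in a few lines of C. This is a toy model with a single key and two layouts; the enum and function names are made up for the illustration and do not correspond to any Windows or WinCompose API.

```c
/* Toy model: two layouts and one physical key. */
enum layout { LAYOUT_QWERTY, LAYOUT_DVORAK };

/* Scan code of the physical key labelled D on a QWERTY keyboard. */
#define SC_QWERTY_D 0x20u

/* Hypothetical lookup: the character the active layout assigns to a scan
 * code. Only scan code 0x20 is modelled; other keys return '\0'. */
char char_for_scancode(unsigned scancode, enum layout active)
{
    if (scancode == SC_QWERTY_D)
        return active == LAYOUT_DVORAK ? 'e' : 'd';
    return '\0';
}
```

The function takes no argument saying whether the key press came from a firmware macro or from a person typing, which is the point: with Dvorak active, scan code 0x20 reads as e in both cases.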

Does this make more sense?

cubimon commented 5 years ago

Yes, thank you for the explanation, I think I understand the problem now. So numbers are more stable across keyboard layouts, and the easier solution may be to make QMK send Unicode code points in decimal. Alternatively, a lower-level solution such as a keyboard device driver could work, but I guess Windows drivers are a pain to write or share with others because of licensing/signing.

zored commented 5 years ago

Thank you for the deep explanation 🤔

I see this as a general issue for international users: you have to switch layouts to type any English composition with WinCompose. It is easy to test: add another language layout in Windows and give it a try 😺

As a programmer, I think that creating mappings for symbols is kind of problematic. So I suggested forcing the English keyboard layout on Right Alt (or any other control key), but I do not know if it is possible 🤔

cubimon commented 5 years ago

Here is a solution based on your suggestion @samhocevar.

"put WinCompose in a special mode where it only cares about the VirtualKey value, not the actual character"

This works without using decimals. I added a constructor that saves the virtual key in addition to the string representation of a key. In the Unicode method (Composer.AddToSequence) I use the virtual key to recalculate the string representation. Maybe ToLower() should be applied to the recalculated ASCII key?

diff --git a/src/composer/Composer.cs b/src/composer/Composer.cs
index 7d94da3..5f6270a 100644
--- a/src/composer/Composer.cs
+++ b/src/composer/Composer.cs
@@ -429,6 +429,14 @@ static class Composer
     /// </summary>
     private static bool AddToSequence(Key key)
     {
+        int vk = (int) key.VirtualKey;
+        if ((vk >= 48 && vk <= 57) ||
+                (vk >= 65 && vk <= 90))
+        {
+            string newSymbol = "";
+            newSymbol += (char)vk;
+            key = new Key(key.VirtualKey, newSymbol);
+        }
         KeySequence old_sequence = new KeySequence(m_sequence);
         m_sequence.Add(key);

diff --git a/src/composer/KeyboardLayout.cs b/src/composer/KeyboardLayout.cs
index 0017d80..9c7c024 100644
--- a/src/composer/KeyboardLayout.cs
+++ b/src/composer/KeyboardLayout.cs
@@ -204,14 +204,14 @@ public static class KeyboardLayout
         string str_if_dead = VkToUnicode(VK.SPACE);

         if (str_if_dead != " ")
-            return new Key(str_if_dead);
+            return new Key(vk, str_if_dead);

         // Special case: we don't consider characters such as Esc as printable
         // otherwise they are not properly serialised in the config file.
         if (str_if_normal == "" || str_if_normal[0] < ' ')
             return new Key(vk);

-        return new Key(str_if_normal);
+        return new Key(vk, str_if_normal);
     }

     private static string VkToUnicode(VK vk)
diff --git a/src/sequences/Key.cs b/src/sequences/Key.cs
index 6f425a3..2fc1acb 100644
--- a/src/sequences/Key.cs
+++ b/src/sequences/Key.cs
@@ -177,6 +177,8 @@ public partial class Key

     public Key(VK vk) { m_vk = vk; }

+    public Key(VK vk, string str) { m_vk = vk; m_str = str; }
+
     public VK VirtualKey => m_vk;
     public bool IsPrintable => m_str != null;

I tested with the Windows Russian Buryat language layout and it worked fine.
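The core of the patch, recovering an ASCII character from the virtual key, can be sketched in C as follows. It relies on the fact that the Windows virtual-key codes for 0-9 (0x30-0x39) and A-Z (0x41-0x5A) equal the ASCII codes of those characters. The function name is hypothetical, and the lower-casing step is the optional ToLower() idea raised above rather than something the diff does.

```c
#include <ctype.h>

/* Hypothetical helper: map a Windows virtual-key code to the ASCII
 * character it names, ignoring the active keyboard layout.
 * Returns '\0' for keys outside 0-9 and A-Z. */
char ascii_from_vk(int vk)
{
    if (vk >= 0x30 && vk <= 0x39)      /* VK codes for digits 0-9 */
        return (char)vk;
    if (vk >= 0x41 && vk <= 0x5A)      /* VK codes for letters A-Z */
        return (char)tolower(vk);      /* optional lower-casing step */
    return '\0';
}
```

With this mapping, the virtual key 0x44 becomes d regardless of which character the active layout would have produced for it.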

gagarski commented 3 years ago

(first of all, a friendly reminder that the issue still exists)

But actually, I am OK with @zored's workaround (I need only 6 symbols), but I gave up implementing it. I am trying to do the following in the .XCompose file:

# Inspired by this sequence, which is working fine
# <Multi_key> <Б> <Б> : "«"

<Multi_key> <г> <2> <0> <1> <4> <???> : "—"

The thing is that WinCompose expects the Compose-u-2-0-1-4-Enter sequence to enter an em-dash, and I don't know what to put in the events list in the .XCompose file instead of ??? so that it handles the Enter key (I tried "Enter", "\n", a literal line break, Linefeed, Control_j; none of them works).

There is actually another workaround on the QMK side: send something like Compose-dash-dash-dash using the SEND_STRING macro, but this is kind of messy and would involve some logic for OS switching (besides QMK's native logic), and I don't like it (though it would work with the original X11 XCompose without changes).

gagarski commented 3 years ago

So I ended up working around this without QMK's Unicode capabilities:

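/* Each helper taps X_APP (assumed here to be the key WinCompose uses as Compose) and then types the sequence. */
void tap_dance_mdash(uint16_t tap) {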
    SEND_STRING(SS_TAP(X_APP)"---");
}

bool ndash(uint16_t keycode, const keyrecord_t* record) {
    if (record->event.pressed) {
        SEND_STRING(SS_TAP(X_APP)"--.");
    }
    return false;
}

bool laquo(uint16_t keycode, const keyrecord_t* record) {
    if (record->event.pressed) {
        SEND_STRING(SS_TAP(X_APP)"<<");
    }
    return false;
}

bool raquo(uint16_t keycode, const keyrecord_t* record) {
    if (record->event.pressed) {
        SEND_STRING(SS_TAP(X_APP)">>");
    }
    return false;
}

bool ldquo(uint16_t keycode, const keyrecord_t* record) {
    if (record->event.pressed) {
        SEND_STRING(SS_TAP(X_APP)",\"");
    }
    return false;
}

bool rdquo(uint16_t keycode, const keyrecord_t* record) {
    if (record->event.pressed) {
        SEND_STRING(SS_TAP(X_APP)"<\"");
    }
    return false;
}

And on the WinCompose side:

<Multi_key> <Б> <Б> : "«"
<Multi_key> <Ю> <Ю> : "»"
<Multi_key> <б> <Э> : "„"
<Multi_key> <Э> <б> : "„"
<Multi_key> <Б> <Э> : "“"
<Multi_key> <Э> <Б> : "“"
<Multi_key> <minus> <minus> <ю> : "–"

The good thing is that now I can use it on Linux without switching modes. The bad thing is that adding more symbols would require more boilerplate code.

frost555 commented 1 year ago

I stumbled on the same problem with the extra Enter key that @gagarski mentioned above. It looks like there is a way to remove that behaviour in QMK by defining an empty override for the following function

void unicode_input_finish(void) {
}

in keymap.c

see https://github.com/qmk/qmk_firmware/blob/master/docs/feature_unicode.md#start-and-finish-input-functions for details.

bogorad commented 6 months ago

Here's how I solved this issue. https://bogorad.medium.com/qmk-and-cyrillic-2323f1a61fa0