eggrobin / Enmerkar

𒂗𒈨𒅕𒃸: a Sumero-Akkadian cuneiform input method for macOS and Windows.

Annoying interaction with Word on Windows #3

Open eggrobin opened 1 year ago

eggrobin commented 1 year ago

On Windows, any text entered with 𒂗𒈨𒅕𒃸 in Word ends up in an RTL span (this is particularly noticeable if only weakly directional characters are entered, especially ones that are Bidi_Mirrored).

More annoyingly, if the font Estrangelo Edessa is installed (which I think happens if Basic Typing is installed for Syriac), the font is switched to that when cuneiform characters are entered, yielding tofu (as it is not a cuneiform font).

A workaround for the latter is to uninstall Estrangelo Edessa; as it is a system font, this means removing the following registry value:

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Fonts]
"Estrangelo Edessa (TrueType)"="estre.ttf"

These issues are caused by pretending that the IME is for syr-SY. Pretending that it is for any other language that has a LANGID has the same problem: emitted text will switch to a font that supports the corresponding script. In practice this is far worse for other scripts than it is for Syriac: Syriac is supported by Segoe UI Historic, so uninstalling Estrangelo Edessa does not cause tofu to appear, whereas uninstalling all fonts that support, e.g., the Latin script is obviously impractical.

Windows uses transient LCIDs for input methods for languages that do not have a LANGID, such as Etruscan; each transient LCID is then mapped to a language tag in the registry. I have experimented in https://github.com/eggrobin/Enmerkar/commit/5dada8484564da8b7dce67ea5ed510182260be56 with using LCID 0x3000 and mapping it to akk-Xsux using the following registry keys:

Windows Registry Editor Version 5.00

[HKEY_CURRENT_USER\Control Panel\International\User Profile\akk-Xsux]
"3000:{F87CB858-5A61-42FF-98E4-CF3966457808}{5E81C0AA-9CC6-453C-B67D-FA70246C7EFC}"=dword:00000001
"TransientLangId"=dword:00003000
"CachedLanguageName"="Akkadian (akk-Xsux)"

[HKEY_USERS\.DEFAULT\Control Panel\International\User Profile\akk-Xsux]
"3000:{F87CB858-5A61-42FF-98E4-CF3966457808}{5E81C0AA-9CC6-453C-B67D-FA70246C7EFC}"=dword:00000001
"TransientLangId"=dword:00003000
"CachedLanguageName"="Akkadian (akk-Xsux)"

Akkadian appears as expected in the language bar, and 𒂗𒈨𒅕𒃸 appears as an Akkadian IME. Unfortunately, this causes a phantom en-US layout to appear in the input method selector; switching to 𒂗𒈨𒅕𒃸 switches to that layout instead. Although switching to 𒂗𒈨𒅕𒃸 a second time works (but only as long as one stays in the same application), this ends up being far more annoying than the problem I am trying to solve.
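
For reference, the long value name under the akk-Xsux key above appears to follow the usual Text Services Framework input-profile form, LANGID:{text service CLSID}{profile GUID}. The Python sketch below splits such a name into its parts, which shows that only the leading 3000 depends on the transient LCID chosen. The helper name parse_tip and the InputTip type are hypothetical, and the reading of the first GUID as the CLSID and the second as the profile GUID is an assumption about the format, not something stated in this issue.

```python
import re
from typing import NamedTuple


class InputTip(NamedTuple):
    langid: int   # e.g. 0x3000 for the transient LCID used above
    clsid: str    # assumed: CLSID of the text service
    profile: str  # assumed: GUID of the language profile


# Four hex digits, a colon, then two braced GUIDs (36 characters each).
_TIP = re.compile(
    r"^([0-9A-Fa-f]{4}):(\{[0-9A-Fa-f-]{36}\})(\{[0-9A-Fa-f-]{36}\})$"
)


def parse_tip(name: str) -> InputTip:
    """Split a 'LANGID:{CLSID}{GUID}' value name into its three parts."""
    m = _TIP.match(name)
    if m is None:
        raise ValueError(f"not an input-profile value name: {name!r}")
    return InputTip(int(m.group(1), 16), m.group(2), m.group(3))


tip = parse_tip(
    "3000:{F87CB858-5A61-42FF-98E4-CF3966457808}"
    "{5E81C0AA-9CC6-453C-B67D-FA70246C7EFC}"
)
print(hex(tip.langid), tip.clsid, tip.profile)
```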

eggrobin commented 10 months ago

The following helps:

Windows Registry Editor Version 5.00

[HKEY_CURRENT_USER\Keyboard Layout\Preload]
"5"="00003000"

[HKEY_CURRENT_USER\Keyboard Layout\Substitutes]
"00003000"="00000409"

[HKEY_CURRENT_USER\Software\Microsoft\CTF\HiddenDummyLayouts]
"00003000"="00000409"

(The value name 5 would need to change depending on the number of pre-existing Preload values; the value 3000 would need to change depending on the set of pre-existing transient LCIDs anyway.)
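
Since both numbers depend on the machine, the fragment could be generated rather than written by hand. The following Python sketch is only an illustration under stated assumptions: next_preload_index and render_reg are hypothetical helpers, the index is taken as one past the highest existing Preload value name, and the caller is assumed to already know which transient LCID is free. It only builds the text of a .reg file; it does not read or write the registry.

```python
from typing import Iterable


def next_preload_index(existing_names: Iterable[str]) -> int:
    """One past the highest numeric Preload value name (assumed policy)."""
    used = [int(n) for n in existing_names if n.isdigit()]
    return max(used, default=0) + 1


def render_reg(preload_index: int, transient_lcid: int,
               substitute_klid: int = 0x0409) -> str:
    """Build the .reg text shown above for a given index and LCID."""
    lcid = f"{transient_lcid:08x}"   # 0x3000 -> "00003000"
    klid = f"{substitute_klid:08x}"  # 0x0409 -> "00000409" (en-US)
    return "\n".join([
        "Windows Registry Editor Version 5.00",
        "",
        r"[HKEY_CURRENT_USER\Keyboard Layout\Preload]",
        f'"{preload_index}"="{lcid}"',
        "",
        r"[HKEY_CURRENT_USER\Keyboard Layout\Substitutes]",
        f'"{lcid}"="{klid}"',
        "",
        r"[HKEY_CURRENT_USER\Software\Microsoft\CTF\HiddenDummyLayouts]",
        f'"{lcid}"="{klid}"',
        "",
    ])


# Example: four layouts already preloaded, transient LCID 0x3000 chosen.
fragment = render_reg(next_preload_index(["1", "2", "3", "4"]), 0x3000)
print(fragment)
```

With four existing Preload entries and LCID 0x3000, this produces the same values as the fragment above.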