mike-fabian / ibus-typing-booster

ibus-typing-booster is a completion input method for faster typing
https://mike-fabian.github.io/ibus-typing-booster/

[ENHANCEMENT] ibus-typing-booster should support non-ASCII input for m17n-db input methods #537

Closed by mike-fabian 2 weeks ago

mike-fabian commented 2 months ago

Recently, m17n-db-1.8.8 has been released:

https://lists.nongnu.org/archive/html/m17n-list/2024-09/msg00003.html

This now contains one input method which can use non-ASCII characters as input, hu-rovas-post.mim, see this commit:

https://git.savannah.nongnu.org/cgit/m17n/m17n-db.git/commit/?id=9fca2ed9b4428eb00381f2266b0c1260aab189b5

for example this line:

((udiaeresis) ?𐳭)   ; U+10CED OLD HUNGARIAN SMALL LETTER RUDIMENTA UE

So one should now be able to type a ü to get 𐳭 (U+10CED OLD HUNGARIAN SMALL LETTER RUDIMENTA UE).

This works when using hu-rovas-post.mim with ibus-m17n, but it does not work with ibus-typing-booster, where one still has to type u" to get 𐳭 (U+10CED OLD HUNGARIAN SMALL LETTER RUDIMENTA UE).

ibus-typing-booster should be improved to support such non-ASCII input for m17n-db input methods as well.

mike-fabian commented 2 weeks ago

I have something which seems to mostly work:

diff --git a/engine/m17n_translit.py b/engine/m17n_translit.py
index 44478870..6f58c15f 100644
--- a/engine/m17n_translit.py
+++ b/engine/m17n_translit.py
@@ -27,6 +27,9 @@ from typing import Iterable
 from typing import Any
 import sys
 import ctypes
+from gi import require_version # type: ignore
+require_version('IBus', '1.0')
+from gi.repository import IBus # type: ignore

 # pylint: disable=invalid-name
 # pylint: disable=too-few-public-methods
@@ -1064,7 +1067,14 @@ class Transliterator:
         libm17n__minput_reset_ic(self._ic) # type: ignore
         committed = ''
         preedit = ''
-        for symbol in msymbol_list:
+        for index, symbol in enumerate(msymbol_list):
+            if len(symbol) == 1 and not symbol.isascii():
+                symbol = IBus.keyval_name(IBus.unicode_to_keyval(symbol))
+            elif (len(symbol) == 3 and symbol[1] == '-'
+                and symbol[0] in ('G', 'C', 'A')
+                and not symbol[2].isascii()):
+                symbol = symbol[:2] + IBus.keyval_name(
+                    IBus.unicode_to_keyval(symbol[2]))
             _symbol = libm17n__msymbol(symbol.encode('utf-8')) # type: ignore
             retval = libm17n__minput_filter( # type: ignore
                 self._ic, _symbol, ctypes.c_void_p(None))
@@ -1075,7 +1085,7 @@ class Transliterator:
                 if libm17n__mtext_len(_mt) > 0: # type: ignore
                     committed += mtext_to_string(_mt)
                 if retval:
-                    committed += symbol
+                    committed += msymbol_list[index]
         try:
             if (self._ic.contents.preedit_changed
                 and
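
The symbol rewriting in the patch above can be tried out without IBus installed. The following is only a minimal sketch of that branching: KEYSYM_NAMES and keysym_name() are hypothetical stand-ins for IBus.keyval_name(IBus.unicode_to_keyval(...)) and cover just the three characters used in this issue, and characters missing from the table are returned unchanged, which is an assumption of the sketch and not necessarily what IBus does.

```python
# Hypothetical stand-in for IBus.keyval_name(IBus.unicode_to_keyval(char)).
# Only three Latin-1 keysym names are listed; the real lookup covers far more.
KEYSYM_NAMES = {
    'ä': 'adiaeresis',
    'ü': 'udiaeresis',
    'Ü': 'Udiaeresis',
}

# Modifier prefixes checked by the patch ('G-', 'C-', 'A-').
MODIFIER_PREFIXES = ('G', 'C', 'A')


def keysym_name(char: str) -> str:
    """Return a keysym name for char, or char itself if none is known.

    Returning the character unchanged is an assumption of this sketch.
    """
    return KEYSYM_NAMES.get(char, char)


def normalize_symbol(symbol: str) -> str:
    """Rewrite non-ASCII input symbols to keysym names, as the patch does."""
    # A single non-ASCII character such as 'ü'.
    if len(symbol) == 1 and not symbol.isascii():
        return keysym_name(symbol)
    # A modifier-prefixed non-ASCII character such as 'G-ü'.
    if (len(symbol) == 3 and symbol[1] == '-'
            and symbol[0] in MODIFIER_PREFIXES
            and not symbol[2].isascii()):
        return symbol[:2] + keysym_name(symbol[2])
    # ASCII characters and named keys pass through untouched.
    return symbol


for sym in ('a', 'ü', 'G-ü', 'C-a', 'Return'):
    print(repr(sym), '->', repr(normalize_symbol(sym)))
```

Keeping the original entry of msymbol_list around, as the patch does with its index, means that an unhandled key is committed as the character the user typed and not as the keysym name.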

mike-fabian commented 2 weeks ago

Things which do not yet work:

Test file to show what works and what doesn’t yet work:

;; t-test-mike.mim -- test input method

(input-method t test-mike)

(description
"Mike's test input method")

(title "Test Mike")

(map
 (trans
 ;; Lines marked with PASS work with ibus-m17n and ibus-typing-booster.
 ;; Lines marked with FAIL do not work and probably need an enhancement
 ;; in the m17n library.
  ((0x0061) "test1") ; PASS a U+0061 
  ((0x0100263A) "test3") ; FAIL ☺ U+263A WHITE SMILING FACE
  ((udiaeresis) "test40") ; PASS
  ((udiaeresis udiaeresis) "test41") ; PASS
  ;; without the line with the single adiaeresis, the following line
  ;; with the double adiaeresis still works, but when a single ä
  ;; is typed, it is not shown in the preedit, only when the second ä is typed
  ;; the final result appears. If the second ä does not come but some other
  ;; letter like x instead, the ä vanishes completely.
  ;; This is different from the behaviour of the ((x x) "test61") line.
  ;; Typing a single x puts the x into preedit, typing the second x produces
  ;; the 'test61'. If a different letter instead of the second x is typed,
  ;; the x in preedit gets committed.
  ;; I suspect a bug in the m17n library here, as this problem is reproducible
  ;; both with ibus-typing-booster's m17n_translit.py as well as with ibus-m17n.
  ;; Workaround: add ((adiaeresis) "ä"), then the single ä appears in preedit
  ;; and everything seems to work fine.
  ;((adiaeresis) "ä") ; PASS
  ((adiaeresis adiaeresis) "test50") ; PASS
  ((x x) "test61") ; PASS, does not need an extra ((x) "test60") to work well
  ;; ((0x00DC) "test7") ; FAIL Ü U+00DC LATIN CAPITAL LETTER U WITH DIAERESIS
  ;; key symbols for capital letters seem to fail in ibus-m17n:
  ((Udiaeresis) "test8") ; PASS in ibus-typing-booster, FAIL in ibus-m17n. ibus-m17n bug?
  ))

(state
  (init
    (trans)))
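
The numeric keys in the test file, such as 0x0061 and 0x0100263A, follow the X11 keysym convention: printable ASCII and Latin-1 characters use their code point directly, and other characters use the code point with 0x01000000 added. A minimal sketch of that rule follows. The function name unicode_to_keysym is made up for illustration, and the sketch deliberately ignores the legacy keysyms that exist for some other scripts, so a real converter such as IBus.unicode_to_keyval may return different values for those.

```python
# Flag that marks a keysym as a directly encoded Unicode code point.
UNICODE_KEYSYM_FLAG = 0x01000000


def unicode_to_keysym(char: str) -> int:
    """Return a keysym value for a single character (simplified sketch)."""
    code_point = ord(char)
    # Printable ASCII and Latin-1 keysyms equal their code points.
    if 0x20 <= code_point <= 0x7E or 0xA0 <= code_point <= 0xFF:
        return code_point
    # Everything else is encoded as code point plus the Unicode flag.
    # Assumption: legacy keysyms for other scripts are not considered here.
    return code_point | UNICODE_KEYSYM_FLAG


for char in ('a', 'Ü', '☺'):
    print(f'{char} U+{ord(char):04X} -> 0x{unicode_to_keysym(char):08X}')
```

This gives the same values as the keys used for the test1, test7 and test3 lines of the test file.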

mike-fabian commented 2 weeks ago
  • key symbols starting with a capital letter like (Udiaeresis) work in ibus-typing-booster but not in ibus-m17n. Probably an ibus-m17n bug.

This one is fixed in https://github.com/ibus/ibus-m17n/releases/tag/1.4.34
See also: https://github.com/ibus/ibus-m17n/issues/90