ibus / ibus

Intelligent Input Bus for Linux/Unix
https://github.com/ibus/ibus/wiki
GNU Lesser General Public License v2.1

ibus-anthy: Add another typing_rule #73

Closed fujiwarat closed 9 years ago

fujiwarat commented 9 years ago
Some users need a Kana typing rule.
I attached a reference patch to enable it.
Please refine it and merge it.

Original issue reported on code.google.com by utuhiro78 on 2008-09-01 07:36:58


fujiwarat commented 9 years ago
Hi, below is a table from scim-anthy. Could you tell me what the second and third
columns of this table mean? Why is the second line {"2", "", "ふ"} and not {"2", "ふ",""}?
What's the difference?

ConvRule scim_anthy_kana_typing_rule[] = {
// no modifiers keys
{"1",   "ぬ",   ""},
{"2",   "", "ふ"},

{"3",   "あ",   ""},
{"4",   "う",   ""},
{"5",   "え",   ""},
{"6",   "お",   ""},
{"7",   "や",   ""},
{"8",   "ゆ",   ""},
{"9",   "よ",   ""},
{"0",   "わ",   ""},
{"-",   "", "ほ"},
{"^",   "", "へ"},

{"q",   "", "た"},

Original issue reported on code.google.com by Shawn.P.Huang on 2008-09-01 12:19:09

fujiwarat commented 9 years ago
I don't understand it either.
IMHO, {"2", "ふ",""} is OK.
I asked Takuro (the author of scim-anthy) about it.
I think he will post an answer to help us.

Original issue reported on code.google.com by utuhiro78 on 2008-09-01 14:47:47

fujiwarat commented 9 years ago
The third field is the "pending" state. The character can't be determined from that
stroke alone; to settle it, the program has to wait for the next stroke. This is
typically needed for voiced consonant characters like "ぶ", which is composed of two
characters, "ふ" and "゛".
You can input it with the following strokes:

  {"2", "", "ふ"} {"@", "゛", ""}

You might think this could be replaced by the following table entry:

  {"2@", "ぶ", ""}

But that isn't recommended, because users of this typing rule expect "ふ" to be
entered immediately when they press the "2" key.
If the engine is implemented with a longest-match-first method, it can't show "ふ" at
that point.

Original issue reported on code.google.com by takuro.ashie on 2008-09-02 13:46:19
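
As a minimal sketch of the pending-state idea described above (not the scim-anthy or ibus-anthy implementation), the table can be modelled as key -> (result, pending). The names `KANA_RULE`, `VOICED`, and `type_keys` are hypothetical, and the `VOICED` combination table is an assumption added for illustration.

```python
# Hypothetical miniature of a kana typing table: key -> (result, pending).
# "result" is committed at once; "pending" is shown but may still change.
KANA_RULE = {
    "1": ("ぬ", ""),
    "2": ("", "ふ"),        # may become "ぶ" if the voiced mark follows
    "3": ("あ", ""),
    "@": ("\u309b", ""),    # spacing voiced sound mark
}

# Assumed combinations of a pending kana with the voiced sound mark.
VOICED = {"ふ": "ぶ", "ほ": "ぼ", "へ": "べ", "た": "だ"}


def type_keys(keys):
    """Return (committed_text, pending_text) after typing `keys`."""
    committed, pending = "", ""
    for key in keys:
        result, new_pending = KANA_RULE[key]
        if result == "\u309b" and pending in VOICED:
            # The mark resolves the pending kana into its voiced form.
            committed += VOICED[pending]
            pending = ""
            continue
        # Any other key settles the previous pending kana unchanged.
        committed += pending + result
        pending = new_pending
    return committed, pending
```

With this shape, "ふ" is visible as pending text right after the "2" key, which a single {"2@", "ぶ", ""} entry could not offer.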

fujiwarat commented 9 years ago
FYI, there is another solution in the Unicode specification: the "Combining
Character Sequence".
The character sequence "U+3075(ふ) U+3099(゙)" means "ぶ(U+3076)" (please note
that the normal voiced sound mark is U+309B, not U+3099).
This sequence should be handled by the rendering subsystem, such as pango, rather
than by the input method.
Mac OS X seems to use this solution. Pango may support it too.

But application programmers may be confused by this concept, so I think IM engines
shouldn't use this solution.

Original issue reported on code.google.com by takuro.ashie on 2008-09-02 14:40:02
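
The code points mentioned above can be checked with Python's standard `unicodedata` module. This is only an illustration of Unicode normalization; it says nothing about how pango or any input method handles the sequence.

```python
import unicodedata

FU = "\u3075"              # ふ
BU = "\u3076"              # ぶ
COMBINING_MARK = "\u3099"  # combining voiced sound mark
SPACING_MARK = "\u309b"    # spacing (standalone) voiced sound mark

# NFC composes the base kana and the combining mark into one code point.
composed = unicodedata.normalize("NFC", FU + COMBINING_MARK)

# The spacing mark is not a combining character, so NFC leaves it alone.
unchanged = unicodedata.normalize("NFC", FU + SPACING_MARK)

# NFD splits the precomposed kana back into base + combining mark.
decomposed = unicodedata.normalize("NFD", BU)
```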

fujiwarat commented 9 years ago
I understand it now. Thanks.

Original issue reported on code.google.com by Shawn.P.Huang on 2008-09-02 22:48:06

fujiwarat commented 9 years ago
This feature has been implemented.
Please update the code from the git repository and test it.

Original issue reported on code.google.com by Shawn.P.Huang on 2008-09-03 07:05:58

fujiwarat commented 9 years ago
Thank you so much. I'm testing it now.

It lacks 2 characters.
==================================================
--- ibus-anthy-0.1.1.20080903/engine/tables.py
+++ ibus-anthy-0.1.1.20080903.new/engine/tables.py
@@ -411,6 +411,8 @@
     u"B" : u"こ",
     u"M" : u"も",
     u"N" : u"み",
+    u"<" : u"、",
+    u">" : u"。",

     u"?" : u"・",
     u"_" : u"ろ",
==================================================

The Japanese "yen" key and the "backslash" key produce the
same character ("\") in Latin mode.
But we need "ー" for "yen" and
"ろ" for "backslash" in Kana typing mode.

This is a well-known problem,
but I don't know how to solve it in Python.

Original issue reported on code.google.com by utuhiro78 on 2008-09-03 10:58:25

fujiwarat commented 9 years ago
The two characters have been added.
About the second problem, I have a question: is the following mapping correct?
In Latin typing mode:
  key_yen => "\"
  Key_backslash => "\"
In Kana typing mode:
  key_yen => "ー"
  Key_backslash => "ろ"
In Romaji typing mode:
  key_yen => "?"
  Key_backslash => "?"

Original issue reported on code.google.com by Shawn.P.Huang on 2008-09-03 11:16:39

fujiwarat commented 9 years ago
> In Latin typing mode:
>   key_yen => "\"
>   Key_backslash => "\"

Yes.

> In Kana typing mode:
>   key_yen => "ー"
>   Key_backslash => "ろ"

Yes.

> In Romaji typing mode:
>   key_yen => "?"
>   Key_backslash => "?"

key_yen => "￥" (wide yen)
key_backslash => "＼" (wide backslash)

Original issue reported on code.google.com by utuhiro78 on 2008-09-03 11:44:12
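
The per-mode behaviour agreed on above can be written down as a small lookup table. This is a sketch only: the names `YEN_BACKSLASH_OUTPUT` and `output_for` are hypothetical, and "wide" is assumed to mean the fullwidth forms U+FFE5 and U+FF3C.

```python
# Expected text for the two physical keys in each typing mode.
YEN_BACKSLASH_OUTPUT = {
    "latin": {"yen": "\\", "backslash": "\\"},
    "kana": {"yen": "ー", "backslash": "ろ"},
    # Assumption: "wide" refers to the fullwidth yen sign and backslash.
    "romaji": {"yen": "\uffe5", "backslash": "\uff3c"},
}


def output_for(mode, key):
    """Return the text a key should produce in the given typing mode."""
    return YEN_BACKSLASH_OUTPUT[mode][key]
```

Only Latin mode maps both keys to the same text, so the engine has to tell the two physical keys apart in the other two modes.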

fujiwarat commented 9 years ago
I added typing rules for backslash & Japanese yen. Please test them.

Original issue reported on code.google.com by Shawn.P.Huang on 2008-09-03 15:13:06

fujiwarat commented 9 years ago
It still doesn't detect key_yen properly.
I can't type "ー" with key_yen.
key_backslash is OK.

I'll ask TAM-san about it.

Original issue reported on code.google.com by utuhiro78 on 2008-09-03 16:15:26

fujiwarat commented 9 years ago
I attached a test program. Please run it and send its output when you press 'yen'
& 'backslash'.

Original issue reported on code.google.com by Shawn.P.Huang on 2008-09-04 00:11:56


fujiwarat commented 9 years ago
It needs a dirty hack.
Please grep "SCIM_KEY_QuirkKanaRoMask" in scim & scim-anthy's source tree.

Original issue reported on code.google.com by takuro.ashie on 2008-09-04 00:47:34

fujiwarat commented 9 years ago
Hi Takuro.ashie & utuhiro,
How do you configure your keyboard model and layout? Could you try using
`gnome-keyboard-properties` to configure them: set the keyboard model to
`Japanese 106key` and the layout to "Japan OADG 109A". Then run the attached
script and check which key event we receive when you press yen & backslash.

Original issue reported on code.google.com by Shawn.P.Huang on 2008-09-04 01:36:17


fujiwarat commented 9 years ago
> Comment 12
Both show "You press backslash(92)".

> Comment 14
Both show "You press backslash(000092) 0x00000010".

TAM-san said "I haven't tested ibus yet,
but maybe the keysym is the same for both keys.
You need to check the keycode directly.
I'll test ibus later".

Original issue reported on code.google.com by utuhiro78 on 2008-09-04 04:14:10

fujiwarat commented 9 years ago
I found the article in Dairiki-san's blog:
(He is an ex-maintainer of scim-1.x.)

http://shibatama.tea-nifty.com/blog/2007/03/scim_009c.html
==========================================
2007 March 27
gtk sends keycode (= x11 keysym) and hardware_keycode (= x11 keycode).
If scim did the same, scim engines could distinguish key_yen and key_backslash.
==========================================

Maybe that is "SCIM_KEY_QuirkKana Ro(ろ, backslash) Mask".

Original issue reported on code.google.com by utuhiro78 on 2008-09-04 04:40:48

fujiwarat commented 9 years ago
I read the code of scim. It seems to test the hardware_keycode of the GdkEventKey. So I
modified the test script print_keyevent.py to also print the hardware_keycode. Please try
it again. I want to know the output for the keys below.
Backslash:
Backslash + Shift:
Yen:
Yen + Shift:
Thanks.

Original issue reported on code.google.com by Shawn.P.Huang on 2008-09-04 04:57:47

fujiwarat commented 9 years ago
Results:
You press backslash(000092) 0x00000010   <=== Backslash
You press underscore(000095) 0x00000011  <=== Backslash + Shift
You press backslash(000092) 0x00000010   <=== Yen
You press bar(000124) 0x00000011         <=== Yen + Shift

Original issue reported on code.google.com by utuhiro78 on 2008-09-04 05:19:12

fujiwarat commented 9 years ago
Sorry, I forgot to attach the modified print_keyevent.py. It prints the hardware_keycode
of the key event. Please run it again.

Original issue reported on code.google.com by Shawn.P.Huang on 2008-09-04 05:37:34


fujiwarat commented 9 years ago
Congratulations!

You press backslash(92,211) 0x00000010   <=== Backslash (it is converted to "ろ")
You press underscore(95,211) 0x00000011  <=== Backslash + Shift ("ろ")
You press backslash(92,133) 0x00000010   <=== Yen ("ー")
You press bar(124,133) 0x00000011        <=== Yen + Shift ("ー")

Original issue reported on code.google.com by utuhiro78 on 2008-09-04 06:07:21
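
The output above suggests one way an engine could separate the two keys: look at the hardware keycode when the keyval alone is ambiguous. The sketch below is only an illustration of that idea, not the ibus code. The keycodes 133 and 211 are the values reported on one machine in this thread and are not portable, and `kana_for_key` is a hypothetical name.

```python
# Hardware keycodes reported on one machine in this thread (not portable).
KEYCODE_YEN = 133
KEYCODE_BACKSLASH = 211

# X11 keysym values, identical to the Latin-1 character codes.
KEYVAL_BACKSLASH = 92
KEYVAL_UNDERSCORE = 95
KEYVAL_BAR = 124
KEYVAL_YEN = 165


def kana_for_key(keyval, hardware_keycode):
    """Return the kana for the yen/backslash keys, or None for other keys."""
    if keyval == KEYVAL_YEN:
        # The keyval is already unambiguous on layouts that emit "yen".
        return "ー"
    if keyval in (KEYVAL_BACKSLASH, KEYVAL_UNDERSCORE, KEYVAL_BAR):
        # Ambiguous keyval: fall back to the physical key.
        if hardware_keycode == KEYCODE_YEN:
            return "ー"
        if hardware_keycode == KEYCODE_BACKSLASH:
            return "ろ"
    return None
```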

fujiwarat commented 9 years ago
Sorry, what do you mean? Did you configure the keyboard with `gnome-keyboard-properties`?

Original issue reported on code.google.com by Shawn.P.Huang on 2008-09-04 06:48:05

fujiwarat commented 9 years ago
> What do you mean?

I thought "(92,211) and (92,133) are different,
so ibus can detect the 2 keys properly".

> Did you configure keyboard with `gnome-keyboard-properties`?

OK, I switched it to "Japan OADG 109A".
(I had used normal Japanese 106-key layout.)

Results:
You press backslash(92,211) 0x00000010   <=== Backslash
You press underscore(95,211) 0x00000011  <=== Backslash + Shift
You press yen(165,133) 0x00000010        <=== Yen
You press bar(124,133) 0x00000011        <=== Yen + Shift

But uim-anthy and scim-anthy detect them properly
without any changes.

Original issue reported on code.google.com by utuhiro78 on 2008-09-04 07:25:20

fujiwarat commented 9 years ago
I think SCIM processes the hardware_keycode to distinguish backslash and yen. But the
hardware keycode should be translated to a keyval by xkb or gdk, and applications should
use the keyval. Hardware keycodes are tied to the keyboard hardware: different keyboards
may use different hardware keycodes for the same key, whereas the keyval is hardware
independent. So I think we should use the keyval, not the hardware keycode.

BTW, are you sure uim-anthy can process the [yen] key properly? I cannot find where uim
processes the hardware keycode in its source files.

Original issue reported on code.google.com by Shawn.P.Huang on 2008-09-04 07:58:26

fujiwarat commented 9 years ago
I said "Please grep "SCIM_KEY_QuirkKanaRoMask" in scim & scim-anthy's source tree".

You can find following code in scim:

        KeySym *keysyms = XGetKeyboardMapping (display, xkey.keycode, 1, &keysym_size);
        if (keysyms != NULL) {
            if (keysyms[0] == XK_backslash &&
                (keysym_size > 1 && keysyms[1] == XK_underscore))
                scimkey.mask |= SCIM_KEY_QuirkKanaRoMask;
            XFree (keysyms);
        }

scim doesn't pass any keycode to IMEngines, and doesn't translate it by itself.
To distinguish backslash and yen, scim instead obtains the shift-modified mapping for
the key from X. If the key + shift is mapped to "underscore", it adds a
special key mask, "SCIM_KEY_QuirkKanaRoMask".

Original issue reported on code.google.com by takuro.ashie on 2008-09-04 12:50:43
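
The check described above can be mimicked without talking to X at all. The following sketch takes the list of keysyms mapped to one keycode (what XGetKeyboardMapping would return) and decides whether to set the quirk flag. `QUIRK_KANA_RO_MASK` is a placeholder bit chosen for illustration, not SCIM's real mask value, and `quirk_mask_for` is a hypothetical name.

```python
# X11 keysym values (identical to the Latin-1 character codes).
XK_BACKSLASH = 0x5C
XK_UNDERSCORE = 0x5F
XK_BAR = 0x7C

# Placeholder bit for illustration only; not SCIM's actual mask value.
QUIRK_KANA_RO_MASK = 1 << 14


def quirk_mask_for(keysyms):
    """Return the quirk mask if the key looks like [backslash, underscore].

    `keysyms` is the per-keycode keysym list, unshifted entry first.
    """
    if (len(keysyms) > 1
            and keysyms[0] == XK_BACKSLASH
            and keysyms[1] == XK_UNDERSCORE):
        return QUIRK_KANA_RO_MASK
    return 0
```

A key mapped to [backslash, bar] gets no mask, so an engine can treat "backslash without the mask" as the yen key. The result depends entirely on the current keyboard mapping.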

fujiwarat commented 9 years ago
Hi Takuro.ashie,
I checked SCIM and tested XGetKeyboardMapping today, but I think SCIM's way of
handling 'Yen' & 'backslash' is not right. SCIM shouldn't process the hardware_keycode,
because the hardware_keycode depends on the keyboard hardware.
If users set the right keyboard model and key layout in `gnome-keyboard-properties`
(choosing the layout 'Japan OADG 109A'), xkb or gdk will translate the 'Yen' &
'Backslash' keys correctly, so we should not process the hardware_keycode ourselves.
Even if we did process the hardware keycode ourselves, I think SCIM's way would still
not be right, because the KeySyms returned by XGetKeyboardMapping(dpy, 133 [hardware
keycode of Yen key], 1, &keysym_size) are not constant; they change when users change
the keyboard configuration. So I think testing keysyms[0] == XK_backslash and
keysym_size > 1 && keysyms[1] == XK_underscore to tell the Yen key from the backslash
key is wrong.

BTW, I want to know why we don't have users use the 'Japan OADG 109A' layout. Does it
cause any other problems?

Original issue reported on code.google.com by Shawn.P.Huang on 2008-09-04 14:47:58

fujiwarat commented 9 years ago
The attached programs are for checking XGetKeyboardMapping. You can test them on your system.

Original issue reported on code.google.com by Shawn.P.Huang on 2008-09-04 14:57:14


fujiwarat commented 9 years ago
Results:

$ ./test 
Gtk-Message: Failed to load module "canberra-gtk-module": libcanberra-gtk-module.so

$ rm test
$ make
$ ./test 
keycode = 133, size = 6, 92 124 0 0 0 0
keycode = 211, size = 6, 92 95 0 0 0 0

Original issue reported on code.google.com by utuhiro78 on 2008-09-04 16:22:18

fujiwarat commented 9 years ago
Hi Utuhiro,
What's your keyboard configuration? On my system, the output is:
$ ./test 
keycode = 133, size = 6, 0 0 0 0 0 0
keycode = 211, size = 6, 0 0 0 0 0 0

> $ ./test 
> keycode = 133, size = 6, 92 124 0 0 0 0
> keycode = 211, size = 6, 92 95 0 0 0 0

Original issue reported on code.google.com by Shawn.P.Huang on 2008-09-05 01:28:18

fujiwarat commented 9 years ago
> What's your keyboard configuration?

$ leafpad /etc/X11/xorg.conf
==============================================
Section "InputDevice"
    Identifier     "Keyboard1"
    Driver         "kbd"
    Option         "XkbModel" "jp106"
    Option         "XkbLayout" "jp"
    Option         "XkbOptions" "compose:rwin"
EndSection
==============================================

If you run GNOME, it may use GNOME's own settings instead.

See also this movie:
http://www.geocities.jp/ep3797/snapshot/tmp/keyboard_layout.ogg
I switched from jp106 to Chinese and got these results:
==============================================
keycode = 133, size = 6, 0 0 0 0 0 0
keycode = 211, size = 6, 0 0 0 0 0 0
==============================================

Original issue reported on code.google.com by utuhiro78 on 2008-09-05 02:15:01

fujiwarat commented 9 years ago
> BTW, Are you sure uim-anthy can process [yen] key properly? I can not find that uim
> processes hardware keycode in source files.

It is here:
http://code.google.com/p/uim/source/browse/trunk/uim/uim-x-kana-input-hack.c
The callers are in each gtk/qt immodule, so the issue can be hidden inside the
immodule layer.

thanks,

Original issue reported on code.google.com by tabata.yusuke on 2008-09-05 02:25:44

fujiwarat commented 9 years ago
I now know how uim and scim handle this issue, and I could do the same thing. But I
still have one question: why not suggest that Japanese users use the 'Japan - OADG 109A'
keyboard layout? Or why not make 'Japan - OADG 109A' the default layout for the jp106
keyboard? Does the 'Japan - OADG 109A' layout cause any other problems?
I compared 'Japan - OADG 109A' with the 'Japan' layout. They are the same except for
the [yen, bar] key. Please check the attached image, which shows both layouts.
Below are the jp layout definitions. The only difference is
`key <AE13> { [ backslash, bar ] };` versus `key <AE13> { [ yen, bar ] };`.

// jp106 keyboard map
partial default alphanumeric_keys
xkb_symbols "106" {
    include "jp(common)"
    name[Group1]= "Japan";

    key <AE13> { [ backslash, bar   ] };
};

// OADG109A map
partial alphanumeric_keys
xkb_symbols "OADG109A" {

    include "jp(common)"
    name[Group1]= "Japan - OADG 109A";

    key <AE13> { [ yen, bar     ] };
};

Original issue reported on code.google.com by Shawn.P.Huang on 2008-09-05 03:59:34
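
To make the consequence of that one-line difference concrete, here is a small sketch that models the two layouts as dictionaries and checks whether the two keys share an unshifted keysym. The AE13 entries come from the definitions quoted above; the AB11 entry and the names `LAYOUTS` and `unshifted_keysyms_collide` are assumptions for illustration, based on the [backslash underbar] key discussed in this thread.

```python
# Keysyms per key as (unshifted, shifted).
LAYOUTS = {
    "106": {
        "AE13": ("backslash", "bar"),
        "AB11": ("backslash", "underscore"),  # assumed, see lead-in
    },
    "OADG109A": {
        "AE13": ("yen", "bar"),
        "AB11": ("backslash", "underscore"),  # assumed, see lead-in
    },
}


def unshifted_keysyms_collide(layout_name):
    """True if both keys produce the same keysym without Shift."""
    layout = LAYOUTS[layout_name]
    return layout["AE13"][0] == layout["AB11"][0]
```

Under the "106" model the two keys collide, so the keysym alone cannot identify the key; under "OADG109A" they do not.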


fujiwarat commented 9 years ago
I updated ibus. It now hacks the [yen bar] & [backslash underbar] keys for the Japan layout.
Currently, this feature only supports gtk & XIM applications; Qt4 will be ready later.
git commit:
http://github.com/phuang/ibus/commit/dec9b8052f3384ee2a01a82a38d947a57e8e22c9

Original issue reported on code.google.com by Shawn.P.Huang on 2008-09-06 02:08:52

fujiwarat commented 9 years ago
Thank you so much!
I can type "ー" (= yen) and "ろ" (= backslash) in Kana mode now.

ibus-anthy always uses romaji_typing_rule on startup.
Is it possible to use the last used typing_rule?

Original issue reported on code.google.com by utuhiro78 on 2008-09-06 07:00:05

fujiwarat commented 9 years ago
> I updated ibus.

Thank you for your effort.

> why do not suggest Japanese users to use 'Japan - OADG 109A'
> keyboard layout?

I think one reason is that XLookupString() returns a non-ASCII code for the keysym "yen",
although most users and applications don't want that, especially in terminal emulators.

Original issue reported on code.google.com by takuro.ashie on 2008-09-08 12:21:19
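
The non-ASCII point can be illustrated without X. For Latin-1 keysyms the keysym value equals the character code, so the sketch below simply converts the two keysyms to text and checks them. It does not call XLookupString(), and `latin1_text` is a hypothetical helper.

```python
XK_BACKSLASH = 0x5C
XK_YEN = 0xA5


def latin1_text(keysym):
    """Return the character for a Latin-1 keysym (value == character code)."""
    if not 0x20 <= keysym <= 0xFF:
        raise ValueError("not a Latin-1 keysym")
    return chr(keysym)


backslash_text = latin1_text(XK_BACKSLASH)
yen_text = latin1_text(XK_YEN)

# Backslash fits in 7-bit ASCII; the yen sign does not.
backslash_is_ascii = backslash_text.isascii()
yen_is_ascii = yen_text.isascii()
```

A program that expects a plain ASCII backslash (for example as an escape character in a shell) would receive a different, non-ASCII character from a layout that emits "yen".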

fujiwarat commented 9 years ago
Qt4 support for the [yen bar] key is ready. Please test it.

Original issue reported on code.google.com by Shawn.P.Huang on 2008-09-16 07:21:19