DogLooksGood / emacs-rime

RIME ㄓ in Emacs
GNU General Public License v3.0

[feature request] Hook when multiple characters can be inserted with the same code (重码) #210

Open rodrigomorales1 opened 1 year ago

rodrigomorales1 commented 1 year ago

I'm sorry for writing in English. I'm currently learning Chinese, and my Chinese is not yet good enough to write this issue entirely in Chinese.

The context

My main use of RIME in Emacs is to transcribe Chinese documents whose text can't be selected. In other words, I have to type the characters shown in the document because I can't copy them (I have tried OCR, but the results are not good enough, so I keep typing the text manually).

I'm currently using rime-wubi. As with most shape-based input methods, some characters share the same code. For example, the three characters "去", "支" and "云" have the same Wubi 86 code: "fcu". From what I have seen in rime-wubi, only a small number of characters share a code with other characters, so this situation seldom occurs.

The problem

When I'm transcribing a document, I don't pay much attention to the candidate list shown by emacs-rime, because I'm familiar with Wubi and know which keys to press to insert any character. The advantage is that I type faster, since my eyes only need to stay on the document. The disadvantage is that I may not notice when two characters share the same code. In that case an incorrect character can be inserted, because the first candidate in the list can change (its position depends on its frequency of use).

The possible solution

I would like to be informed when at least two candidates have the same code. This could be achieved with a hook that runs when SPC is pressed and at least two candidates can be inserted with the keys pressed so far (i.e. no additional key is required to insert them).

I believe this feature would benefit users of shape-based input methods who want to improve their typing speed, since they would be alerted whenever two characters share a code. They might not be paying much attention to the candidate list because they are focused on typing fast. An audible cue, for example, would make the situation clear to them.
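To make the idea concrete, here is a minimal sketch of what such a hook could look like. Everything in it is an assumption for illustration: `my/rime-duplicate-code-hook`, `my/rime-full-match-candidates` and `my/rime-run-duplicate-code-hook` are hypothetical names that do not exist in emacs-rime, and the candidate list is assumed to be a list of `(TEXT . COMMENT)` cons cells in which COMMENT is nil when no additional key is needed. The sketch takes the candidate list as an argument, so it can be tried without RIME loaded.

```elisp
;;; -*- lexical-binding: t; -*-
(require 'cl-lib)

(defvar my/rime-duplicate-code-hook nil
  "Hypothetical hook, not part of emacs-rime.
Each function is called with one argument: the list of candidate
texts that can all be inserted with the keys pressed so far.")

(defun my/rime-full-match-candidates (candidates)
  "Return the texts in CANDIDATES that need no additional key.
CANDIDATES is assumed to be a list of (TEXT . COMMENT) cons cells;
a non-nil COMMENT is taken to mean that more keys are required."
  (cl-loop for (text . comment) in candidates
           unless comment
           collect text))

(defun my/rime-run-duplicate-code-hook (candidates)
  "Run `my/rime-duplicate-code-hook' if CANDIDATES holds duplicates.
Return the list of texts that share the typed code."
  (let ((matches (my/rime-full-match-candidates candidates)))
    ;; Only notify when at least two candidates share the code.
    (when (cdr matches)
      (run-hook-with-args 'my/rime-duplicate-code-hook matches))
    matches))

;; Usage: report the clashing characters in the echo area.
(add-hook 'my/rime-duplicate-code-hook
          (lambda (matches)
            (message "Same code: %s" (mapconcat #'identity matches " "))))

;; The comment "p" on the last candidate is made-up sample data.
(my/rime-run-duplicate-code-hook
 '(("去") ("支") ("云") ("运" . "p")))
;; returns ("去" "支" "云")
```

A user could then attach a sound, a message, or anything else to the hook. This sketch leaves out the check on the length of the first candidate, which a real implementation would probably need when the first candidate is a multi-character phrase.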

The workaround

I was able to do what I was looking for by adding an advice to the function rime-input-method (see code block below). However, I think a hook would make other users aware of this behavior, which could ultimately help them improve their typing speed.

(require 'cl-lib)

(defun my/play-audio (path)
  "Play the audio file at PATH asynchronously with mpv."
  (start-process "*mpv*" nil "mpv" "--no-config" path))

(defun my/notify-when-candidates-with-the-same-code (&rest r)
  "Play a sound when SPC would pick among candidates sharing one code.
R holds the arguments passed to `rime-input-method'; its car is the key."
  ;; When the pressed key is space
  (when (eq (car r) ?\s)
    (let* ((context (rime-lib-get-context))
           (candidates (alist-get 'candidates (alist-get 'menu context)))
           (count 0))
      ;; If the first candidate contains more than one character, then
      ;; the selection will not consume all the keys pressed so far,
      ;; so there will be remaining prompts. When this happens, all
      ;; candidates except for the first one are single characters
      ;; whose cons cell has a non-nil cdr; we don't want to count
      ;; such candidates.
      (when (= (length (car (nth 0 candidates))) 1)
        (cl-loop for candidate in candidates
                 unless (cdr candidate)
                 do (setq count (1+ count)))
        (when (> count 1)
          (my/play-audio "/usr/share/sounds/freedesktop/stereo/bell.oga"))))))

(advice-add 'rime-input-method :before #'my/notify-when-candidates-with-the-same-code)

DogLooksGood commented 1 year ago

Hey, I think it's an interesting idea. Would be nice if you have a PR for this.

I have never heard a Chinese user propose this, even though there are a lot of Wubi users. The multi-candidate issue is quite common in Chinese input methods. People who use a shape-based input method type with it every day, with and without Emacs, so they get trained to remember those codes. So I don't think it's really a problem.