ueno / libskk

Japanese SKK input method library
GNU General Public License v3.0

Fix crash when a non-ASCII character follows an escaped character #88

Closed by lo48576 2 months ago

lo48576 commented 2 months ago

Summary

libskk crashes when a dictionary contains an entry in which a non-ASCII character immediately follows an escaped character, as in the entries below.

Reproducible with the skk command and with fcitx5-skk.

Minimal Reproducible Example

dict.skkdict:

;; -*- mode: fundamental; coding: utf-8 -*-
;; okuri-ari entries.
;; okuri-nasi entries.
あ /(concat "\050あ")/
い /(concat "\x40い")/

command:

$ echo 'A SPC' | skk --user-dict=dict.skkdict
**
ERROR:skk.c:789:string_replace: code should not be reached
Bail out! ERROR:skk.c:789:string_replace: code should not be reached
zsh: done                           echo 'A SPC' |
zsh: IOT instruction (core dumped)  skk --user-dict=dict.skkdict
$ echo 'I SPC' | skk --user-dict=dict.skkdict
**
ERROR:skk.c:789:string_replace: code should not be reached
Bail out! ERROR:skk.c:789:string_replace: code should not be reached
zsh: done                           echo 'I SPC' |
zsh: IOT instruction (core dumped)  skk --user-dict=dict.skkdict
$

Fix

https://github.com/ueno/libskk/blob/d7c3293ac6770b92f671d501cc8c3fd83eef783d/libskk/expr.vala#L85-L101

The linked code does index--; to step back over the character read by the last get_next_char() call (cancelling its effect), but this is wrong when that character is non-ASCII: get_next_char() advances index by the full byte length of the UTF-8 character, so stepping back a single byte leaves index in the middle of a multi-byte sequence. To avoid (relatively) complex index manipulation on a UTF-8 string, the patch counts the (hex) digit characters consumed and advances index by that count, instead of relying entirely on get_next_char() to move index.
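The index problem can be illustrated with a small sketch. This is not the libskk code (which is Vala): it is a Python model that treats the string as UTF-8 bytes, and the names next_char, read_octal_step_back and read_octal_counting are hypothetical helpers made up for illustration. It assumes a simplified escape parser that reads octal digits until the first non-digit.

```python
def next_char(buf: bytes, index: int):
    """Decode the UTF-8 character at byte offset `index`.

    Returns (character, offset just past it). Hypothetical stand-in for
    a get_next_char()-style call that advances a byte index.
    """
    first = buf[index]
    if first < 0x80:
        length = 1
    elif first >> 5 == 0b110:
        length = 2
    elif first >> 4 == 0b1110:
        length = 3
    elif first >> 3 == 0b11110:
        length = 4
    else:
        # A continuation byte: the index points into the middle of a character.
        raise ValueError("index is not at a character boundary")
    return buf[index:index + length].decode("utf-8"), index + length


def read_octal_step_back(buf: bytes, index: int):
    """Read octal digits, then undo the one-character lookahead with index -= 1.

    This only works when the lookahead character is a single byte (ASCII).
    """
    value = 0
    while index < len(buf):
        ch, index = next_char(buf, index)
        if ch not in "01234567":
            index -= 1  # steps back one byte, not one character
            break
        value = value * 8 + int(ch)
    return value, index


def read_octal_counting(buf: bytes, index: int):
    """Read octal digits with a scratch index and count the digits consumed.

    Digits are ASCII (one byte each), so advancing by the count is safe.
    """
    value = 0
    consumed = 0
    probe = index  # scratch index; the caller's index is never stepped back
    while probe < len(buf):
        ch, probe = next_char(buf, probe)
        if ch not in "01234567":
            break
        value = value * 8 + int(ch)
        consumed += 1
    return value, index + consumed


if __name__ == "__main__":
    buf = "050あ".encode("utf-8")  # the digits of \050 followed by a 3-byte character
    print(read_octal_counting(buf, 0))   # index lands on the start of "あ"
    print(read_octal_step_back(buf, 0))  # index lands inside "あ"
```

With the step-back variant, the returned index points at the last byte of "あ", so the next decode fails; with the counting variant, it points at the first byte of "あ" and parsing can continue.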

Tested in my environment: with the patch applied, the MRE runs as expected.