ubports / keyboard-component

Moved to https://gitlab.com/ubports/core/lomiri-keyboard
GNU Lesser General Public License v3.0

Feature request: adding a Cangjie input method for Chinese users #133

Closed twilipi closed 3 years ago

twilipi commented 4 years ago

Hi, I recently installed UT on my old smartphone as a great alternative to Android and iOS. However, typing Chinese characters on it is still a hassle.

Although Pinyin and Chewing are available for Chinese-writing users, both are phonetic input methods, aimed mainly at Mainland Chinese and Taiwanese users and based on their standardized Mandarin pronunciation.

However, some Chinese users, such as Hong Kongers, Malaysians, Singaporeans and other overseas Chinese, may not speak Mandarin but a regional language instead (Hakka, Hokkien, Shanghainese, Cantonese, etc.; things are quite complex in Chinese speaking culture). They prefer a shape-based input method over a phonetic one, because it doesn't restrict how they type as long as they know how the characters are written.

Among shape-based input methods, Cangjie and its sister IM Quick-Cangjie (usually bundled together in most Linux IM variants) are among the mainstream ones. Google and Apple have also implemented this method for their own OSes (for example Gboard from Google, which could serve as inspiration for the layout). Implementing it should help potential non-Mandarin-speaking Chinese users get used to UT.

Here's the documentation for ibus-cangjie, the IBus version of Cangjie, which also covers its background and mechanics, as a reference. I don't know whether that algorithm can be ported to this keyboard easily, but it should be a good starting point IMHO :) https://cangjians.github.io/projects/ibus-cangjie/documentation/

For the word database, fcitx's cangjie3/cangjie5 and quick3/quick5 tables and rime's rime-cangjie dictionary should help a bit. It works like English word suggestion, but keyed on a specific code string. (3 and 5 are versions of Cangjie, which differ slightly in their logic for "guessing" how a character is written. Quick is a simplified version of Cangjie that reduces each code to its first and last Cangjie letters, but requires picking the character manually from the candidates.) https://github.com/fcitx/fcitx-table-extra/tree/master/tables https://github.com/rime/rime-cangjie/blob/master/cangjie5.dict.yaml
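To make the lookup idea a bit more concrete, here is a minimal sketch of a code-to-character table and of how a Quick table could be derived from it. It assumes a simplified "code, whitespace, character" line format and a tiny hand-written sample; the real fcitx and rime files have their own headers and layout, and `load_table`, `quick_code` and `build_quick_table` are hypothetical names, not anything from this repo.

```python
from collections import defaultdict

# Tiny illustrative sample in an assumed "code character" format.
# Real tables (fcitx, rime) are far larger and formatted differently.
SAMPLE_TABLE = """\
a 日
b 月
ab 明
d 木
dd 林
ddd 森
"""


def load_table(text):
    """Parse 'code character' lines into {code: [characters]}."""
    table = defaultdict(list)
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        code, char = parts
        table[code].append(char)
    return dict(table)


def quick_code(code):
    """Quick keeps only the first and last letters of a Cangjie code."""
    return code if len(code) <= 2 else code[0] + code[-1]


def build_quick_table(cangjie_table):
    """Derive a Quick lookup table from a Cangjie table."""
    quick = defaultdict(list)
    for code, chars in cangjie_table.items():
        key = quick_code(code)
        for char in chars:
            if char not in quick[key]:
                quick[key].append(char)
    return dict(quick)


cangjie = load_table(SAMPLE_TABLE)
quick = build_quick_table(cangjie)

print(cangjie["ab"])  # candidates for the full Cangjie code "ab"
print(quick["dd"])    # several characters share one Quick code
```

In this sample both 林 and 森 end up under the Quick code `dd`, which is why Quick needs a candidate bar for the user to pick from.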

Fuseteam commented 4 years ago

fcitx may also have a useful database we could use: https://github.com/fcitx/fcitx-table-extra/blob/master/tables/cangjie5.txt

twilipi commented 4 years ago

A bit of a follow-up request, mainly about prediction of phrases (combinations of Chinese characters). As far as I know, there are 2 ways to implement it (optional though). The first is a kind of self-learning dictionary, which sounds like how UT's English input method works: it saves custom words once the user has typed and confirmed them, then suggests them when something similar is typed again.
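A rough sketch of the self-learning idea, just to illustrate it: phrases are counted when the user confirms them and suggested later by prefix. The class `LearnedPhrases` and its methods are hypothetical and say nothing about how UT's keyboard actually stores learned words.

```python
from collections import Counter


class LearnedPhrases:
    """Remembers confirmed phrases and suggests them by prefix."""

    def __init__(self):
        self.counts = Counter()

    def commit(self, phrase):
        """Record a phrase the user typed and confirmed."""
        if len(phrase) > 1:  # single characters are not phrases
            self.counts[phrase] += 1

    def suggest(self, prefix, limit=5):
        """Return learned phrases starting with prefix, most used first."""
        matches = [
            (phrase, count)
            for phrase, count in self.counts.items()
            if phrase.startswith(prefix) and phrase != prefix
        ]
        # Sort by descending use count, then by phrase for a stable order.
        matches.sort(key=lambda item: (-item[1], item[0]))
        return [phrase for phrase, _ in matches[:limit]]


learned = LearnedPhrases()
for phrase in ["香港", "香港", "香蕉"]:
    learned.commit(phrase)

print(learned.suggest("香"))
```

A real implementation would also need to persist the counts between sessions and probably let old entries decay.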

The other way is to implement a phrase dictionary. Here are some examples of what some open-source input methods do; it is like the dictionary DB used for Latin-script languages, but with entries that combine multiple characters. https://raw.githubusercontent.com/rime/rime-cantonese/master/jyut6ping3.phrase.dict.yaml https://github.com/rime-aca/dictionaries/blob/master/luna_pinyin.dict/luna_pinyin.extended.dict.yaml

In conclusion, phrase finding looks similar to the English IM dictionary in this repo, but it needs one or more extra databases and one extra "prediction" layer in order to search both characters and phrases, not only single characters. This is also crucial for other Chinese IMs, whether phonetic or shape-based (I guess some developers might already know that? nvm)
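As an illustration of searching both characters and phrases, the sketch below looks up characters by code and then offers continuations from a phrase dictionary based on what has already been committed. The tables, the weights and the function names are all made up for the example; a real keyboard would load them from one of the databases linked above.

```python
# Hypothetical sample data: a code table and a weighted phrase dictionary.
CHAR_TABLE = {"a": ["日"], "ab": ["明"]}
PHRASES = {"明白": 80, "明天": 50, "日本": 60, "日期": 30}


def char_candidates(code, table=CHAR_TABLE):
    """Characters matching a typed code."""
    return table.get(code, [])


def continuations(committed, phrases=PHRASES, limit=5):
    """Suggest what may follow the committed text.

    Tries the longest suffix of the committed text first, so a phrase
    that matches more context is ranked ahead of a shorter match.
    """
    results = []
    for start in range(len(committed)):
        suffix = committed[start:]
        hits = [
            (weight, phrase)
            for phrase, weight in phrases.items()
            if phrase.startswith(suffix) and len(phrase) > len(suffix)
        ]
        # Higher weight first, then phrase text for a stable order.
        hits.sort(key=lambda item: (-item[0], item[1]))
        for _, phrase in hits:
            rest = phrase[len(suffix):]
            if rest not in results:
                results.append(rest)
    return results[:limit]


first = char_candidates("ab")[0]  # the user types "ab" and picks 明
print(continuations(first))       # next-character suggestions after 明
```

The same `continuations` step would work unchanged for a phonetic IM, since it only looks at committed text and not at how that text was typed.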

joshuatam commented 3 years ago

@twilipi I have added a simple implementation in #190; see if it is a good starting point. 😄

peat-psuwit commented 3 years ago

Hello,

Thank you for contributing to UBports. As part of the project renaming and the effort to port the Ubuntu Touch stack to Ubuntu 20.04, we're incrementally migrating repositories to GitLab.

Your issue is now migrated to: https://gitlab.com/ubports/core/lomiri-keyboard/-/issues/133

Sorry for the inconvenience.