Open max-hk opened 5 years ago
Thanks for the issue.
However, this is already tracked as #4, so this would have been better as a comment there.
However, since your comment provides more data, I'm going to close the other one and keep this one. :wink:
@bochecha Thanks
Hi @max-hk, sorry for never giving any news. This is a very interesting feature we've always wanted!
However, due to unforeseen health issues I haven't been able to give this any thought for about 3 years... :sob:
I'm trying to get back to this slowly though :smile:
What would definitely help me, however, would be the data from Chromium in source form, so I can make use of it.
The license they use seems to be CC-BY-SA, is that correct? If it is, I think (but I'm not a lawyer and this is not legal advice) it should be compatible with using it in ibus-cangjie, but probably only if we get the data from the sources instead of the binary form (so we can make some modifications and share them back with Chromium of course, as the CC-BY-SA allows and requires).
So rest assured you helped a lot with finding this and we totally want to make good use of it :grin:
It would be better if ibus-cangjie could predict the next/next few words while users are typing.
There are many free Chinese vocabulary lists on the Web, licensed under CC-BY-SA or BSD. You can find them in the link below. https://chromium.googlesource.com/chromium/deps/icu46/+/e49b610806e6ba6063384ffd7f45d5b7cd561e65/source/data/brkitr/README.chromium
You can also use the pre-built list from the Chromium team, which combines all the lists in the above link and is licensed under an MIT-like license. https://chromium.googlesource.com/chromium/deps/icu46/+/e49b610806e6ba6063384ffd7f45d5b7cd561e65/source/data/brkitr/cjdict.txt ...or an updated version of the combined list by Unicode https://github.com/unicode-org/icu/blob/master/icu4c/source/data/brkitr/dictionaries/cjdict.txt
The Android Pinyin IME repo also contains a vocabulary list (simplified Chinese only) https://android.googlesource.com/platform/packages/inputmethods/PinyinIME/+/refs/heads/master/jni/data/rawdict_utf16_65105_freq.txt
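To make the request a bit more concrete, here is a rough sketch of how a frequency-annotated vocabulary list could drive next-character suggestions. It assumes each line holds a word and an integer frequency separated by a tab; the real files linked above may differ and would need their own parsing. The function names (`parse_word_list`, `build_index`, `predict_next`) and the sample data are made up for illustration and are not part of ibus-cangjie.

```python
from collections import defaultdict


def parse_word_list(lines):
    """Parse 'word<TAB>frequency' lines into a dict (assumed format).

    Blank lines, '#' comments, and lines without an integer frequency
    are skipped.
    """
    freqs = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        word, _, freq = line.partition("\t")
        try:
            freqs[word] = int(freq)
        except ValueError:
            continue
    return freqs


def build_index(freqs):
    """Map each first character to its continuations, most frequent first."""
    index = defaultdict(list)
    for word, freq in freqs.items():
        if len(word) > 1:
            index[word[0]].append((word[1:], freq))
    for continuations in index.values():
        continuations.sort(key=lambda item: -item[1])
    return dict(index)


def predict_next(index, last_char, limit=5):
    """Suggest what could follow the character the user just committed."""
    return [cont for cont, _ in index.get(last_char, [])[:limit]]


# Hypothetical sample data, not taken from any of the linked lists.
sample = ["# sample", "香港\t500", "香蕉\t120", "香水\t300", "港口\t80"]
index = build_index(parse_word_list(sample))
print(predict_next(index, "香"))
```

This only looks at the last committed character; a real implementation would probably also want to match longer prefixes and learn from what the user actually picks.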