-
## Current State
I believe the built-in tokenisation for Chinese has room for improvement. Here are two real-life cases:
```r
library(quanteda)
txt
```
-
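In fake-ip mode, after the network has been disconnected for a while and then reconnected, Google will not open.
The browser shows "Your connection is not private" or "This site can't be reached". Reconnecting to the network once more restores normal access.
What could be causing this?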
-
Would it be possible to add Cantonese support using [CC-Canto](https://cantonese.org/download.html), including Jyutping? I don't know if it'd be better to have both in the same extension, or separ…
-
Hi,
Is it possible to use your repository as a Jyutping dictionary in Python? If so, are there any instructions on how to go about this?
Many thanks,
Tim
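
A minimal sketch of one way this could work, assuming the data is available as CC-CEDICT-style lines with the Jyutping in curly braces (`traditional simplified [pinyin] {jyutping} /gloss/`), which is the layout CC-Canto uses. The repository's actual file format may differ, and `parse_line` and `build_jyutping_dict` are hypothetical helper names, not part of any library.

```python
import re

# Assumed line layout: traditional simplified [pinyin] {jyutping} /gloss 1/gloss 2/
LINE_RE = re.compile(
    r"^(?P<trad>\S+)\s+(?P<simp>\S+)\s+"
    r"\[(?P<pinyin>[^\]]*)\]\s+"
    r"\{(?P<jyutping>[^}]*)\}"
    r"(?:\s+/(?P<glosses>.*)/)?\s*$"
)

def parse_line(line):
    """Return a dict for one entry, or None for comments and unparseable lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    m = LINE_RE.match(line)
    if m is None:
        return None
    glosses = m.group("glosses")
    return {
        "traditional": m.group("trad"),
        "simplified": m.group("simp"),
        "pinyin": m.group("pinyin"),
        "jyutping": m.group("jyutping"),
        "glosses": glosses.split("/") if glosses else [],
    }

def build_jyutping_dict(lines):
    """Map each traditional-form headword to a list of its Jyutping readings."""
    readings = {}
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue
        bucket = readings.setdefault(entry["traditional"], [])
        if entry["jyutping"] not in bucket:
            bucket.append(entry["jyutping"])
    return readings

# Illustrative sample lines written for this sketch, not copied from a real file.
sample = [
    "# comment line",
    "你好 你好 [ni3 hao3] {nei5 hou2} /hello/hi/",
]
print(build_jyutping_dict(sample))
```

In practice `lines` would be an open file handle (opened with `encoding="utf-8"`) rather than the in-memory sample.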
-
In AnkiDroid, when a field is removed from a note type, all the references to that field are automatically removed from the card templates (e.g. {{MYFIELD}} is removed).
However, if I use the [cont…
-
Using the kaikki.org all-language dump dated 2021-05-08, over 2,700 translation senses contain only the string "Translations" instead of an actual sense/definition or a null value.
Code used to create a tabl…
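
A check of this kind can be sketched roughly as follows, assuming the dump is JSON Lines with one entry per line and a `translations` list whose items carry a `sense` string. Those field names and the `count_placeholder_senses` helper are assumptions for illustration and should be verified against the dump itself.

```python
import json
from collections import Counter

def count_placeholder_senses(lines, placeholder="Translations"):
    """Count translation items whose sense is only the placeholder string.

    Counts are grouped by the entry's "lang" value (assumed field name).
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        # Assumed layout: entry["translations"] is a list of dicts with a "sense" key.
        for tr in entry.get("translations", []):
            if tr.get("sense") == placeholder:
                counts[entry.get("lang", "unknown")] += 1
    return counts

# Tiny made-up sample standing in for the real dump.
sample = [
    json.dumps({"word": "a", "lang": "English",
                "translations": [{"word": "x", "sense": "Translations"},
                                 {"word": "y", "sense": "first letter"}]}),
    json.dumps({"word": "b", "lang": "English", "translations": []}),
]
print(count_placeholder_senses(sample))
```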
-
It seems that https://github.com/CanCLID/rime-cantonese-schemes/tree/master/%E8%80%B6%E9%AD%AF%E6%8B%BC%E9%9F%B3%E6%96%B9%E6%A1%88 and https://github.com/CanCLID/rime-cantonese-schemes/tree/master/%E6…
-
I really appreciate the efforts of the project.
Typing has to be very precise to get the Chinese characters, whereas Google's implementation is more lenient and can guess the words that the user is try…
-
There are multiple issues with the Cantonese data in [listss18.txt](https://github.com/clld/asjp-data/blob/master/listss18.txt), including several errors and some questionable vocabulary choices. I'm …
-
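# Problem summary
* The 八股文 (essay) word list uses character forms that differ from Hong Kong usage (for example, 「爲」 rather than 「為」). See: [Rime documentation](https://github.com/rime/home/wiki/RimeWithSchemata#八股文)
* Some variant forms, such as 「靑」 (with 「円」 at the bottom), 「户」 (whose first stroke is 「丶」), and 「温」 (with 「日」 rather than 「囚」 at the upper right), are not included in the word list; the current approach is to use the simplifier to handle regional differences (…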