-
I just extracted all of the data under Chinese and noticed that many entries lack pronunciation/`sounds` information. In some cases, when a pronunciation is listed, the information containing the origin/langu…
-
```
北京理工大學 北京理工大学 [Bei3 jing1_Li3 Gong1_Da4 xue2] /Beijing Institute of Technology/Institute of Technology, Beijing/
```
Should it be `Bei3 jing1_Li3_Gong1_Da4 xue2`? Otherwise it will be rendere…
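For anyone who wants to check entries like this in bulk, here is a minimal Python sketch that splits a line in the usual CC-CEDICT layout (`TRADITIONAL SIMPLIFIED [pinyin] /gloss/gloss/`) and compares the syllable count with the character count. The names `ENTRY_RE` and `parse_entry` are made up for illustration. Treating `_` as just another syllable separator is an assumption on my part, since the exact meaning of the underscore in this dataset is what is being asked above.

```python
import re

# Usual CC-CEDICT layout: TRADITIONAL SIMPLIFIED [pinyin] /gloss/gloss/
ENTRY_RE = re.compile(r"^(\S+) (\S+) \[([^\]]*)\] /(.*)/$")

def parse_entry(line):
    """Split one dictionary line into its fields, or return None if it does not match."""
    m = ENTRY_RE.match(line.strip())
    if m is None:
        return None
    trad, simp, pinyin, glosses = m.groups()
    return {
        "traditional": trad,
        "simplified": simp,
        # Assumption: both space and underscore separate syllables.
        "syllables": [s for s in re.split(r"[ _]", pinyin) if s],
        "glosses": glosses.split("/"),
    }

line = ("北京理工大學 北京理工大学 [Bei3 jing1_Li3 Gong1_Da4 xue2] "
        "/Beijing Institute of Technology/Institute of Technology, Beijing/")
entry = parse_entry(line)

# One syllable per character is expected for an all-hanzi headword.
syllables_match = len(entry["syllables"]) == len(entry["traditional"])
```

This only confirms that the syllables line up with the characters. It does not decide where the word breaks (`_`) should go.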
-
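**Q & A:**
Today I installed the [Windows build of Jyutping from rime-cantonese/releases](https://github.com/rime/rime-cantonese/releases)
and found that the built-in Jyutping is the rime-jyutping schema, not the rime-cantonese schema, so typing behaves slightly differently.
Do I have to download the zip file again, extract it, and overwrite the files?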
-
A thing I'm noticing is that the "senses" and "etymologies" entries are stored separately. This seems a bit weird: Wiktionary organizes senses by etymology, so should it not instead list the etymology…
-
Would you consider adding additional romanization schemes, or accepting a pull request for the feature?
-
It looks like some romanization is cut off in this dataset, for example:
```
*只手仔 [dək
扭只脚 [niu
```
It should be (this was included in his dictionary):
```
*只手仔 [dək33ə33 dziak33 siu55 dɔi…
-
List of words that may have wrong tones, pinyin, or casing:
```
阿佛洛狄忒 阿佛洛狄忒 A1 fo2 luo4 di2 te4
阿克塞縣 阿克塞县 A1 ke4 sai1_Xian4
阿里地區 阿里地区 A1 li3 de5_Qu1
阿塞拜疆 阿塞拜疆 A1 se4 bai4 jia…
-
Do you have any dictionaries with Sidney Lau or Yale romanization?
-
The standard analyzer in Lucene is not exactly Unicode-friendly with regard to breaking text into words, especially with respect to non-alphabetic scripts. This is because it is unaware of Unicode b…
-
Obviously not a high priority for now, but this should be fairly straightforward, assuming appropriate dictionaries could be found. Latin-based languages are just delimited by spaces, although it may …