Cannot look up the character「冬」

ksanaforge / kangxizidian

The Kangxi Dictionary

GNU General Public License v3.0

Cannot look up the character「冬」 #52

Closed. kunki closed this issue 10 years ago.

kunki commented 10 years ago

yapcheahshen commented 10 years ago

The author of the XML uses "冬" (U+2F81A) instead of U+51AC; see https://github.com/ksanaforge/kangxizidian/blob/master/xml/kx01-01-03.xml line 1739. The partial search is taken from the decomposition index, https://github.com/ksanaforge/kangxizidian/blob/master/xml/decompose_kangxi.js line 19. That is why you can see the glyph on the left side, but nothing is shown on the right side.

kunki commented 10 years ago

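Thank you for the explanation. I had guessed that the headword「冬」follows the old glyph form of the Kangxi Dictionary exactly and therefore uses a code point from the compatibility block. Staying faithful to the original text is commendable in itself, but it makes lookup inconvenient. Could you add a mapping from compatibility-block characters (mostly old glyph forms) to their non-compatibility counterparts in the future? After all, ordinary input methods cannot enter compatibility characters quickly. In particular, compatibility characters such as「冬」(U+2F81A), which sit in the SIP just like Ext-B/C/D, are supported by almost no fonts.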

yapcheahshen commented 10 years ago

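Unicode should have published the unified equivalents of the SIP compatibility characters. Have you used them? Also, I am using Academia Sinica's character structure database. CHISE IDS already supports decomposition up to Ext-D, so I may consider adding it. http://git.chise.org/gitweb/?p=chise/ids.git;a=tree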

kunki commented 10 years ago

The IDS data I use is provided by the Kanji Database (http://kanji-database.sourceforge.net/). Its GitHub location is https://github.com/cjkvi/cjkvi-ids/blob/master/ids.txt, and it already supports decomposition up to Ext-E; you may want to take a look.

Also, I have never heard of the SIP unified equivalents published by Unicode. Could you point me to them?

yapcheahshen commented 10 years ago

Ext-E is out as well. Extending the character set this way never ends.

http://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=2F81A

kCompatibilityVariant U+51AC 冬

kCompatibilityVariant should be a key defined in Unihan.txt.
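As a rough illustration of how such a key could be turned into a lookup table, here is a minimal sketch. It assumes the data arrives as tab-separated lines of the form code point, key, value, as in the Unihan data files. The function name `parseCompatibilityVariants` and the sample text are hypothetical, and only the U+2F81A line reflects the value quoted above.

```javascript
// Build a { compatibilityChar: unifiedChar } table from Unihan-style lines.
// Assumed line format: "U+XXXX<TAB>key<TAB>value"; lines starting with "#" are comments.
function parseCompatibilityVariants(text) {
  const table = {};
  for (const line of text.split(/\r?\n/)) {
    if (line === "" || line.startsWith("#")) continue;
    const [codepoint, key, value] = line.split("\t");
    if (key !== "kCompatibilityVariant") continue;
    // Strip the "U+" prefix and convert the hex digits to a character.
    const from = String.fromCodePoint(parseInt(codepoint.slice(2), 16));
    const to = String.fromCodePoint(parseInt(value.slice(2), 16));
    table[from] = to;
  }
  return table;
}

// Illustrative input: one comment, the pair from this thread, one unrelated key.
const sampleUnihan = [
  "# sample excerpt",
  "U+2F81A\tkCompatibilityVariant\tU+51AC",
  "U+51AC\tkTotalStrokes\t5",
].join("\n");

const parsedVariants = parseCompatibilityVariants(sampleUnihan);
console.log(parsedVariants["\u{2F81A}"] === "\u{51AC}"); // true
```

`String.fromCodePoint` is used rather than `String.fromCharCode` because U+2F81A lies outside the BMP and needs a surrogate pair.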

kunki commented 10 years ago

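I see what you mean now: the official Unicode documentation does define compatibility characters and their corresponding unified characters. My suggestion is to follow this mapping and treat the two as equivalent during lookup, so that「冬」(U+51AC) can also find「冬」(U+2F81A).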
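The idea can be sketched roughly as follows. This is only an illustration, not the project's actual code: the table `compatToUnified`, the helpers `foldCompat` and `lookup`, and the sample headword list are all hypothetical, and the table holds just the one pair discussed in this thread.

```javascript
// Hypothetical variant table: compatibility character -> unified character.
const compatToUnified = new Map([
  ["\u{2F81A}", "\u{51AC}"], // U+2F81A -> U+51AC, the pair from this thread
]);

// Replace every compatibility character in `text` with its unified form.
// Array.from iterates by code point, so SIP characters are not split.
function foldCompat(text) {
  return Array.from(text, (ch) => compatToUnified.get(ch) ?? ch).join("");
}

// Return the headwords that equal `query` once both sides are folded.
function lookup(headwords, query) {
  const key = foldCompat(query);
  return headwords.filter((hw) => foldCompat(hw) === key);
}

// Sample data only; not taken from the XML files.
const sampleHeadwords = ["\u{2F81A}", "\u{51AB}"];
console.log(lookup(sampleHeadwords, "\u{51AC}").length); // 1
console.log(lookup(sampleHeadwords, "\u{2F81A}").length); // 1
```

Folding both the query and the stored headword leaves the XML untouched, so the original code points are preserved while either form can be typed.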

yapcheahshen commented 10 years ago

Added the compatibleVariants lookup table from Unihan.txt: https://github.com/ksanaforge/kangxizidian/blob/master/aura_components/kangxi-widget/variants.json The character「冬」can now be clicked.