Kangxi dictionary converted to a text file, for use in Pleco and other dictionary apps. (re-uploaded 2020-06-29)
Slight formatting inconsistencies and error at Line 11791 (歹) #1

Wonderful project~ I'm in the process of gathering data from this database!

In this entry, an enumeration comma (、) appears after every character and punctuation mark in the 〔古文〕 field. This may be an error. See below:

歹 《康熙字典》〈辰集下〉【歹字部】頁578第15 〔古文〕:𡰮、。、同、𣦵、,、俗、省、。、本、作、𣦵、,、隷、作、歺、。、>【俗書正誤】歹,音遏。【長箋】今誤讀等在切,爲好字之反。𣦵字原从卜从冂作。
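A rough way to look for other entries with the same problem is sketched below. It assumes that a punctuation mark sitting alone between two enumeration commas (for example `、。、`) is a sign of this corruption; that heuristic is a guess based on this one entry, and the sample lines are shortened for illustration.

```python
import re

# Assumption: a lone punctuation mark between two enumeration commas
# (e.g. "、。、") indicates the corruption seen in the 歹 entry.
SUSPECT = re.compile(r"、[。,;:]、")


def find_suspect_lines(lines):
    """Return (line_number, line) pairs for lines matching the heuristic."""
    return [
        (number, line)
        for number, line in enumerate(lines, start=1)
        if SUSPECT.search(line)
    ]


# Shortened sample lines, for illustration only.
sample = [
    "劉\t\t〔古文〕:鎦、𠭱【唐韻】力求切,音留。",
    "歹\t\t〔古文〕:𡰮、。、同、𣦵、,、俗、省、。、本、作",
]
hits = find_suspect_lines(sample)
for number, line in hits:
    print(number, line)
```

To run it against the real file, pass the lines read from the text file (opened with `encoding="utf-8"`) instead of `sample`.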

There are some thick right brackets (】) that are followed by a space. These spaces should be deleted for consistency. In fact, all spaces should be deleted. For example, all titles with spaces in between, such as 【禮 曲禮】 〈補遺 巳集〉, should be formatted with a · like so: 【禮·曲禮】 〈補遺·巳集〉

Thanks for pointing that out. The file was converted from a StarDict file using regex (original StarDict file here, not my work: https://simonwiles.net/projects/kangxi-zidian/). Looking at the original file, it seems the database itself was corrupted. You can view the original scanned data here: http://www.kangxizidian.com/kangxi/0578.gif. You can try alternate databases such as:

For now I've manually edited and re-uploaded the file as kangxizidian-v3f.txt, correcting the error you pointed out. Let me know if you find any other errors! Thank you for your help.

I'll look into the formatting issue, but keep in mind that the file is designed for Pleco use. I think adding the · character would be a little more difficult with a simple regex. Hope the other databases are also of use to your research!
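One possible approach, sketched here in Python rather than as a single search-and-replace pattern, is to match each bracketed title and rewrite the spaces inside it with a replacement function. The function name `dot_titles` is made up for this example, and whether Pleco displays the result correctly has not been checked.

```python
import re

# Matches one 【…】 or 〈…〉 title without nested brackets.
TITLE = re.compile(r"【[^【】]*】|〈[^〈〉]*〉")


def dot_titles(text):
    """Replace spaces inside titles with · and drop spaces after 】."""

    def fix(match):
        title = match.group(0)
        opening, inner, closing = title[0], title[1:-1], title[-1]
        # Collapse each run of ASCII or ideographic spaces into one ·
        inner = re.sub(r"[ \u3000]+", "·", inner.strip(" \u3000"))
        return opening + inner + closing

    text = TITLE.sub(fix, text)
    # Remove the stray spaces that follow a closing 】
    return re.sub(r"】[ \u3000]+", "】", text)


print(dot_titles("【禮 曲禮】 〈補遺 巳集〉"))
```

Spaces outside the brackets (other than those directly after 】) and the tab characters that separate the headword from the definition are left untouched in this sketch.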

I've also noticed many instances of an unknown character, displayed as: ?

A quick Notepad++ search returns 11 results (a duplicate means there are 2 in that entry):

劉 | 劉 | 恖 | 恖 | 洭 | 洭 | 謧 | 贔 | 贔 | 𥒯 | 𧦮

Could you clarify what you mean by the duplicates or unknown character? I only see one result when I search for (using 2 tab characters after the character 劉). I get this result:

劉       《康熙字典》〈子集下〉【刀字部】頁144第39    〔古文〕:鎦、𠭱【唐韻】【集韻】【韻會】【正韻】𠀤力求切,音留。【說文】殺也。【書·盤庚】重我民,無盡劉。【詩·周頌】勝殷遏劉。【左傳·成十三年】䖍劉我邊陲。又【爾雅·釋詁】劉,𨻰也。【疏】謂敷𨻰也。又【爾雅·釋詁】劉,㬥樂也。【疏】木枝葉稀疎不均爲㬥樂。【詩·大雅】捋采其劉。【毛傳】劉,爆爍而希也。又【爾雅·釋木】劉,劉杙。【註】劉子生山中。【疏】劉一名劉杙,其子可食。又姓。【韻會】凡二十五望,𠀤自陶唐氏劉累之後。又【集韻】力九切,留上聲。好也。又【集韻】龍珠切,音鏤。殺也。漢禮,立秋有貙劉。又【同文備考】作??。

Also, if you're on desktop, try installing the fonts (see the README.md file). On Windows you can usually drag them into C:\Windows\Fonts, or simply double-click the downloaded TTF files and press Install. This might help with displaying unknown characters. If not, feel free to let me know, as it might be an error in the database.

These are the fonts: