Pin1yin1 / pin1yin1

Source code for the Pin1yin1.com Pinyin Converter website
MIT License

癿 missing #19

Open katpatuka opened 8 years ago

katpatuka commented 8 years ago

癿 (qié) found in zh.wiki 癿扎乡 and 癿藏镇.

fifieldt commented 8 years ago

The character is in the Unicode data: http://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=U%2B767F

Unihan_Readings.txt:U+767F kHanyuPinyin 42643.030:qié

However, it is not in the CEDICT data, so there are no translations for it.

As a start, adding 癿扎乡 and 癿藏镇 to CEDICT:

https://cc-cedict.org/editor/editor.php

would be helpful!

katpatuka commented 8 years ago

If only I could speak and write Chinese! ;)

fifieldt commented 8 years ago

:)

Indeed! I think that even though there are no translations, the pinyin should still show. Looking into why it isn't showing up.

fifieldt commented 8 years ago

Found another qie that isn't showing up: 㚗

fifieldt commented 8 years ago

Confirmed /convert/?c=癿 is not returning anything in the pinyin array, so the problem isn't with the JavaScript/display but with what's happening in the service.

fifieldt commented 8 years ago

Yup, so there's nothing in the database for it:

mysql> SELECT definitions.* FROM definitions WHERE (characters_simplified like '癿%' or characters_traditional like '癿%') ORDER BY length(characters_simplified) desc, "primary" desc
    -> ;
Empty set (0.00 sec)

fifieldt commented 8 years ago

OK, just re-read the data import scripts.

Basically, unless there's a definition for a character in CEDICT, no data is added to the database for it.

To always display at least the pinyin for every character, we should change the import script to add a blank definition for characters that don't have one, using the data in unihan_readings.
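
A rough sketch of that fallback, in Python purely for illustration (this is not the project's actual import script). It assumes the kHanyuPinyin line format quoted earlier in this thread. The function names and the `pinyin` and `definition` keys are hypothetical; only `characters_simplified` and `characters_traditional` come from the query above.

```python
import re

# Matches a Unihan_Readings.txt entry such as:
#   U+767F  kHanyuPinyin  42643.030:qié
# (fields are tab-separated in the real file; \s+ also accepts spaces)
LINE_RE = re.compile(r"^U\+([0-9A-F]{4,6})\s+kHanyuPinyin\s+(\S.*)$")


def parse_hanyu_pinyin(lines):
    """Map each character to its kHanyuPinyin readings, in file order."""
    readings = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # comments, blank lines, and other Unihan fields
        char = chr(int(match.group(1), 16))
        found = []
        # The value is one or more "locations:reading[,reading...]" entries.
        for entry in match.group(2).split():
            _, _, pinyin_part = entry.partition(":")
            for pinyin in pinyin_part.split(","):
                if pinyin and pinyin not in found:
                    found.append(pinyin)
        if found:
            readings[char] = found
    return readings


def blank_definitions(readings, defined_chars):
    """Build placeholder rows for characters that have a reading but no
    CEDICT definition. Keys other than the two characters_* ones are
    hypothetical column names."""
    rows = []
    for char, pinyins in sorted(readings.items()):
        if char in defined_chars:
            continue  # CEDICT already supplied a real definition
        rows.append({
            "characters_simplified": char,
            "characters_traditional": char,
            "pinyin": pinyins[0],  # assumption: use the first listed reading
            "definition": "",      # blank, so only the pinyin is displayed
        })
    return rows


if __name__ == "__main__":
    sample = ["U+767F\tkHanyuPinyin\t42643.030:qié"]
    for row in blank_definitions(parse_hanyu_pinyin(sample), defined_chars=set()):
        print(row)
```

Taking the first listed reading is only a placeholder choice; characters with several readings would need a better rule.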