Open t-tk opened 5 years ago
I suggest mapping the following CIDs to the code points used by UniJIS-UTF16 instead of U+FFFD.
CID | CID hex | Adobe-Japan1-UCS2 | UniJIS-UTFxx |
---|---|---|---|
15429 | 0x3c45 | 0xfffd | U+9FBD 龽 Unicode5.1(2008) |
15431 | 0x3c47 | 0xfffd | U+9FBC 龼 Unicode5.1(2008) |
15434 | 0x3c4a | 0xfffd | U+9FBE 龾 Unicode5.1(2008) |
20068 | 0x4e64 | 0xfffd | U+9FBF 龿 Unicode5.1(2008) |
20069 | 0x4e65 | 0xfffd | U+9FC0 鿀 Unicode5.1(2008) |
20070 | 0x4e66 | 0xfffd | U+9FC1 鿁 Unicode5.1(2008) |
20071 | 0x4e67 | 0xfffd | U+9FC2 鿂 Unicode5.1(2008) |
20957 | 0x51dd | 0xfffd | U+26BD ⚽ Unicode5.2(2009) |
20958 | 0x51de | 0xfffd | U+27BF ➿ Unicode6.0(2010) |
I suggest that CID 12097 map to U+26BE ⚾ instead of the two-character sequence "野球".
CID | CID hex | Adobe-Japan1-UCS2 | UniJIS-UTFxx |
---|---|---|---|
12097 | 0x2f41 | 0x91ce,0x7403 野球 | U+26BE ⚾ Unicode5.2(2009) |
I suggest mapping the following CIDs to code points in "CJK Unified Ideographs" (including Extension A), "CJK Unified Ideographs Extension B", or "CJK Compatibility Ideographs" instead of U+2E80..2EFF "CJK Radicals Supplement", following the JIS X 0213 mapping.
CID | CID hex | Adobe-Japan1-UCS2 | UniJIS-UTFxx |
---|---|---|---|
13646 | 0x354e | 0x2ebd ⺽ | U+2ebd,U+26951 |
13849 | 0x3619 | 0x2ede ⻞ | U+2ede,U+2967f |
13852 | 0x361c | 0x2e97 ⺗ | U+2e97,U+38fa |
13898 | 0x364a | 0x2eca ⻊ | U+2eca,U+27fb7 |
13995 | 0x36ab | 0x2eaa ⺪ | U+2eaa,U+24d14 |
14105 | 0x3719 | 0x2e87 ⺇ | U+2e87,U+20628 |
14198 | 0x3776 | 0x2ebf ⺿ | U+2ebf,U+fa5e |
14199 | 0x3777 | 0x2ec0 ⻀ | U+2ec0,U+fa5d |
15398 | 0x3c26 | 0x2ea4 ⺤ | U+2ea4,U+fa49 |
15403 | 0x3c2b | 0x2ecc ⻌ | U+2ecc,U+fa66 |
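
As a rough cross-check of the table above, the sketch below classifies each code point by Unicode block. The block ranges are the standard ones; the `ROWS` list and the `block_name` helper are illustrative names introduced here, not part of any Adobe resource.

```python
# (CID, current radical code point, second code point listed by UniJIS-UTFxx),
# copied from the table above.
ROWS = [
    (13646, 0x2EBD, 0x26951),
    (13849, 0x2EDE, 0x2967F),
    (13852, 0x2E97, 0x38FA),
    (13898, 0x2ECA, 0x27FB7),
    (13995, 0x2EAA, 0x24D14),
    (14105, 0x2E87, 0x20628),
    (14198, 0x2EBF, 0xFA5E),
    (14199, 0x2EC0, 0xFA5D),
    (15398, 0x2EA4, 0xFA49),
    (15403, 0x2ECC, 0xFA66),
]

# Only the blocks that matter for this table; anything else is reported as "other".
BLOCKS = [
    (0x2E80, 0x2EFF, "CJK Radicals Supplement"),
    (0x3400, 0x4DBF, "CJK Unified Ideographs Extension A"),
    (0x4E00, 0x9FFF, "CJK Unified Ideographs"),
    (0xF900, 0xFAFF, "CJK Compatibility Ideographs"),
    (0x20000, 0x2A6DF, "CJK Unified Ideographs Extension B"),
]


def block_name(cp: int) -> str:
    """Return the name of the Unicode block containing cp (hypothetical helper)."""
    for first, last, name in BLOCKS:
        if first <= cp <= last:
            return name
    return "other"


for cid, radical, ideograph in ROWS:
    print(f"CID {cid}: U+{radical:04X} ({block_name(radical)}) -> "
          f"U+{ideograph:04X} ({block_name(ideograph)})")
```

Every current mapping falls in the radicals block, while none of the suggested targets does.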
The Unihan database at Unicode.org provides mappings between UCS code points and Adobe-Japan1 CIDs, including variants. Based on those variant mappings, I suggest the following mappings. In particular, the Adobe-Japan1-UCS2 mappings for CID 21558, 21722, and 21933 do not seem appropriate.
(2019-05-12) Added CID 7710 and 7880.
CID | CID hex | Adobe-Japan1-UCS2 | Unihan database |
---|---|---|---|
14258 | 0x37b2 | 0x975c 靜 | U+9759 静 C+2665+174.8.6 V+14258+174.8.6 |
21558 | 0x5436 | 0x65df 旟 | U+609E 悞 C+14541+61.3.7 V+21558+61.3.7 |
21722 | 0x54da | 0x69fe 槾 | U+66B5 暵 C+17755+72.4.11 V+21722+72.4.11 |
21933 | 0x55ad | 0x74d8 瓘 | U+7152 煒 C+14762+86.4.9 C+14762+178.9.4 V+21933+86.4.10 V+21933+178.10.4 |
7710 | 0x1e1e | 0x976d 靭 | U+9771 靱 C+7152+177.9.3 V+7710+177.9.3 V+13624+177.9.3 |
7880 | 0x1ec8 | 0x9771 靱 | U+976d 靭 C+2591+177.9.3 V+7880+177.9.3 V+7971+177.9.3 |
I suggest mapping the following CIDs to the code points used by the latest UniJIS-UTFxx:
CID | CID hex | Adobe-Japan1-UCS2 | UniJIS-UTFxx |
---|---|---|---|
13691 | 0x357b | 0x7f36 缶 | U+26222 𦈢 ExtB Unicode3.1(2001) |
13782 | 0x35d6 | 0x5ea7 座 | U+2b777 𫝷 ExtD Unicode6.0(2010) |
14061 | 0x36ed | 0x9580 門 | U+95e8 门 Unicode1.1(1993) |
14145 | 0x3741 | 0x5be8 寨 | U+2a9e6 𪧦 ExtC Unicode5.2(2009) |
14174 | 0x375e | 0x7c14 簔 | U+2b7bd 𫞽 ExtD Unicode6.0(2010) |
14188 | 0x376c | 0x7f51 网 | U+2053f 𠔿 ExtB Unicode3.1(2001) |
14189 | 0x376d | 0x7f51 网 | U+2626a 𦉪 ExtB Unicode3.1(2001) |
14278 | 0x37c6 | 0x9f63 齣 | U+2b81a 𫠚 ExtD Unicode6.0(2010) |
20088 | 0x4e78 | 0x5354 協 | U+2b753 𫝓 ExtD Unicode6.0(2010) |
20096 | 0x4e80 | 0x56c0 囀 | U+2b75a 𫝚 ExtD Unicode6.0(2010) |
20097 | 0x4e81 | 0x56c3 囃 | U+2b75c 𫝜 ExtD Unicode6.0(2010) |
20125 | 0x4e9d | 0x64a5 撥 | U+2b77c 𫝼 ExtD Unicode6.0(2010) |
20128 | 0x4ea0 | 0x655d 敝 | U+207c8 𠟈 ExtB Unicode3.1(2001) |
20141 | 0x4ead | 0x66dc 曜 | U+2b782 𫞂 ExtD Unicode6.0(2010) |
20149 | 0x4eb5 | 0x6a05 樅 | U+2b78b 𫞋 ExtD Unicode6.0(2010) |
20153 | 0x4eb9 | 0x6bb1 殱 | U+2b794 𫞔 ExtD Unicode6.0(2010) |
20156 | 0x4ebc | 0x6dbc 涼 | U+9fcc 鿌 Unicode6.1(2012) |
20176 | 0x4ed0 | 0x75d9 痙 | U+2b7ac 𫞬 ExtD Unicode6.0(2010) |
20180 | 0x4ed4 | 0x76c8 盈 | U+2b7af 𫞯 ExtD Unicode6.0(2010) |
20194 | 0x4ee2 | 0x8077 職 | U+2b7c9 𫟉 ExtD Unicode6.0(2010) |
20204 | 0x4eec | 0x8449 葉 | U+2b7d2 𫟒 ExtD Unicode6.0(2010) |
20247 | 0x4f17 | 0x990a 養 | U+2b765 𫝥 ExtD Unicode6.0(2010) |
20256 | 0x4f20 | 0x9c75 鱵 | U+2b80d 𫠍 ExtD Unicode6.0(2010) |
20260 | 0x4f24 | 0x9e78 鹸 | U+2b817 𫠗 ExtD Unicode6.0(2010) |
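
Many of the suggested targets above are outside the BMP (Extensions B, C, and D). Assuming the mapping file uses `bfchar`-style entries whose destination is UTF-16BE, as ToUnicode CMaps do, those code points have to be written as surrogate pairs. The sketch below shows the encoding; `bfchar_line` is a hypothetical helper, and the exact layout of the real resource may differ.

```python
from typing import Sequence


def bfchar_line(cid: int, code_points: Sequence[int]) -> str:
    """Format one '<cid> <utf16be>' entry (illustrative layout only)."""
    text = "".join(chr(cp) for cp in code_points)
    # Python emits surrogate pairs automatically for code points above U+FFFF.
    destination = text.encode("utf-16-be").hex()
    return f"<{cid:04x}> <{destination}>"


# BMP target: a single 16-bit unit.
print(bfchar_line(20156, [0x9FCC]))
# Extension B target: a surrogate pair.
print(bfchar_line(13691, [0x26222]))
# Multi-character target such as the current mapping of CID 12097.
print(bfchar_line(12097, [0x91CE, 0x7403]))
```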
I suggest the following CID mappings.
CID | CID hex | Adobe-Japan1-UCS2 | UniJIS-UTFxx-H | Suggested |
---|---|---|---|---|
8312 | 0x2078 | 0x2193,0x2191 ↓↑ | U+21F5 ⇵ Unicode3.2(2002) | U+21F5 ⇵ |
12209 | 0x2fb1 | 0x21c5 ⇅ | U+2b83 ⮃ Unicode7.0(2014) | U+21F5 ⇵ |
I suggest mapping the following CIDs to the code points used by UniJIS-UTFxx instead of sequences of ASCII digits. Because U+277F and U+24EB..24F4 are included in Unicode 3.2 and JIS X 0213, I expect most recent Japanese environments to support those code points.
CID | CID hex | Adobe-Japan1-UCS2 | UniJIS-UTFxx |
---|---|---|---|
10514 | 0x2912 | 0x0031,0x0030 10 | U+277F ❿ Unicode1.1(1993) |
10515 | 0x2913 | 0x0031,0x0031 11 | U+24EB ⓫ Unicode3.2(2002) |
10516 | 0x2914 | 0x0031,0x0032 12 | U+24EC ⓬ Unicode3.2(2002) |
10517 | 0x2915 | 0x0031,0x0033 13 | U+24ED ⓭ Unicode3.2(2002) |
10518 | 0x2916 | 0x0031,0x0034 14 | U+24EE ⓮ Unicode3.2(2002) |
10519 | 0x2917 | 0x0031,0x0035 15 | U+24EF ⓯ Unicode3.2(2002) |
10520 | 0x2918 | 0x0031,0x0036 16 | U+24F0 ⓰ Unicode3.2(2002) |
10521 | 0x2919 | 0x0031,0x0037 17 | U+24F1 ⓱ Unicode3.2(2002) |
10522 | 0x291a | 0x0031,0x0038 18 | U+24F2 ⓲ Unicode3.2(2002) |
10523 | 0x291b | 0x0031,0x0039 19 | U+24F3 ⓳ Unicode3.2(2002) |
10524 | 0x291c | 0x0032,0x0030 20 | U+24F4 ⓴ Unicode3.2(2002) |
I suggest mapping the following CIDs to the code points used by the latest UniJIS-UTFxx. I guess the current Adobe-Japan1-UCS2 mapping targets a variant, but the latest UniJIS-UTFxx mapping fits the glyph better.
CID | CID hex | Adobe-Japan1-UCS2 | UniJIS-UTFxx |
---|---|---|---|
13651 | 0x3553 | U+885e 衞 | U+2b7d8 𫟘 ExtD Unicode6.0(2010) |
13695 | 0x357f | U+8218 舘 | U+fa6d 舘 Unicode5.2(2009) |
13724 | 0x359c | U+2363a 𣘺 | U+2b78e 𫞎 ExtD Unicode6.0(2010) |
13740 | 0x35ac | U+6075 恵 | U+fa6b 恵 Unicode5.2(2009) |
13780 | 0x35d4 | U+4eca 今 | U+2b746 𫝆 ExtD Unicode6.0(2010) |
13866 | 0x362a | U+52e2 勢 | U+2b751 𫝑 ExtD Unicode6.0(2010) |
14064 | 0x36f0 | U+687a 桺 | U+2b789 𫞉 ExtD Unicode6.0(2010) |
14089 | 0x3709 | U+6881 梁 | U+9fc4 鿄 Unicode5.2(2009) |
14168 | 0x3758 | U+7953 祓 | U+9fc6 鿆 Unicode5.2(2009) |
14281 | 0x37c9 | U+242ee 𤋮 | U+fa6c 𤋮 Unicode5.2(2009) |
20114 | 0x4e92 | U+5ea7 座 | U+2b776 𫝶 ExtD Unicode6.0(2010) |
20201 | 0x4ee9 | U+83df 菟 | U+2b7cf 𫟏 ExtD Unicode6.0(2010) |
20240 | 0x4f10 | U+943a 鐺 | U+2b7f0 𫟰 ExtD Unicode6.0(2010) |
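
One caveat for the three CJK Compatibility Ideographs in this table (U+FA6B, U+FA6C, U+FA6D): Unicode normalization folds them back to the code points in the Adobe-Japan1-UCS2 column. The sketch below checks this with Python's standard `unicodedata` module; the `COMPAT_ROWS` name is introduced here for illustration.

```python
import unicodedata

# (CID, current Adobe-Japan1-UCS2 code point, suggested compatibility ideograph),
# copied from the table above.
COMPAT_ROWS = [
    (13695, 0x8218, 0xFA6D),
    (13740, 0x6075, 0xFA6B),
    (14281, 0x242EE, 0xFA6C),
]

for cid, current, suggested in COMPAT_ROWS:
    normalized = unicodedata.normalize("NFC", chr(suggested))
    same = normalized == chr(current)
    print(f"CID {cid}: NFC(U+{suggested:04X}) = U+{ord(normalized):04X}, "
          f"equals current mapping: {same}")
```

So a consumer that normalizes extracted text would see the same result under either mapping for these three CIDs.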
@t-tk Today’s CJK Type Blog article, entitled To UVS, Or Not To UVS, should help you to better understand my plans for updating the Adobe-Japan1-UCS2 ToUnicode mapping file.
The latest version of the Adobe-Japan1-UCS2 ToUnicode mapping resources addresses everything in that issue.
I suggest that CID 7903 and CID 7904 map to U+FF3B and U+FF3D (［］) instead of U+FE47 and U+FE48 (﹇﹈).
I suggest mapping CID 10244..10262, the circled numbers 32..50 (㉜..㊿), to U+325C..325F and U+32B1..32BF instead of sequences of ASCII digits. Because these code points are included in Unicode 3.2 and JIS X 0213, I expect most recent Japanese environments to support them.
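
The circled numbers are not contiguous in Unicode: 21..35 sit at U+3251..325F and 36..50 at U+32B1..32BF. A minimal sketch of the resulting CID-to-code-point table follows, assuming CID 10244..10262 correspond to 32..50 in ascending order; the function and variable names are illustrative only.

```python
import unicodedata


def circled_number_code_point(n: int) -> int:
    """Return the code point of circled number n for 21 <= n <= 50."""
    if 21 <= n <= 35:
        return 0x3251 + (n - 21)
    if 36 <= n <= 50:
        return 0x32B1 + (n - 36)
    raise ValueError(f"no code point handled here for circled number {n}")


FIRST_CID = 10244   # assumed to be circled number 32
FIRST_NUMBER = 32

# CID -> suggested code point for the 19 glyphs 32..50.
suggested = {
    FIRST_CID + i: circled_number_code_point(FIRST_NUMBER + i)
    for i in range(19)
}

for cid, cp in suggested.items():
    # unicodedata.numeric confirms that each code point carries the expected value.
    value = int(unicodedata.numeric(chr(cp)))
    print(f"CID {cid} -> U+{cp:04X} {chr(cp)} (numeric value {value})")
```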
I suggest that CID 394 and 395 (halfwidth katakana glyphs) map to U+30F5..30F6 (ヵヶ) instead of U+30AB and U+30B1 (カケ).