adobe-type-tools / mapping-resources-pdf

Mapping Resources for PDF
BSD 3-Clause "New" or "Revised" License

[Adobe-Japan1-UCS2] Suggested changes #6

Open t-tk opened 5 years ago

t-tk commented 5 years ago

I suggest that CID 7903 and CID 7904 map to U+FF3B and U+FF3D ([]), not to U+FE47 and U+FE48 (﹇﹈).

<1edf> <1ee0> <fe47>
CID Adobe-Japan1-UCS2 UniJIS-UTF16-H,V
7899 U+ff08 ( U+fe35,U+ff08v
7900 U+ff09 ) U+fe36,U+ff09v
7901 U+3014 〔 U+fe39,U+3014v
7902 U+3015 〕 U+fe3a,U+3015v
7903 U+fe47 ﹇ U+fe47,U+ff3bv
7904 U+fe48 ﹈ U+fe48,U+ff3dv
7905 U+ff5b { U+fe37,U+ff5bv
7906 U+ff5d } U+fe38,U+ff5dv
7907 U+3008 〈 U+fe3f,U+3008v
7908 U+3009 〉 U+fe40,U+3009v
7909 U+300a 《 U+fe3d,U+300av
7910 U+300b 》 U+fe3e,U+300bv
7911 U+300c 「 U+fe41,U+300cv
7912 U+300d 」 U+fe42,U+300dv
7913 U+300e 『 U+fe43,U+300ev
7914 U+300f 』 U+fe44,U+300fv
7915 U+3010 【 U+fe3b,U+3010v
7916 U+3011 】 U+fe3c,U+3011v
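To make the quoted `<1edf> <1ee0> <fe47>` line easier to read against the table, here is a minimal sketch of how such a range entry expands. It assumes the usual `<first CID> <last CID> <first destination>` layout with a single BMP destination; the helper name `expand_bfrange` is hypothetical and not part of any Adobe tool.

```python
import re

def expand_bfrange(line):
    """Expand one '<lo> <hi> <dst>' range entry into {cid: code_point}.

    Assumes the destination is a single BMP code point.
    """
    lo, hi, dst = (int(h, 16) for h in re.findall(r"<([0-9a-fA-F]+)>", line))
    return {cid: dst + (cid - lo) for cid in range(lo, hi + 1)}

current = expand_bfrange("<1edf> <1ee0> <fe47>")
for cid, cp in current.items():
    print(cid, f"U+{cp:04X}", chr(cp))
```

The two CIDs it prints are the ones in the 7903 and 7904 rows of the table.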

I suggest mapping CID 10244..10262, the circled numbers 32..50 (㉜..㊿), to U+325C..325F and U+32B1..32BF rather than to ASCII digits. Because these code points are included in Unicode 3.2 and JIS X 0213, I expect most recent Japanese environments to support them.
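The circled numbers 32..50 are not one contiguous run in Unicode: 32..35 sit at U+325C..325F and 36..50 at U+32B1..32BF. A minimal sketch of the suggested CID mapping, assuming the CIDs are assigned in numeric order starting at 10244 (the helper name `circled_code_point` is hypothetical):

```python
import unicodedata

FIRST_CID = 10244  # CID of circled 32, per the range given in this comment

def circled_code_point(n):
    """Code point of the circled number n, for 32 <= n <= 50."""
    if 32 <= n <= 35:
        return 0x325C + (n - 32)
    if 36 <= n <= 50:
        return 0x32B1 + (n - 36)
    raise ValueError(f"no circled number handled for {n}")

suggested = {FIRST_CID + (n - 32): circled_code_point(n) for n in range(32, 51)}

# Cross-check each code point against the numeric value in the Unicode database.
for cid, cp in suggested.items():
    assert unicodedata.numeric(chr(cp)) == cid - FIRST_CID + 32
```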

I suggest that CID 394 and CID 395 (halfwidth katakana glyphs) map to U+30F5..30F6 (ヵヶ), not to U+30AB and U+30B1 (カケ).

<018a> <30ab>
<018b> <30b1>
t-tk commented 5 years ago

For the following CIDs, I suggest mapping to the UniJIS-UTF16 code points rather than to U+FFFD.

CID CID hex Adobe-Japan1-UCS2 UniJIS-UTFxx
15429 0x3c45 0xfffd U+9FBD 龽 Unicode5.1(2008)
15431 0x3c47 0xfffd U+9FBC 龼 Unicode5.1(2008)
15434 0x3c4a 0xfffd U+9FBE 龾 Unicode5.1(2008)
20068 0x4e64 0xfffd U+9FBF 龿 Unicode5.1(2008)
20069 0x4e65 0xfffd U+9FC0 鿀 Unicode5.1(2008)
20070 0x4e66 0xfffd U+9FC1 鿁 Unicode5.1(2008)
20071 0x4e67 0xfffd U+9FC2 鿂 Unicode5.1(2008)
20957 0x51dd 0xfffd U+26BD ⚽ Unicode5.2(2009)
20958 0x51de 0xfffd U+27BF ➿ Unicode6.0(2010)
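A minimal sketch of how the rows above could be checked mechanically. The data is copied from the table; the `unmapped` helper is hypothetical and only illustrates the filter "current value is U+FFFD, UniJIS-UTF16 has a candidate".

```python
# (CID, current Adobe-Japan1-UCS2 value, UniJIS-UTF16 value), copied from the table
ROWS = [
    (15429, 0xFFFD, 0x9FBD),
    (15431, 0xFFFD, 0x9FBC),
    (15434, 0xFFFD, 0x9FBE),
    (20068, 0xFFFD, 0x9FBF),
    (20069, 0xFFFD, 0x9FC0),
    (20070, 0xFFFD, 0x9FC1),
    (20071, 0xFFFD, 0x9FC2),
    (20957, 0xFFFD, 0x26BD),
    (20958, 0xFFFD, 0x27BF),
]

def unmapped(rows):
    """Return {cid: candidate} for rows whose current value is U+FFFD."""
    return {cid: cand for cid, cur, cand in rows if cur == 0xFFFD}

for cid, cand in unmapped(ROWS).items():
    # The 4-digit hex form of the CID matches the "CID hex" column.
    print(f"CID {cid} <{cid:04x}>: U+FFFD -> U+{cand:04X} {chr(cand)}")
```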
t-tk commented 5 years ago

I suggest CID 12097 might map to U+26BE ⚾ instead of "野球".

CID CID hex Adobe-Japan1-UCS2 UniJIS-UTFxx
12097 0x2f41 0x91ce,0x7403 野球 U+26BE ⚾ Unicode5.2(2009)
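The difference shows up when text is extracted: the current entry yields two characters for one glyph, while the suggested one yields a single character. A minimal sketch, assuming the destination is stored as UTF-16BE code units written as hex digits (the helper name `decode_destination` is hypothetical):

```python
def decode_destination(hex_units):
    """Decode a ToUnicode destination given as UTF-16BE hex digits."""
    return bytes.fromhex(hex_units).decode("utf-16-be")

current = decode_destination("91ce7403")  # 0x91ce,0x7403 from the table
suggested = decode_destination("26be")    # U+26BE

print(current, len(current))
print(suggested, len(suggested))
```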
t-tk commented 5 years ago

I suggest mapping the following CIDs to code points in "CJK Unified Ideographs", "CJK Unified Ideographs Extension A", "CJK Unified Ideographs Extension B", or "CJK Compatibility Ideographs", following the JIS X 0213 mapping, rather than to U+2E80..2EFF "CJK Radicals Supplement".

CID CID hex Adobe-Japan1-UCS2 UniJIS-UTFxx
13646 0x354e 0x2ebd ⺽ U+2ebd,U+26951
13849 0x3619 0x2ede ⻞ U+2ede,U+2967f
13852 0x361c 0x2e97 ⺗ U+2e97,U+38fa
13898 0x364a 0x2eca ⻊ U+2eca,U+27fb7
13995 0x36ab 0x2eaa ⺪ U+2eaa,U+24d14
14105 0x3719 0x2e87 ⺇ U+2e87,U+20628
14198 0x3776 0x2ebf ⺿ U+2ebf,U+fa5e
14199 0x3777 0x2ec0 ⻀ U+2ec0,U+fa5d
15398 0x3c26 0x2ea4 ⺤ U+2ea4,U+fa49
15403 0x3c2b 0x2ecc ⻌ U+2ecc,U+fa66
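A minimal sketch that classifies both columns of the table by Unicode block, to confirm that every current value is a radical and none of the suggested values is. The block ranges are the standard ones; the `block_of` helper is hypothetical and covers only the blocks relevant here.

```python
BLOCKS = [
    (0x2E80, 0x2EFF, "CJK Radicals Supplement"),
    (0x3400, 0x4DBF, "CJK Unified Ideographs Extension A"),
    (0x4E00, 0x9FFF, "CJK Unified Ideographs"),
    (0xF900, 0xFAFF, "CJK Compatibility Ideographs"),
    (0x20000, 0x2A6DF, "CJK Unified Ideographs Extension B"),
]

# CID -> (current value, suggested value), copied from the table
ROWS = {
    13646: (0x2EBD, 0x26951),
    13849: (0x2EDE, 0x2967F),
    13852: (0x2E97, 0x38FA),
    13898: (0x2ECA, 0x27FB7),
    13995: (0x2EAA, 0x24D14),
    14105: (0x2E87, 0x20628),
    14198: (0x2EBF, 0xFA5E),
    14199: (0x2EC0, 0xFA5D),
    15398: (0x2EA4, 0xFA49),
    15403: (0x2ECC, 0xFA66),
}

def block_of(cp):
    """Name of the block containing cp, or 'other' if not listed in BLOCKS."""
    for lo, hi, name in BLOCKS:
        if lo <= cp <= hi:
            return name
    return "other"

for cid, (current, suggested) in ROWS.items():
    print(cid, block_of(current), "->", block_of(suggested))
```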
t-tk commented 5 years ago

The Unihan database at Unicode.org provides mappings between UCS code points and Adobe-Japan1 CIDs, including variants. Based on its variant mappings, I suggest the following. In particular, the Adobe-Japan1-UCS2 mappings of CID 21558, 21722, and 21933 do not seem appropriate.

(2019-05-12) Added CID 7710 and CID 7880.

CID CID hex Adobe-Japan1-UCS2 Unihan database
14258 0x37b2 0x975c 靜 U+9759 静 C+2665+174.8.6 V+14258+174.8.6
21558 0x5436 0x65df 旟 U+609E 悞 C+14541+61.3.7 V+21558+61.3.7
21722 0x54da 0x69fe 槾 U+66B5 暵 C+17755+72.4.11 V+21722+72.4.11
21933 0x55ad 0x74d8 瓘 U+7152 煒 C+14762+86.4.9 C+14762+178.9.4 V+21933+86.4.10 V+21933+178.10.4
7710 0x1e1e 0x976d 靭 U+9771 靱 C+7152+177.9.3 V+7710+177.9.3 V+13624+177.9.3
7880 0x1ec8 0x9771 靱 U+976d 靭 C+2591+177.9.3 V+7880+177.9.3 V+7971+177.9.3
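The last column appears to hold values in the style of the Unihan `kRSAdobe_Japan1_6` field: `C` or `V`, then a CID, then radical, strokes in the radical, and residual strokes. A minimal parsing sketch under that assumption, where `C` marks the CID mapped directly to the code point and `V` marks a variant CID (the `RSEntry` and `parse_rs` names are hypothetical):

```python
import re
from typing import NamedTuple

class RSEntry(NamedTuple):
    kind: str              # "C" = directly mapped CID, "V" = variant CID
    cid: int
    radical: int
    radical_strokes: int
    residual_strokes: int

_ENTRY = re.compile(r"([CV])\+(\d+)\+(\d+)\.(\d+)\.(\d+)")

def parse_rs(value):
    """Parse a space-separated list such as 'C+2665+174.8.6 V+14258+174.8.6'."""
    return [
        RSEntry(kind, int(cid), int(rad), int(rad_strokes), int(residual))
        for kind, cid, rad, rad_strokes, residual in _ENTRY.findall(value)
    ]

def variant_cids(value):
    """CIDs listed as variants (V) of the code point the value belongs to."""
    return [entry.cid for entry in parse_rs(value) if entry.kind == "V"]

print(variant_cids("C+2665+174.8.6 V+14258+174.8.6"))
```

Read this way, each row says that the CID in the first column is recorded as a variant of the code point shown in the Unihan column.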
t-tk commented 5 years ago

I suggest mapping the following CIDs to the code points that the latest UniJIS-UTFxx maps them to.

CID CID hex Adobe-Japan1-UCS2 UniJIS-UTFxx
13691 0x357b 0x7f36 缶 U+26222 𦈢 ExtB Unicode3.1(2001)
13782 0x35d6 0x5ea7 座 U+2b777 𫝷 ExtD Unicode6.0(2010)
14061 0x36ed 0x9580 門 U+95e8 门 Unicode1.1(1993)
14145 0x3741 0x5be8 寨 U+2a9e6 𪧦 ExtC Unicode5.2(2009)
14174 0x375e 0x7c14 簔 U+2b7bd 𫞽 ExtD Unicode6.0(2010)
14188 0x376c 0x7f51 网 U+2053f 𠔿 ExtB Unicode3.1(2001)
14189 0x376d 0x7f51 网 U+2626a 𦉪 ExtB Unicode3.1(2001)
14278 0x37c6 0x9f63 齣 U+2b81a 𫠚 ExtD Unicode6.0(2010)
20088 0x4e78 0x5354 協 U+2b753 𫝓 ExtD Unicode6.0(2010)
20096 0x4e80 0x56c0 囀 U+2b75a 𫝚 ExtD Unicode6.0(2010)
20097 0x4e81 0x56c3 囃 U+2b75c 𫝜 ExtD Unicode6.0(2010)
20125 0x4e9d 0x64a5 撥 U+2b77c 𫝼 ExtD Unicode6.0(2010)
20128 0x4ea0 0x655d 敝 U+207c8 𠟈 ExtB Unicode3.1(2001)
20141 0x4ead 0x66dc 曜 U+2b782 𫞂 ExtD Unicode6.0(2010)
20149 0x4eb5 0x6a05 樅 U+2b78b 𫞋 ExtD Unicode6.0(2010)
20153 0x4eb9 0x6bb1 殱 U+2b794 𫞔 ExtD Unicode6.0(2010)
20156 0x4ebc 0x6dbc 涼 U+9fcc 鿌 Unicode6.1(2012)
20176 0x4ed0 0x75d9 痙 U+2b7ac 𫞬 ExtD Unicode6.0(2010)
20180 0x4ed4 0x76c8 盈 U+2b7af 𫞯 ExtD Unicode6.0(2010)
20194 0x4ee2 0x8077 職 U+2b7c9 𫟉 ExtD Unicode6.0(2010)
20204 0x4eec 0x8449 葉 U+2b7d2 𫟒 ExtD Unicode6.0(2010)
20247 0x4f17 0x990a 養 U+2b765 𫝥 ExtD Unicode6.0(2010)
20256 0x4f20 0x9c75 鱵 U+2b80d 𫠍 ExtD Unicode6.0(2010)
20260 0x4f24 0x9e78 鹸 U+2b817 𫠗 ExtD Unicode6.0(2010)
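Most of the suggested code points are outside the BMP, so a single 16-bit value cannot hold them. A minimal sketch of what the destinations would look like, assuming ToUnicode destinations are written as UTF-16BE code units, in which case a supplementary code point becomes a surrogate pair (the helper name `to_utf16be_hex` is hypothetical):

```python
def to_utf16be_hex(cp):
    """Hex digits of the UTF-16BE encoding of one code point."""
    return chr(cp).encode("utf-16-be").hex()

for cid, current, suggested in [
    (13691, 0x7F36, 0x26222),
    (13782, 0x5EA7, 0x2B777),
    (14061, 0x9580, 0x95E8),
]:
    print(f"<{cid:04x}> <{to_utf16be_hex(current)}> -> <{to_utf16be_hex(suggested)}>")
```

BMP destinations such as U+95E8 stay at four hex digits, while Extension B and Extension D destinations take eight.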
t-tk commented 5 years ago

I suggest the following CID mappings.

CID CID hex Adobe-Japan1-UCS2 UniJIS-UTFxx-H Suggested
8312 0x2078 0x2193,0x2191 ↓↑ U+21F5 ⇵ Unicode3.2(2002) U+21F5 ⇵
12209 0x2fb1 0x21c5 ⇅ U+2b83 ⮃ Unicode7.0(2014) U+21F5 ⇵
t-tk commented 5 years ago

I suggest mapping the following CIDs to the code points used by UniJIS-UTFxx rather than to pairs of ASCII digits. Because U+277F and U+24EB..24F4 are included in Unicode 3.2 and JIS X 0213, I expect most recent Japanese environments to support them.

CID CID hex Adobe-Japan1-UCS2 UniJIS-UTFxx
10514 0x2912 0x0031,0x0030 10 U+277F ❿ Unicode1.1(1993)
10515 0x2913 0x0031,0x0031 11 U+24EB ⓫ Unicode3.2(2002)
10516 0x2914 0x0031,0x0032 12 U+24EC ⓬ Unicode3.2(2002)
10517 0x2915 0x0031,0x0033 13 U+24ED ⓭ Unicode3.2(2002)
10518 0x2916 0x0031,0x0034 14 U+24EE ⓮ Unicode3.2(2002)
10519 0x2917 0x0031,0x0035 15 U+24EF ⓯ Unicode3.2(2002)
10520 0x2918 0x0031,0x0036 16 U+24F0 ⓰ Unicode3.2(2002)
10521 0x2919 0x0031,0x0037 17 U+24F1 ⓱ Unicode3.2(2002)
10522 0x291a 0x0031,0x0038 18 U+24F2 ⓲ Unicode3.2(2002)
10523 0x291b 0x0031,0x0039 19 U+24F3 ⓳ Unicode3.2(2002)
10524 0x291c 0x0032,0x0030 20 U+24F4 ⓴ Unicode3.2(2002)
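A minimal sketch of how the suggested rows could be written back as mapping lines, grouping runs where both the CIDs and the code points are consecutive. It assumes the `<cid> <dst>` and `<lo> <hi> <dst>` line shapes quoted earlier in this issue and handles BMP destinations only; `to_cmap_lines` is a hypothetical helper, and a real file would keep single entries and ranges in separate sections.

```python
def to_cmap_lines(mapping):
    """Turn {cid: code_point} into single-entry and range lines (BMP only)."""
    items = sorted(mapping.items())
    lines = []
    i = 0
    while i < len(items):
        j = i
        # Extend the run while CID and destination both increase by one.
        while (j + 1 < len(items)
               and items[j + 1][0] == items[j][0] + 1
               and items[j + 1][1] == items[j][1] + 1):
            j += 1
        lo, dst = items[i]
        hi = items[j][0]
        if hi == lo:
            lines.append(f"<{lo:04x}> <{dst:04x}>")
        else:
            lines.append(f"<{lo:04x}> <{hi:04x}> <{dst:04x}>")
        i = j + 1
    return lines

suggested = {10514: 0x277F}                                   # negative circled 10
suggested.update({10515 + k: 0x24EB + k for k in range(10)})  # negative circled 11..20

for line in to_cmap_lines(suggested):
    print(line)
```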
t-tk commented 5 years ago

I suggest mapping the following CIDs to the code points used by the latest UniJIS-UTFxx. I guess the current Adobe-Japan1-UCS2 mapping targets a variant, but the latest UniJIS-UTFxx mapping fits the glyph better.

CID CID hex Adobe-Japan1-UCS2 UniJIS-UTFxx
13651 0x3553 U+885e 衞 U+2b7d8 𫟘 ExtD Unicode6.0(2010)
13695 0x357f U+8218 舘 U+fa6d 舘 Unicode5.2(2009)
13724 0x359c U+2363a 𣘺 U+2b78e 𫞎 ExtD Unicode6.0(2010)
13740 0x35ac U+6075 恵 U+fa6b 恵 Unicode5.2(2009)
13780 0x35d4 U+4eca 今 U+2b746 𫝆 ExtD Unicode6.0(2010)
13866 0x362a U+52e2 勢 U+2b751 𫝑 ExtD Unicode6.0(2010)
14064 0x36f0 U+687a 桺 U+2b789 𫞉 ExtD Unicode6.0(2010)
14089 0x3709 U+6881 梁 U+9fc4 鿄 Unicode5.2(2009)
14168 0x3758 U+7953 祓 U+9fc6 鿆 Unicode5.2(2009)
14281 0x37c9 U+242ee 𤋮 U+fa6c 𤋮 Unicode5.2(2009)
20114 0x4e92 U+5ea7 座 U+2b776 𫝶 ExtD Unicode6.0(2010)
20201 0x4ee9 U+83df 菟 U+2b7cf 𫟏 ExtD Unicode6.0(2010)
20240 0x4f10 U+943a 鐺 U+2b7f0 𫟰 ExtD Unicode6.0(2010)
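One thing to keep in mind for the three U+FA6x suggestions: CJK compatibility ideographs have canonical decompositions, so a normalization pass on extracted text folds them back to another code point. A minimal sketch using Python's standard `unicodedata` module, assuming its bundled Unicode data is version 5.2 or later:

```python
import unicodedata

# suggested compatibility ideograph -> current Adobe-Japan1-UCS2 value, from the table
PAIRS = {0xFA6D: 0x8218, 0xFA6B: 0x6075, 0xFA6C: 0x242EE}

def nfc_code_point(cp):
    """Code point that results from NFC-normalizing a single character."""
    return ord(unicodedata.normalize("NFC", chr(cp)))

for suggested, current in PAIRS.items():
    folded = nfc_code_point(suggested)
    print(f"U+{suggested:04X} -> U+{folded:04X}", folded == current)
```

For these three rows the normalized form is the same code point as the current mapping, so the suggested value only survives in pipelines that do not normalize.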
kenlunde commented 5 years ago

@t-tk Today’s CJK Type Blog article, "To UVS, Or Not To UVS", should help you better understand my plans for updating the Adobe-Japan1-UCS2 ToUnicode mapping file.

hatchzo commented 1 year ago

The latest version of the Adobe-Japan1-UCS2 ToUnicode mapping resource addresses everything in this issue.