LettError / glyphNameFormatter

Generate list of glyphnames from unicode names.
BSD 3-Clause "New" or "Revised" License

Georgian support #80

typemytype closed 5 years ago

typemytype commented 5 years ago

proposal from @typotheque

Georgian is complicated. It used to be a unicase alphabet (Mkhedruli), and now it includes additional caps, called Mtavruli, in the Georgian Extended block. There are also two ancient styles: Asomtavruli, used for illuminated manuscripts, and Nuskhuri, used for liturgical purposes.

To keep the names short, something like this could work:

Mtavruli

1C90    AnGeor
1C91    BanGeor
1C92    GanGeor
…

Mkhedruli

10D0    anGeor
10D1    banGeor
10D2    ganGeor
…

Asomtavruli

10A0    AnGeorOld
10A1    BanGeorOld
10A2    GanGeorOld
…

Nuskhuri

2D00    anGeorOld
2D01    banGeorOld
2D02    ganGeorOld
…
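The short names above can be derived from the Unicode character names. A minimal sketch, assuming Python 3.7 or later (whose `unicodedata` knows the Georgian Extended block); `STYLES` and `short_name` are hypothetical names for illustration, not part of glyphNameFormatter:

```python
import unicodedata

# Hypothetical table: Unicode name prefix -> (is_uppercase, suffix).
# The suffixes follow the short-name proposal above.
STYLES = {
    "GEORGIAN MTAVRULI CAPITAL LETTER ": (True, "Geor"),   # Mtavruli
    "GEORGIAN LETTER ": (False, "Geor"),                   # Mkhedruli
    "GEORGIAN CAPITAL LETTER ": (True, "GeorOld"),         # Asomtavruli
    "GEORGIAN SMALL LETTER ": (False, "GeorOld"),          # Nuskhuri
}


def short_name(unicode_name):
    """Return a short glyph name, or None if the name is not a Georgian letter.

    Only simple letter names are handled; hyphens are left untouched.
    """
    for prefix, (is_upper, suffix) in STYLES.items():
        if unicode_name.startswith(prefix):
            words = unicode_name[len(prefix):].split()
            base = "".join(word.capitalize() for word in words)
            if not is_upper:
                # Lowercase styles start with a lowercase letter.
                base = base[0].lower() + base[1:]
            return base + suffix
    return None


for cp in (0x1C90, 0x10D0, 0x10A0, 0x2D00):
    print(f"{cp:04X}", short_name(unicodedata.name(chr(cp))))
```

Keying on the Unicode name rather than on code point ranges keeps punctuation and modifier letters in the same blocks out of the result, since their names do not match any of the four prefixes.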

or, if we want to be more accurate, then:

AnGeorMtavruli
AnGeorMkhedruli
AnGeorAsomtavruli
AnGeorNuskhuri
…

typotheque commented 5 years ago

Thanks. All four scripts are phonetically identical, so the name should have a phonetic part, plus something to identify which style of Georgian it comes from. The most common style, Mkhedruli, is usually the default. It used to be caseless, but there is now a Unicode addition for Modern Georgian caps, so Mkhedruli is now understood as lowercase. We can see Mkhedruli and Mtavruli as Modern Georgian, and Asomtavruli and Nuskhuri as ancient styles.

typemytype commented 5 years ago

currently missing https://en.wikipedia.org/wiki/Georgian_Extended

in https://github.com/LettError/glyphNameFormatter/blob/master/Lib/glyphNameFormatter/data/unicodeBlocks.txt

this has to be updated too
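A quick way to spot such a gap is to look up a code point in the block list. This is only a sketch: it assumes unicodeBlocks.txt follows the layout of Unicode's Blocks.txt (`START..END; Name`, with `#` comments), which has not been checked here, and `parse_blocks` and `block_of` are hypothetical helpers:

```python
def parse_blocks(text):
    """Parse 'START..END; Name' lines into (start, end, name) tuples."""
    blocks = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        span, name = line.split(";", 1)
        start, end = span.split("..")
        blocks.append((int(start, 16), int(end, 16), name.strip()))
    return blocks


def block_of(cp, blocks):
    """Return the block name containing cp, or None if no block covers it."""
    for start, end, name in blocks:
        if start <= cp <= end:
            return name
    return None


# Sample data without the Georgian Extended block (1C90..1CBF).
sample = """\
# assumed layout, mirroring Unicode's Blocks.txt
10A0..10FF; Georgian
2D00..2D2F; Georgian Supplement
"""

blocks = parse_blocks(sample)
print(block_of(0x10D0, blocks))  # Georgian
print(block_of(0x1C90, blocks))  # None, the block is missing
```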

lianghai commented 5 years ago

I personally recommend that ISO 15924 codes should be considered, in which there are separate codes Geor for modern Georgian (Mkhedruli and Mtavruli) and Geok for Khutsuri (Asomtavruli and Nuskhuri).

If you really want to stick to registered OpenType script codes though, just use Khutsuri (or some abbreviation of it) as a shared suffix for Asomtavruli and Nuskhuri.

Note how hard Georgian people argued to the Unicode Technical Committee for encoding Mtavruli as the uppercase version of Mkhedruli, and how Asomtavruli and Nuskhuri do enjoy a (probably pretty recognized) use case as a single bicameral script.

Calling out all four scripts/styles’ specific names in glyph names is overkill.

typotheque commented 5 years ago

I wasn't aware of ISO 15924, and used OpenType script tags before. I agree with Liang that using Geor and Geok simplifies the issue, and is the right way forward. So we end up with something like this:

1C90    AnGeor (Mtavruli)
10D0    anGeor (Mkhedruli)

10A0    AnGeok (Asomtavruli)
2D00    anGeok (Nuskhuri)
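Because each ISO 15924 code covers one uppercase/lowercase pair, the lowercase name can be derived from the uppercase one through Unicode case mapping. A rough sketch, assuming Python 3.7 or later (Unicode 11 data); `CAPITAL_PREFIXES` and `name_pair` are hypothetical, not glyphNameFormatter API:

```python
import unicodedata

# Hypothetical table: Unicode name prefix of the capitals -> ISO 15924 code.
CAPITAL_PREFIXES = {
    "GEORGIAN MTAVRULI CAPITAL LETTER ": "Geor",  # pairs with Mkhedruli
    "GEORGIAN CAPITAL LETTER ": "Geok",           # Asomtavruli, pairs with Nuskhuri
}


def name_pair(upper_cp):
    """Map an uppercase Georgian code point and its lowercase partner to glyph names."""
    unicode_name = unicodedata.name(chr(upper_cp))
    for prefix, code in CAPITAL_PREFIXES.items():
        if unicode_name.startswith(prefix):
            words = unicode_name[len(prefix):].split()
            upper_name = "".join(word.capitalize() for word in words)
            lower_name = upper_name[0].lower() + upper_name[1:]
            # The lowercase partner comes from Unicode's own case mapping.
            lower_cp = ord(chr(upper_cp).lower())
            return {upper_cp: upper_name + code, lower_cp: lower_name + code}
    raise ValueError(f"not an uppercase Georgian letter: U+{upper_cp:04X}")


for cp in (0x1C90, 0x10A0):
    for code_point, glyph_name in name_pair(cp).items():
        print(f"{code_point:04X}", glyph_name)
```

With this approach only the two uppercase prefixes need to be listed; the Mkhedruli and Nuskhuri names follow from the case pairs.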

twardoch commented 5 years ago

OpenType generally follows the ISO script codes, but, well, not always. It's really hard to tell who decides about adding new OT script codes. I think someone within MS, but it's not really a formalized process afaik. It's likely, however, that geok will be added to OT at some point.

twardoch commented 5 years ago

Ps. Note that OpenType actually predates ISO 15924. There were script assignments (without codes) in Unicode, OpenType derived from that and created the script codes, with some modifications, then ISO 15924 took the OT codes but uppercased the first letter and added a few (and strictly tied them to Unicode, which OT did not), and then OpenType decided to follow ISO as a principle but did not change the old assignments.

lianghai commented 5 years ago

@twardoch:

… and strictly tied them to Unicode, which OT did not …

Mmm, ISO 15924 doesn’t really appear to be strictly tied to Unicode, as it’s got its own way of categorizing scripts (for example on the cases of Georgian and Syriac).