googlefonts / glyphsets

Apache License 2.0

Missing codepoints in our Glyphsets WIP #235

Open vv-monsalve opened 2 weeks ago

vv-monsalve commented 2 weeks ago

While working on the subsetting of the Playpen Sans per-script fonts, I noticed that some of the codepoints are not included in any of our Glyphsets.

After cross-referencing the Glyphsets with the gflanguages definitions, I've seen most cases correspond to auxiliary glyphs.

However, I'll collect the auxiliary plus other extra ones I'm identifying here so we can review them later on.

| Codepoint | Glyph | Language | Comment |
| --- | --- | --- | --- |
| 0132 | IJ | nl_Latn | Auxiliary glyph for this language |
| 0133 | ij | nl_Latn | Auxiliary glyph for this language |
| 017F | longs | fr_Latn | Auxiliary glyph for this language |
| 013F | Ldot | ca_Latn | Auxiliary glyph for this language |
| 0138 | kgreenlandic | kl_Latn | Auxiliary glyph for this language |
| 0312 | commaturnedabovecomb | lv_Latn | Necessary for gcommaaccent, part of Latn Core |
| FB01 | fi liga | NA | Previously in old Latin Plus |
| FB02 | fl liga | NA | Previously in old Latin Plus |
| 00A4 | currency | NA | Previously in old Latin Plus |
| 2009 | thinspace | NA | Could be part of GF Plus |
| 25CC | dottedCircle | NA | It's a requested glyph but is not included in any Glyphset |

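
The cross-referencing step described above can be sketched roughly as follows. This is only an illustration: the glyphset names and their contents are made-up stand-ins (not the real definitions), and `uncovered_codepoints` and `format_codepoint` are hypothetical helper names, not part of the glyphsets package.

```python
# Hypothetical data: glyphset name -> set of codepoints.
# These are NOT the real glyphset definitions, only small stand-ins.
GLYPHSETS = {
    "Example_Core": {0x0041, 0x0061, 0x0123},
    "Example_Plus": {0x2013, 0x2014},
}


def uncovered_codepoints(font_codepoints, glyphsets):
    """Return the sorted codepoints of the font that appear in no glyphset."""
    covered = set().union(*glyphsets.values())
    return sorted(set(font_codepoints) - covered)


def format_codepoint(cp):
    """Format a codepoint as four (or more) uppercase hex digits."""
    return f"{cp:04X}"


if __name__ == "__main__":
    # Codepoints found in an imaginary font under review.
    font = [0x0041, 0x0132, 0x0133, 0x25CC, 0x2013]
    for cp in uncovered_codepoints(font, GLYPHSETS):
        print(format_codepoint(cp), chr(cp))
```

In practice the font's codepoints would come from its cmap and the glyphset contents from the actual definitions; the set difference is the only part this sketch is meant to show.
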
yanone commented 2 weeks ago

First up, I implemented glyphsets find ſ (or glyphsets find 0x017F), which previously only existed as a script in the /scripts folder, and extended it to cover auxiliary characters and glyphsets. This should help with authoring work in the future. (Until a new release is cut, install it with pip install -e .)
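
For illustration only, here is one way a lookup could accept either a literal character or a hex codepoint, as the two invocations above do. This is a sketch under my own assumptions, not the actual implementation of glyphsets find; `parse_query` is a hypothetical name.

```python
def parse_query(query: str) -> int:
    """Turn a literal character or a hex string into a codepoint.

    Assumption: a single-character query is always treated as a literal
    character, so "A" means U+0041 and not hex 0xA.
    """
    if len(query) == 1:
        return ord(query)
    text = query.upper()
    # Strip an optional "0x" or "U+" prefix before parsing as hex.
    for prefix in ("0X", "U+"):
        if text.startswith(prefix):
            text = text[len(prefix):]
            break
    return int(text, 16)


if __name__ == "__main__":
    for q in ("ſ", "0x017F", "U+017F", "017F"):
        print(q, "->", f"{parse_query(q):04X}")
```
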

Auxiliary characters are not included in glyphsets except when instructed in the .yaml file, see African, and if we generally change that, we'll include a ton more characters.

If they must be included, there are two good options: either we raise their status in the language definitions from auxiliary to base, or we include them in .stub.glyphs files.

The first five above I would personally not include in glyphsets. To my knowledge, contemporary practice is to leave out the encoded legacy ligatures and instead provide locl substitutions and discretionary i_j ligatures, and generally f_i ligatures in either liga or dlig. The longs is a historic character and, IMO, correctly not included in base.

That being said, any font may choose to include any of these characters anyway despite them not being included in official glyphsets, so the glyphsets are never a barrier here.

The currency I can definitely put back into Latin Plus, and the dottedCircle I can also put into glyphsets. Do you have an idea which one it belongs in? We could also implement an additional definition layer per script, so that certain glyphs are included in all glyphsets of a given script: the dotted circle in all Arabic glyphsets, for example.

And finally, the new glyphsets find command will help with answering the SSA question. I could not find the three characters in any SSA language.

yanone commented 2 weeks ago

But we could put longs into one of the extended Latin sets. I include it in my own fonts.

yanone commented 2 weeks ago

We can also make .stub files for all glyphsets (for dottedcircle maybe?)

vv-monsalve commented 2 weeks ago

I agree with most of the above points. I'll continue feeding the list as I review this font, and we can have a final revision afterward.

The additional layer per script idea sounds good and is in line with what I mentioned yesterday about finding a way to define the glyphs necessary for all the scripts in one place so we do not repeat them on each glyphset.
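
A rough sketch of what such a per-script layer could look like, just to make the idea concrete. The data layout, the glyphset names, and the `resolve` helper are all hypothetical and do not reflect how the definitions are currently stored.

```python
# Hypothetical per-script layer: glyphs every glyphset of a script should get.
SCRIPT_REQUIRED = {
    "Arab": {0x25CC},  # dotted circle, as in the example discussed above
}

# Hypothetical glyphset definitions, each tagged with its script.
GLYPHSETS = {
    "Example_Arabic_Core": {"script": "Arab", "codepoints": {0x0627, 0x0628}},
    "Example_Latin_Core": {"script": "Latn", "codepoints": {0x0041, 0x0061}},
}


def resolve(name, glyphsets, script_required):
    """Return a glyphset's own codepoints plus its script-wide additions."""
    entry = glyphsets[name]
    extra = script_required.get(entry["script"], set())
    return entry["codepoints"] | extra


if __name__ == "__main__":
    for name in GLYPHSETS:
        cps = resolve(name, GLYPHSETS, SCRIPT_REQUIRED)
        print(name, sorted(f"{cp:04X}" for cp in cps))
```

Defining the shared glyphs once per script and merging them at resolution time avoids repeating them in each glyphset definition.
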

vv-monsalve commented 1 week ago

The list was longer than expected, so I filed an issue in the upstream repo to get feedback from them and from Denis, which could better inform our decisions about our Glyphsets.

I left above a short list of the ones that we might need to add to some Glyphset and a couple of examples of auxiliary glyphs.