fonttools / ufoLib2

A library to deal with UFO font sources.
Apache License 2.0

Implement internal charactermapping.plist from UFO roadmap? #55

Open madig opened 4 years ago

madig commented 4 years ago

The UFO roadmap mentions charactermapping.plist, a file that would unambiguously assign Unicode values to glyph names across all layers (UFO spec discussion: https://github.com/unified-font-object/ufo-spec/issues/77). I wonder if it would be worth implementing this internally before the UFO spec gets around to it, i.e. keep track internally as layers are loaded (with public.default taking precedence over all other layers). The spec already allows a library to do whatever it wants when two glyphs with the same name disagree on the Unicode value.
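A minimal sketch of that bookkeeping, assuming layers can be reduced to plain `(layer name, {glyph name: [codepoints]})` pairs. The function name and the data shape are hypothetical and are not part of ufoLib2; the choice to let a glyph without Unicode values leave the name unclaimed is also an assumption.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical stand-in for a loaded layer: (layer name, {glyph name: [codepoints]}).
LayerData = Tuple[str, Dict[str, List[int]]]


def build_glyph_unicodes(
    layers: Iterable[LayerData], default_layer: str = "public.default"
) -> Dict[str, List[int]]:
    """Collect one list of Unicode values per glyph name across all layers.

    The default layer is visited first, so its values take precedence over
    those of any other layer that uses the same glyph name.
    """
    result: Dict[str, List[int]] = {}
    # sorted() is stable: the default layer moves to the front and the
    # remaining layers keep their original order.
    ordered = sorted(layers, key=lambda item: item[0] != default_layer)
    for _layer_name, glyphs in ordered:
        for glyph_name, unicodes in glyphs.items():
            # Assumption: a glyph without Unicode values does not claim the name.
            if unicodes:
                result.setdefault(glyph_name, list(unicodes))
    return result


layers = [
    ("background", {"A": [0x41], "B": [0x42]}),
    ("public.default", {"A": [0x391]}),
]
merged = build_glyph_unicodes(layers)
```

Here `merged["A"]` comes from public.default even though the background layer was listed first, while `"B"` falls back to the only layer that defines it.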

anthrotype commented 4 years ago

No, you can't store additional files other than the public ones inside a UFO package (except in the data dir), or else other implementations may drop them. Better to move forward with that spec proposal. We should also finish the PR that adds minor version support to UFO, so we can start making more incremental changes: https://github.com/fonttools/fonttools/pull/1786

madig commented 4 years ago

I mean construct a character mapping internally, not write it out.

anthrotype commented 4 years ago

Oh, ok, sure then. If defcon has some method to get the character mapping, we should try to name it the same.

anthrotype commented 4 years ago

basically, something similar to defcon's UnicodeData class https://github.com/robotools/defcon/blob/master/Lib/defcon/objects/uniData.py

which is exposed to the font as Font.unicodeData property

Ideally, read-only using an immutable MappingProxyType
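A rough sketch of the read-only idea, using the standard library's `types.MappingProxyType`. The `FontSketch` class is a hypothetical stand-in and not ufoLib2's `Font`; only the property name is borrowed from defcon.

```python
from types import MappingProxyType
from typing import Dict, Mapping


class FontSketch:
    """Hypothetical stand-in for a font object; not the ufoLib2 Font class."""

    def __init__(self, cmap: Dict[int, str]) -> None:
        self._cmap = dict(cmap)

    @property
    def unicodeData(self) -> Mapping[int, str]:
        # The proxy is a read-only view of the private dict, not a copy.
        return MappingProxyType(self._cmap)


font = FontSketch({0x41: "A", 0x61: "a"})
view = font.unicodeData

try:
    view[0x42] = "B"  # item assignment on a mappingproxy raises TypeError
    write_rejected = False
except TypeError:
    write_rejected = True
```

Because the proxy is a view rather than a snapshot, later changes to the underlying dict show up through it; callers just cannot mutate it themselves.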

anthrotype commented 4 years ago

Hm, I don't like that UnicodeData class very much, actually. It maps each character to multiple glyphs, which is wrong: that situation only arises because the unicode element currently belongs to the GLIF, so multiple GLIFs can claim the same Unicode value (impossible in a valid font's cmap). It should map each character to a single glyph, not a list of glyphs. We should also think about how to expose Unicode variation selectors (https://github.com/unified-font-object/ufo-spec/issues/79).

madig commented 4 years ago

Yes, it would bend v3 conventions to enforce one Unicode value per glyph name, but the spec allows that.

anthrotype commented 4 years ago

It's one glyph per Unicode value, not one Unicode value per glyph. Multiple Unicode values can map to the same glyph, not vice versa.

madig commented 4 years ago

Err, yeah.

anthrotype commented 4 years ago

We can expose a Font.getCharacterMapping() method that maps unicode (int) to glyph name (str) -- not a list of str (for the reasons I explained above). We can actually raise an error if there are conflicting mappings among the glyphs' unicodes (i.e. the same unicode mapped to different glyphs).

Even the Layer object could have a getCharacterMapping() since the unicodes are stored in each Glyph (which is what we want to get rid of..). But let's not do that. Font.getCharacterMapping() is enough, and will return the mapping for the default layer only.
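Roughly what I have in mind, as a standalone sketch. It assumes the default layer's glyphs can be iterated as `(name, unicodes)` pairs; the function name, the input shape and the choice of `ValueError` are all placeholders, not an existing ufoLib2 API.

```python
from typing import Dict, Iterable, List, Tuple


def get_character_mapping(glyphs: Iterable[Tuple[str, List[int]]]) -> Dict[int, str]:
    """Map each Unicode value (int) to exactly one glyph name (str).

    Several Unicode values may point to the same glyph, but a Unicode value
    claimed by two different glyphs is treated as an error.
    """
    mapping: Dict[int, str] = {}
    for glyph_name, unicodes in glyphs:
        for codepoint in unicodes:
            # setdefault returns the existing owner if the codepoint is taken.
            owner = mapping.setdefault(codepoint, glyph_name)
            if owner != glyph_name:
                raise ValueError(
                    f"U+{codepoint:04X} is mapped to both {owner!r} and {glyph_name!r}"
                )
    return mapping


# Hypothetical contents of the default layer.
default_layer = [("A", [0x41, 0x391]), ("B", [0x42])]
cmap = get_character_mapping(default_layer)
```

With this input, both U+0041 and U+0391 resolve to `"A"`, while adding a second glyph that also claims U+0041 would raise instead of silently picking one.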

anthrotype commented 4 years ago

this is basically ufo2ft.util.makeUnicodeToGlyphNameMapping

https://github.com/googlefonts/ufo2ft/blob/66510c8128f9a447617c8c4d6d5871ef4577f74f/Lib/ufo2ft/util.py#L184-L206