unified-font-object / ufo-spec

The official Unified Font Object specification source files.
http://unifiedfontobject.org

Glyph hash spec is different from some implementations #224

Open jenskutilek opened 10 months ago

jenskutilek commented 10 months ago

When I adapted the HashPointPen for usage in ufo2ft, it was discussed and improved in https://github.com/fonttools/fonttools/pull/2005, but the UFO spec hasn't been updated to reflect the changes.

So there are now at least two different implementations of the glyph hash calculation, one in Adobe's psautohint, the other one in FontTools.

https://github.com/unified-font-object/ufo-spec/blob/115eea94d1aa3f1e3ace5edc9f8f5b747fa07394/versions/ufo3/glyphs/glif.md?plain=1#L436-L458

What's the way to resolve this? My vote would go to updating the spec to the algorithm used in the FontTools HashPointPen.

benkiel commented 10 months ago

Hey Jens, that part of the spec was written by Adobe (@readroberts, IIRC). I am happy to get it updated, but this should also be worked out with Adobe (@skef @kaydeearts @miguelsousa). It might be useful to give us a tl;dr digest of how ps/otfautohint and fontTools differ here, and where each departs from the written spec (or a PR that folks can debate).

The other question would be how to handle backwards compatibility, if that is needed (not sure it is?)

skef commented 10 months ago

I'm not seeing Adobe raising significant objections to changing this aspect of the UFO spec, as long as it's the actual algorithm that is documented rather than a pointer off to the HashPointPen code.

On that front: we should probably either confirm with fontTools that things are currently as they want them to be for the foreseeable future, or see if they're willing to add a parameter (or whatever) that ensures the pen matches the UFO spec, in case they want to support other algorithms at some point.

The current form of psautohint is the Python-only otfautohint port in AFDKO. The hash algorithm is implemented in this object. I haven't looked into whether it still matches the UFO spec.

jenskutilek commented 10 months ago

Here's a summary of the changes from https://github.com/fonttools/fonttools/pull/2005:

The change regarding the composite glyphs, which were decomposed in the Adobe implementation, was to facilitate use of the HashPointPen in checking if TrueType instructions match the outline, as in the UFO glyph’s public.truetype.instructions.

Example outputs (without applying the sha512 hashing):

A simple TTF glyph:

w626l335+458o327+484o313+535q308+559l306+559o301+535o287+483q280+459l210+247l405+247|l480+0l434+144l180+144l133+0l2+0l228+675l397+675l624+0|

A composite glyph:

w500[l0+0l10+110o50+75o60+50c50+0|(+2+0+0+3-10+5)]

A nested composite glyph:

w500[[l0+0l10+110o50+75o60+50c50+0|(+1+0+0+1+0+0)](+2+0+0+3-10+5)]

A glyph with outline and component:

w500l0+0l10+110o50+75o60+50c50+0|[l0+0l10+110o50+75o60+50c50+0|(+2+0+0+2-10+5)]
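
To make the format of these strings easier to discuss, here is a minimal Python sketch that reproduces the example outputs above. It is reverse-engineered from the examples only and is not the HashPointPen code itself. The dict-based glyph layout, the function names, and the `+g` formatting of non-integer values are assumptions made for illustration. The final sha512 step is left out, as in the examples.

```python
# Sketch only: data layout and names are hypothetical, not fontTools API.

def point_token(segment_type, x, y):
    # Off-curve points (segment type None) are written as "o";
    # on-curve points use the first letter of their segment type.
    letter = "o" if segment_type is None else segment_type[0]
    return f"{letter}{x:g}{y:+g}"


def outline_string(glyph, glyph_set):
    parts = []
    for contour in glyph["contours"]:
        parts.extend(point_token(t, x, y) for (t, x, y) in contour)
        parts.append("|")  # closes each contour
    for base_name, transform in glyph["components"]:
        # Components are not decomposed: the base glyph's own string is
        # nested in brackets, followed by the six transform values.
        parts.append("[")
        parts.append(outline_string(glyph_set[base_name], glyph_set))
        parts.append("(" + "".join(f"{v:+g}" for v in transform) + ")]")
    return "".join(parts)


def hash_source(glyph, glyph_set):
    # Only the outermost glyph contributes its advance width.
    return f"w{glyph['width']:g}" + outline_string(glyph, glyph_set)


contour = [("line", 0, 0), ("line", 10, 110), (None, 50, 75),
           (None, 60, 50), ("curve", 50, 0)]
glyph_set = {
    "base": {"width": 500, "contours": [contour], "components": []},
    "comp": {"width": 500, "contours": [],
             "components": [("base", (2, 0, 0, 3, -10, 5))]},
    "inner": {"width": 500, "contours": [],
              "components": [("base", (1, 0, 0, 1, 0, 0))]},
    "nested": {"width": 500, "contours": [],
               "components": [("inner", (2, 0, 0, 3, -10, 5))]},
    "mixed": {"width": 500, "contours": [contour],
              "components": [("base", (2, 0, 0, 2, -10, 5))]},
}

for name in ("comp", "nested", "mixed"):
    print(name, hash_source(glyph_set[name], glyph_set))
```

The sketch always emits contours before components, which is enough to match the examples; a pen-based implementation would emit them in whatever order the glyph is drawn.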

jenskutilek commented 10 months ago

The decomposing of composite glyphs, as the Adobe version does, makes sense for the hinting of CFF glyphs, where it doesn't matter how the outline ended up in the glyph.

The change in the FontTools implementation has now proved unsatisfactory (https://github.com/fonttools/fonttools/issues/3421) because of how transform values are stored in UFO vs. TTF. UFO can use arbitrary precision for any value, but TTF quantizes the transformation matrix elements to F2Dot14 values. The same quantization must be applied in the stored hash if you want to compare glyphs between UFO and TTF, which is what the hash is used for in TrueType hinting.
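
As a rough illustration of the quantization in question: F2Dot14 is a signed 16-bit fixed-point format with 14 fractional bits, so values are stored in steps of 1/16384. The sketch below shows what rounding the four matrix elements of a component transform to that grid could look like before they go into the hash. The function names are hypothetical, Python's built-in `round` is an assumption (a font compiler may round ties differently), and the two offsets are passed through unchanged here.

```python
F2DOT14_SCALE = 1 << 14  # 16384 steps per unit


def quantize_f2dot14(value):
    # Round to the nearest representable step, then clamp to the
    # signed 16-bit range that F2Dot14 can store.
    scaled = round(value * F2DOT14_SCALE)
    scaled = max(-32768, min(32767, scaled))
    return scaled / F2DOT14_SCALE


def quantize_transform(transform):
    # Only the 2x2 matrix part is quantized in this sketch;
    # the x/y offsets are left as they are.
    xx, xy, yx, yy, dx, dy = transform
    matrix = tuple(quantize_f2dot14(v) for v in (xx, xy, yx, yy))
    return matrix + (dx, dy)


ufo_transform = (0.3, 0, 0, 0.3, -10, 5)
print(quantize_transform(ufo_transform))
```

A scale of 0.3 in the UFO cannot be stored exactly in a TTF, so a hash built from the unquantized value would not match one computed from the compiled glyph.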

Maybe we need different requirements for how to build the hash depending on whether it is used in PS or TT hinting?

skef commented 10 months ago

@jenskutilek I thought about this, especially because at first glance at the description I wondered if the new hash was only on the "local" composite information rather than also taking the components into account. But the components are included, they're just not "unpacked".

The only relevant gap between the two hashes would be if the patterns of compositing changed but the component outlines, including their ordering, did not. From the perspective of CFF(2) you'd get a false negative on the identity check. I think that's fine -- there would just be some extra calculation in such cases.