Avoid "reloading" TTFonts in PostProcessor.__init__

madig commented 3 years ago

Going from https://github.com/fonttools/fonttools/issues/1095, quite some time is spent serializing/deserializing fonts somewhere in between compiling the final binary. This slows down the entire process. PostProcessor.__init__ seems to "reload" every font, maybe to clean up loose data? I think it should maybe just work with what it's given?

simoncozens commented 3 years ago

One thing the postprocessor does is rename the glyphs to production names. It's very hard to do this without a save/load, because internally to fonttools, glyphs are referred to by name, not by GID. If you change alef-ar to uni0627 in GlyphOrder, but font["GSUB"].table.LookupList.Lookups[5].Subtable[0].InputCoverage[0] still refers to alef-ar, your font will no longer save. So it's easier to save and reload than to go through every subtable everywhere trying to remap things.
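
To make the failure mode concrete, here is a minimal sketch using a toy model rather than the real fontTools classes. `ToyFont`, its `coverage` list and its `compile`/`decompile` methods are hypothetical stand-ins; the only thing assumed from the comment above is that tables hold glyph names and that names are resolved to glyph IDs when the font is written out.

```python
# Toy model, not the fontTools API: tables hold glyph *names*, and names are
# mapped to glyph IDs through the glyph order only at compile time.
class ToyFont:
    def __init__(self, glyph_order, coverage):
        self.glyph_order = list(glyph_order)  # index in this list == glyph ID
        self.coverage = list(coverage)        # stands in for a GSUB coverage table

    def compile(self):
        """Turn name references into glyph IDs, as a binary font requires."""
        gid = {name: i for i, name in enumerate(self.glyph_order)}
        return [gid[name] for name in self.coverage]  # KeyError on stale names

    @classmethod
    def decompile(cls, gids, glyph_order):
        """Rebuild name references from glyph IDs using the given glyph order."""
        return cls(glyph_order, [glyph_order[i] for i in gids])


font = ToyFont([".notdef", "alef-ar", "beh-ar"], ["alef-ar"])

# Renaming only in the glyph order leaves the coverage table stale.
font.glyph_order[1] = "uni0627"
try:
    font.compile()
    stale_reference_detected = False
except KeyError:
    stale_reference_detected = True

# Save/reload approach: compile with the old names, decompile with the new ones.
old = ToyFont([".notdef", "alef-ar", "beh-ar"], ["alef-ar"])
renamed = ToyFont.decompile(old.compile(), [".notdef", "uni0627", "beh-ar"])
```

In the toy model the round trip through glyph IDs updates every reference at once, which is why a save/load is the easy way out.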

madig commented 3 years ago

Argh. That's what I'm running into locally... So, what do?

anthrotype commented 3 years ago

yes, what Simon said. however note that for VFs I think we do the glyph renaming only on the final VF, not on each master TTF

anthrotype commented 3 years ago

maybe the postprocessor should only do the reloading if it's going to actually rename glyphs, not unconditionally like now
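
A rough sketch of that gating, assuming renaming is the only reason to reload. `would_rename`, `prepare` and the `reload` callable are hypothetical names for illustration, not ufo2ft functions.

```python
def would_rename(glyph_order, production_names):
    """True if at least one glyph would get a different name."""
    return any(production_names.get(n, n) != n for n in glyph_order)


def prepare(font, glyph_order, production_names, reload):
    """Reload `font` via the hypothetical `reload` callable only when needed."""
    if would_rename(glyph_order, production_names):
        return reload(font)
    return font


calls = []


def fake_reload(font):
    calls.append(font)
    return dict(font)  # stand-in for a save + load round trip


font = {"name": "Toy"}
# Mapping a name to itself is not a rename, so the font is passed through.
same = prepare(font, ["a", "b"], {"a": "a"}, fake_reload)
# A real rename triggers exactly one reload.
fresh = prepare(font, ["a", "alef-ar"], {"alef-ar": "uni0627"}, fake_reload)
```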

madig commented 3 years ago

There's a few other things that happen, I'll push a branch.

behdad commented 3 years ago

Renaming glyphs is such a misunderstood op. One can build a font, make sure all tables are compiled, and then rename the glyphs, which would involve just loading and saving the post table...
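
As an illustration of that idea, a minimal sketch with a toy "compiled font" represented as a dict. It assumes, as the comment does, that every compiled table other than post refers to glyphs by ID; `rename_compiled` is a hypothetical helper, not a fontTools function.

```python
def rename_compiled(compiled, rename_map):
    """Rename glyphs in an already compiled toy font.

    `compiled` maps table tags to data; every table except "post" refers to
    glyphs by ID, so only the name list in "post" has to change.
    """
    renamed = dict(compiled)  # shallow copy: other tables are reused untouched
    renamed["post"] = [rename_map.get(n, n) for n in compiled["post"]]
    return renamed


compiled = {
    "post": [".notdef", "alef-ar", "beh-ar"],  # glyph ID -> glyph name
    "GSUB": [1],                               # coverage stored as glyph IDs
}
result = rename_compiled(compiled, {"alef-ar": "uni0627"})
```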

behdad commented 3 years ago

Or it should even be possible to provide the rename mapping at load time.

behdad commented 3 years ago

So, a lot of this would be a non-issue if we were dealing with immutable data types, such that any modification would involve copy-on-write duplication... Many functional languages work that way. In Python, I'm sure there's some dark black magic with metaclasses to get some of that, but there will be dragons. Also, there's no way to get compile-time errors for violations with Python.

Anyway, attrs2 on this: https://www.attrs.org/en/stable/examples.html#immutability
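
For a feel of what that looks like, here is a small sketch using the standard library's frozen dataclasses as a stand-in for the attrs feature linked above. `PostInfo` and its fields are made up for the example. Note that the violation is only caught at runtime, in line with the point about compile-time errors.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class PostInfo:
    # Hypothetical record; the field names are illustrative only.
    format_type: float
    italic_angle: float


original = PostInfo(format_type=2.0, italic_angle=0.0)

# In-place modification is rejected at runtime.
try:
    original.format_type = 3.0
    mutated = True
except FrozenInstanceError:
    mutated = False

# A change produces a new object and leaves the original untouched.
updated = replace(original, format_type=3.0)
```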

madig commented 3 years ago

I'd take some immutability, but the issues I'm seeing are data normalization (e.g. post.italicAngle of 0.0 does not get normalized to 0 like when reloading), data sorting (e.g. name table being in a different order) and glyph names staying as they are :neutral_face:

behdad commented 3 years ago

I see.

madig commented 3 years ago

So, what do? Ideally the compilation pass spits out something that we don't need to reload to make it mergeable. Or we reload as little as possible? Any insights appreciated over at https://github.com/googlefonts/ufo2ft/pull/486.

behdad commented 3 years ago

> So, what do? Ideally the compilation pass spits out something that we don't need to reload to make it mergeable.

Yes, I'm happy to work on making that reality.

> Or we reload as little as possible? Any insights appreciated over at #486.

Okay great. That gives me a point to start. Let's continue there.

madig commented 3 years ago

So, going from https://github.com/googlefonts/ufo2ft/pull/486, I stumbled over the following when reloading strictly only for renaming or post version changes:

maxp table missing values
various post attributes being filled with verbatim Python types rather than the final value
the post extraNames list not being filled in
the STAT table not having AxisValueCount filled in
the name table being unsorted

Doing the following in the middle of PostProcessor.process makes tests pass:

        # Sort the name table in case varLib added new entries.
        if "name" in self.otf:
            self.otf["name"].names.sort()
        if "maxp" in self.otf:
            self.otf["maxp"].compile(self.otf)
        if "STAT" in self.otf:
            data = self.otf["STAT"].compile(self.otf)
            self.otf["STAT"].decompile(data, self.otf)
        if "post" in self.otf and self.otf["post"].formatType == 2.0:
            self.otf["post"].extraNames = []
            self.otf["post"].compile(self.otf)

behdad commented 3 years ago

> the post extraNames list not being filled in

We should completely remove extraNames. It serves no functional purpose.

behdad commented 3 years ago

> the STAT table not having AxisValueCount filled in

Just make the STAT builder set it. The Count values in all of otData are also redundant and could be removed if we are willing to take the breakage and update code depending on it.

khaledhosny commented 3 years ago

> The Count values in all of otData are also redundant and could be removed

Yes, please! It always seemed odd to have a separate Count field when the data is stored in a Python list anyway.

behdad commented 3 years ago

I feel like a fonttools API break is gaining support...

madig commented 3 years ago

Would this conflict with fT's ability to carry and serialize meaningless/wrong input data in loaded fonts or is it sufficiently high level that that makes little sense? Is that even a design goal?

khaledhosny commented 3 years ago

These *Counts are written as comments in TTX, so they can probably be kept regardless of the API change. In Python, LookupCount would be replaced by len(Lookup) and so on. Maybe the first step would be to stop using these *Counts internally but set them when decompiling, so code building tables from scratch doesn't have to set them, and later they can be removed entirely.
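
A minimal sketch of that first step, using a toy table rather than the real otData machinery. `ToyLookupList` is hypothetical and each "lookup" is just a uint16. The count is derived from the list when compiling and only stored as an attribute when decompiling.

```python
import struct


class ToyLookupList:
    def __init__(self, lookups):
        self.Lookup = list(lookups)  # toy: each lookup is just a uint16
        self.LookupCount = None      # only filled in by decompile()

    def compile(self):
        # The count is taken from the list, never from the stored attribute.
        n = len(self.Lookup)
        return struct.pack(f">H{n}H", n, *self.Lookup)

    @classmethod
    def decompile(cls, data):
        (count,) = struct.unpack_from(">H", data, 0)
        table = cls(struct.unpack_from(f">{count}H", data, 2))
        table.LookupCount = count    # kept for readers such as a TTX dump
        return table


built = ToyLookupList([7, 8, 9])     # built from scratch, no count set
data = built.compile()
loaded = ToyLookupList.decompile(data)
```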

behdad commented 3 years ago

> Would this conflict with fT's ability to carry and serialize meaningless/wrong input data in loaded fonts or is it sufficiently high level that that makes little sense?

The count values are always ignored when compiling the font.

> Is that even a design goal?

Nope. We don't even pretend.

> These *Counts are written as comments in TTX, so they can probably be kept regardless of the API change. In Python, LookupCount would be replaced by len(Lookup) and so on. Maybe the first step would be to stop using these *Counts internally but set them when decompiling, so code building tables from scratch doesn't have to set them, and later they can be removed entirely.

Or replace them with properties that warn when accessed; set a deadline for removing them, and advise code to switch. The replacement code can even be offered in the warning.
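
Such a warning property could look roughly like the sketch below. `DeprecatingLookupList` is a hypothetical toy class, and the warning text is only an example of how the replacement code might be offered.

```python
import warnings


class DeprecatingLookupList:
    def __init__(self, lookups):
        self.Lookup = list(lookups)

    @property
    def LookupCount(self):
        # Computed on access, with a hint at the replacement code.
        warnings.warn(
            "LookupCount is deprecated; use len(table.Lookup) instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return len(self.Lookup)


table = DeprecatingLookupList(["lookup0", "lookup1"])
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    count = table.LookupCount
```

Because the property has no setter, code that still assigns to LookupCount would fail loudly with an AttributeError, which may or may not be wanted during a transition period.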

googlefonts / ufo2ft

Avoid "reloading" TTFonts in PostProcessor.__init__ #485
