Closed — rsheeter closed this 1 year ago
OK, some thoughts:
There is one place (`compileInterpolatableTTFs`) where we have all the UFOs at once as input and we expect a bunch of TTFs as output, so that's the spot to target. But each of the UFOs is fed individually to the outline compiler, which produces an entire independent TTF. (Currently with features, but we can split that out to a separate stage with my PR.)

The Rust code takes n `.glif` files and produces a `glyf` table entry and a `gvar` table entry. There is really nowhere simple to plug this into the existing ufo2ft code. Either you have a single UFO and you produce an entire `glyf` table for that UFO as part of the process of producing a TTF, or you have a bunch of UFOs and you produce a bunch of TTFs. At no point do we have the granularity of working on n glyphs in parallel, under the current architecture.

One option would be to have `compileInterpolatableTTFs` call out to some Rust code which precomputes all the `glyf` information for all of the UFOs at once, and then uses that `glyf` information to instantiate a mixed Rust/Python `BaseOutlineCompiler` subclass, which in turn uses it to set up all the other tables which use the metrics etc. A later step would be computing the `gvar` information in Rust and sending that back instead of doing the merge step.

I left some comments in line. Overall I'm strongly in favour of this, and of doing this incrementally and in such a way that it benefits existing consumers of fontmake.
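A rough sketch of what that hybrid compiler could look like. ufo2ft really does have a `BaseOutlineCompiler` class, but everything named below (`precompute_glyf_in_rust`, `PrecomputedOutlineCompiler`, the data shapes) is a hypothetical stand-in, not real ufo2ft or PyO3 API:

```python
def precompute_glyf_in_rust(ufos):
    """Stand-in for a single PyO3 call that compiles every glyph of
    every UFO at once; returns {glyph_name: (compiled_bytes, bounds)}."""
    return {name: (b"\x00", (0, 0, 600, 700)) for ufo in ufos for name in ufo}

class PrecomputedOutlineCompiler:
    """Would subclass ufo2ft's BaseOutlineCompiler in a real version:
    glyph outlines arrive precompiled from Rust, while the bounds still
    feed the metrics-dependent tables (head, hmtx sidebearings) in
    Python."""

    def __init__(self, glyf_data):
        self.glyf_data = glyf_data

    def compile_glyf(self):
        # No per-glyph Python compilation: just assemble the Rust output.
        return {name: data for name, (data, _bounds) in self.glyf_data.items()}

    def glyph_bounds(self):
        # Other tables need these, which is why Rust must report them back.
        return {name: bounds for name, (_data, bounds) in self.glyf_data.items()}

compiler = PrecomputedOutlineCompiler(precompute_glyf_in_rust([["A", "B"]]))
```

The key design point this illustrates is that Rust cannot just hand back opaque bytes: the Python side still needs per-glyph metadata (bounds, metrics) to build the rest of the font.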
> There isn't such a thing as "the glyph loop" per se

Yes, we have several `for` loops that involve glyphs and that we'd like to break up/parallelize. One that isn't mentioned here directly is all of ufo2ft's "filters" that pre-process UFO glyphs (e.g. decompose components, convert cubic to quadratic curves, plus custom user-defined filters) before they get translated to TrueType or CFF glyphs. Filters are also run serially at the moment: not just one filter at a time, with the result of the previous filter being the input to the next, but also, within each filter, one glyph at a time.
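As a rough illustration of what breaking up that per-glyph loop could look like, here is a hedged sketch: the `decompose` and `cubic_to_quadratic` functions are hypothetical stand-ins for real ufo2ft filters (which are classes with font-wide context), and glyphs are plain dicts. The filters stay sequential, as the pipeline requires, but each stage fans out over glyphs:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-glyph filter functions, standing in for ufo2ft filters.
def decompose(glyph):
    return glyph | {"decomposed": True}

def cubic_to_quadratic(glyph):
    return glyph | {"quadratic": True}

def run_filters(glyphs, filters, max_workers=4):
    """Run each filter over all glyphs in parallel, but keep the filters
    themselves in order: the output of one stage feeds the next."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for flt in filters:
            glyphs = list(pool.map(flt, glyphs))
    return glyphs

glyphs = [{"name": "A"}, {"name": "B"}]
out = run_filters(glyphs, [decompose, cubic_to_quadratic])
```

Note the caveat: threads buy little for CPU-bound pure-Python filters because of the GIL, which is part of the motivation for moving this work into Rust in the first place.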
It's true that the current code is at once tightly coupled and spread across different projects (ufo2ft for master UFOs => `glyf` compilation, fontTools varLib for master `glyf` tables => `gvar`), which makes integrating this proposed "oxidized glyph loop" more complicated than it appears.
One way to tackle this at first could be to keep the current serial Python pipeline, but have it produce only a scaffolding of the VF with all glyphs blanked out (a ufo2ft filter could even literally nuke all the contours/components before proceeding as usual). Then (or even at the same time) have this new Rust tool or method take a designspace with a bunch of UFO masters and produce whole `glyf` and `gvar` tables (internally parallelizing per glyph), and finally use fontTools to stick those into the VF at the end.
Or alternatively (which I think is what Simon suggested earlier), we could modify ufo2ft's `compileInterpolatableTTFsFromDS` method to call out into Rust to build the master TTFs' `glyf` tables, while keeping the rest of the code that depends on them (e.g. for the bounding boxes and sidebearings); then, when ufo2ft goes on to call the `fontTools.varLib.build` method, pass the `exclude=["gvar"]` parameter and subsequently call some other Rust API to build the latter and insert it into the final VF.
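A sketch of that two-stage flow. The `exclude` parameter of `fontTools.varLib.build` is real, but everything else below is a stand-in: plain dicts instead of a `TTFont` (which is itself dict-like, so the final splice reads the same), and a hypothetical `build_gvar` in place of the Rust API:

```python
def varlib_build_without_gvar(masters):
    # Stand-in for: vf, model, master_ttfs = varLib.build(ds, exclude=["gvar"])
    # i.e. build the whole variable font except the gvar table.
    return {"glyf": {m: f"outlines:{m}" for m in masters}}

def build_gvar(masters):
    # Hypothetical Rust API that computes per-glyph variation deltas
    # in parallel and returns a ready-made gvar table.
    return {"deltas": {m: f"deltas:{m}" for m in masters}}

masters = ["Regular", "Bold"]
vf = varlib_build_without_gvar(masters)
vf["gvar"] = build_gvar(masters)  # splice the Rust-built table into the VF
```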
I implemented what I'm calling "step one" of this (replacing Python/C `glyf` table generation with Rust, not `gvar` yet) using this PyO3 module and this patch to ufo2ft.
I found a medium-sized speedup, around 10-15%. Which is not bad! Making code 10-15% faster is actually a pretty good optimization... it's just not what we were expecting.
Some observations:
- ufo2ft adds a `.notdef` to fonts which don't have one, meaning that the on-disk UFO is not the same as the in-memory UFO. There are a few ways around this: generate the `.notdef` in Rust (I have code for this!), or pass the Python-generated `.notdef` down to Rust. Whatever we do about `.notdef` here, it reminds us of the filters issue - that Python may manipulate the UFOs before compiling them, and we should probably have a better plan for that.
- The Rust code returns each glyph as a `Glyph` object rather than returning the whole `glyf` table as a big binary dump. This is because ufo2ft does further processing based on the contents of the `glyf` table, and so if it was a big binary dump it would have to de-compile it to Python objects anyway.
- The same is not true of the `gvar` table. This isn't something that Python does further processing on, so it could just be jammed right into the font. However, since this happens deep inside of `fontTools.varLib.merge` instead of ufo2ft, it's not immediately clear how to turn off specifically the `gvar` processing in the merger and do it ourselves.

But I think the biggest observation is that we still don't really have a sense of how long each sub-task within font compilation takes. We're prodding at bits we think are going to be bottlenecks and optimizing them, but I don't feel like we know where the bottlenecks really are. I know @madig did a lot of profiling a year or so ago, but rather than function-level profiling, I think it is more useful to mark the start and end of discrete operations ("load in the UFOs", "convert the curves", etc.) and see what proportion of the build time they take.
10-15% total, correct? Can you readily compare the specific operation that was ported to Rust vs. its Python equivalent?
Holding in PR for a few days to facilitate collection of feedback.