A near-term focus on compilation of layout tables

cmyr commented 2 years ago

proposal

Following a recent discussion led by @chrissimpkins, and focused on the desire for a compiler capable of more quickly compiling large multi-axis variable fonts, I would like to propose a new preliminary goal for my current oxidize work. Specifically, I would like to focus on providing a working compiler for the OpenType layout tables, including table packing. I believe this goal occupies a sweet spot: if successful, it would both serve as solid proof that the oxidize project's compilation goals are viable, and it would also fill in what is currently the main missing component in having a Rust toolchain capable of compiling large multi-axis variable fonts.

This work would not be a major departure from my current focus; it would mostly serve as a clear goalpost for guiding that work, as well as a useful forcing/focusing function for deciding priorities. Specifically, this would mean:

compilation-friendly types for everything in GPOS/GSUB/GDEF (@simoncozens should I also be including BASE/JSTF/MATH? anything else?)
serialization code for writing all of these types into their binary representations
A port of the harfbuzz table-packing algorithm to Rust
Updating the fea-rs FEA parser to use these new tools in order to write out compiled, packed tables.

Of this work, only the last item is not explicitly part of oxidize's current scope.

Ultimately I feel like this proposal helps provide clear goals and clear success criteria for my current work, which I think would be very valuable.

timeline

I am still thinking about and focusing heavily on code generation. I currently have a hand-written implementation of the types and serialization code for GDEF, which I'm using to as a template for what code I should be generating. This is now more or less working (I can roundtrip existing GDEF tables).

Codegen won't work everywhere: certain types have idiosyncrasies that require hand-written implementations. A major unknown is exactly how frequently this is the case, and this variable will impact how long this step takes. Ultimately these are the most complicated tables in the spec, but I would hope to have the basic compilation code working in 2-4 weeks of work (I'm largely on vacation for the next two weeks).

The repacker is the next major item: it is about 1000 lines of C++, but it should be possible to port it fairly directly, so I would hope to do that in a couple of weeks.

Finally we will need to integrate this into fea-rs. Most of this should be straight-forward, but iirc fea-rs does not currently handle chain rules, and it also does not handle the proposed additions to the FEA spec that add support for variable fonts. I'm not totally sure how much of a hassle that will end up being; similarly I expect that working on this will also help identify any issues with the API of the compilation types.

All in all, I believe that in 2-3 months we can have an oxidize-based, capable-if-rough-around-the-edges compiler for layout tables, which combined with @simoncozens' work on variable font compilation would give us the basis for an all-Rust font compiler.

finally

Does this make sense for everyone? Please let me know if there are any questions or concerns, or if anything should be clarified. I think this direction offers a good balance of "prove out some of these ideas" as well as "work towards important concrete goals"; the latter of these has been missing for me a bit lately, and I think it would be helpful.

rsheeter commented 2 years ago

solid proof that the oxidize project's compilation goals are viable

I'm uncomfortable with this because I don't believe there is much doubt here. We might produce something useful but we aren't really taking any risk out of the project by showing we can compile fonts quickly.

Edit:

combined with @simoncozens' work on variable font compilation

Can you add links? - not all readers will be familiar with this work

cmyr commented 2 years ago

solid proof that the oxidize project's compilation goals are viable

I'm uncomfortable with this because I don't believe there is much doubt here. We might produce something useful but we aren't really taking any risk out of the project by showing we can compile fonts quickly.

To clarify, the question I hope to answer is less "can we compile things" than "can we generate the serialization code and types automatically enough of the time to justify the work"?

Edit:

combined with @simoncozens' work on variable font compilation

Can you add links? - not all readers will be familiar with this work

The main projects of not here are fonticulus and fonttools-rs, which correspond approximately to fontmake/fonttools in pythonland.

rsheeter commented 2 years ago

Full disclosure: setting aside the hard parts (subsetting and shaping) for 3 months makes me a little nervous. I don't doubt there is plenty of work to figure out how best to compile but I'm pretty much convinced it's doable one way or another.

"can we generate the serialization code and types automatically enough of the time to justify the work"?

Can you elaborate as to why this is still a concern? What leaves us worried after being able to support GDEF? Is there a smaller experiment that would answer this question? Dare I hope we have enough available experts around to simply show them what our generation can do and then ask? - Behdad comes to mind :)

combined with [...]fonticulus and fonttools-rs, which correspond approximately to fontmake/fonttools in pythonland

They correspond approximately but to my understanding are nowhere near matching in scope or battle-testing; let's not sell FontTools short.

Perhaps I miss something obvious but I don't find it clear how this combination would work given that last I looked fonttools-rs used hand-crafted table IO that would need to be ripped out. That seems expensive, potentially costing as >= starting clean on your shiny new table IO and aiming to match fonttools with careful testing (ttx matching ala hb-subset is probably the best practice) to confirm it.

Proposal

Thinking "aloud," what if we sought to do two things, aiming to spend hand wavily half the time on each:

Demonstrate we can do some key part of subsetting with perf somewhat close to that of hb-subset doing the same work
- Naively I imagine feeding in a very simple font to each, carefully having subset out anything unsupported beforehand
- @garretrieger can advise us on what a good subset of subsetting to target is
Demonstrate we can generate enough of serialization + types to justify doing so, but more targeted/limited in scope than in OP

simoncozens commented 2 years ago

that last I looked fonttools-rs used hand-crafted table IO that would need to be ripped out

That hasn't been the case for quite a while. All the table IO is abstracted behind proc-macros and Colin has done a very good job of separating table interaction (fonttools-rs) from binary parsing/generation (otspec). It's only the otspec part which would "need" to be replaced. (And when I say "need", otspec already does what we want it do, which is generate binary font tables. Ripping out something which works and replacing it with something else which works counts as more of a "want" than a "need" for me...)

(As far as compiling outlines etc. using fonticulus is concerned; of course, you'll want to use oxidise for binary generation in the feature parser side.)

rsheeter commented 2 years ago

Is table IO able to use the same table IO Colin is working on for oxidize or would we have multiple implementations?

All the table IO is abstracted behind proc-macros

I'm still seeing things like https://github.com/simoncozens/fonttools-rs/blob/main/src/tables/fvar.rs#L91?

rsheeter commented 2 years ago

Another idea to confirm we can read/write might be rust ttx, naturally proceeds table by table and validates against fonttools.

simoncozens commented 2 years ago

I'm still seeing things like https://github.com/simoncozens/fonttools-rs/blob/main/src/tables/fvar.rs#L91?

True. OpenType is messy. Tables sometimes require out-of-band and/or out-of-order information to be properly deserialised.

rsheeter commented 2 years ago

@cmyr above seems to suggest a convenient way to spot good scenarios to see if your table IO can handle w/o hand-crafted code. @simoncozens what specifically causes fvar to require hand-crafted parsing? Edit: actually can we enumerate ALL the tables that required hand-crafted handling and why?

behdad commented 2 years ago

Another idea to confirm we can read/write might be rust ttx, naturally proceeds table by table and validates against fonttools.

+1

garretrieger commented 2 years ago

An important thing to note if you're planning on porting the harfbuzz table repacker algorithm is that you'll most likely need to add table splitting and extension lookup promotion which is not currently implemented in harfbuzz. They aren't really needed for the subsetting only case, but will very likely be needed for the compilation use case.

rsheeter commented 2 years ago

@cmyr wdyt about Rust ttx to and from xml as a test case?

cmyr commented 2 years ago

Sorry, in transit the past few days, but I'm in front of the computer today.

going last to first:

does ttx care about things like the ordering of subtables, or about the particular format of a given table? these are places where we might end up with equivalent functionality, but where a simple comparison would differ.
I will +1 the fact that certain tables are going to require manual handling, and it's mostly a matter of trying to minimize this. One example on the parse side is handling ValueRecord in GPOS: generally the ValueFormat is stored higher up in the tree, and we don't know how to interpret the bytes of a ValueRecord (or an array of them) without knowing the format. For instance, in PairPosFormat1 the 'PairValueRecord' table cannot deserialize itself, since it requires knowing the two format flags stored in it's grandparent.
Another issue, in the compilation case, is that I often don't think it makes much sense to construct/manipulate some tables directly. Take for instance a GDEF AttachList:

AttachList table

Offset16 coverageOffset uint16 glyphCount Offset16 attachPointOffsets[glyphCount]

It makes a lot of sense, during compilation, to just represent this as a sorted_map<glyphId, AttachPoint>; in particular this enforces the invariant that the list of glyphs and the number of AttachPoint tables line up, and are correctly associated; but this is difficult to autogenerate.

I'm going to add a followup reply shortly addressing @rsheeter's earlier larger comment, but don't want to hold up the discussion on that. :)

cmyr commented 2 years ago

Full disclosure: setting aside the hard parts (subsetting and shaping) for 3 months makes me a little nervous. I don't doubt there is plenty of work to figure out how best to compile but I'm pretty much convinced it's doable one way or another.

I'm happy to hear these concerns, and I think this is something we should talk about. A major thing I have been struggling with over the past few months is the sort of "blind men and the elephant" aspect of it, where it feels like different participants in the conversation have different specific focuses and concerns, and it is not always clear to me how best to balance these.

To try and enumerate these goals (coarsely):

fast & safe parsing, suitable for use by a shaper
rewriting & modifying existing font files, e.g. for subsetting or instancing
serializable types for use by a font compiler

As for specific goals, I think it makes sense to address shaping & subsetting separately.

for shaping: I'm confident that if we have efficient parsing types, we can build a shaping implementation on top of that. I think that @dfrg has demonstrated this; his hand-written code looks very similar to the code that I've been generating.

For subsetting, things are a bit more complicated. Subsetting is complicated. On the one hand, I am confident that if needed, we can have a hand-written subsetting implementation that works very much like hb-subset: basically it would only use the parse types, which would have 'subset' methods where they basically write out their subset-selves directly, streaming the bytes into a provided buffer, and then we would run a repacker on those serialized objects.

My plan A, though, would be to build subsetting on top of the serializable types; basically you parse the file, convert the tables from their read-only representations into some set of allocating, rust-native types (structs, maps, vecs, etc), do your subsetting on these, and then serialize them, using autogenerated serialization code. This will be slower than the first approach, but it might not be much slower (if most of the time is spent repacking, and that implementation is identical) and it should be much less hand written code, and generally more maintainable/understandable.

That said, I think that doing a subsetting implementation that is built on top of general-purpose compilation types would be a good test case, and would teach us a lot: we would get a good sense of performance, ergonomics, and correctness, and we would also need to have both parsing and serialization working.

proposal

We do a subsetter that only subsets one of GPOS/GSUB, and we check the generated table to what comes out of hb-subset, using ttx or something else for the comparison.
this uses generated serialization types
this will be slower than hb-subset, but it will be very useful to know if it is 'much slower' or 'very modestly slower'
this also helps push the rock up the hill in terms of getting us closer to the other goal of compiling otl tables.

Qs:

are there any annoyances with setting up harfbuzz to only subset a single table? do we maybe just have a preceding pass that drops other tables?
one thing that's an unknown to me is the bit with the 'subset plan', especially with constructing the closure over the inputs. Maybe I should have a chat with @garretrieger about this? I'd love to not worry about this for now, and maybe it would be a no-op in a font that didn't have GSUB?

behdad commented 2 years ago

one thing that's an unknown to me is the bit with the 'subset plan', especially with constructing the closure over the inputs.

Glyph closure is the task of expanding the glyphset to include all glyphs reachable from an initial glyphset. It's a rather tedious task when it comes to GSUB, but that's not the only place. Other tables that contribute to the glyph closure are glyf, CFF, math, COLR, and possibly other tables.

googlefonts / oxidize