Add subsetting experiments document

cmyr commented 2 years ago

Here are the results of my attempt to build a minimum viable illustration of a subsetter, on top of my oxidize work.

TL;DR: I'm happy with these results, and when @rsheeter is back later this month I suggest we get everyone together and try to figure out where we want this work to go, and what our priorities are.

In addition to this, I'm going to write up a separate short document focused specifically on summarizing my thoughts on packing & repacking; I'll share that later this week.

simbleau commented 2 years ago

> The design I've chosen here makes clear trade-offs; in particular it is trading off some speed in favour of improved ergonomics and robustness (it relies on much less hand-written code).
>
> If this trade-off ends up being unacceptable, it will always be possible to write a more bespoke implementation, which would operate directly on the parse types, and write out bytes directly, very similar to how hb-subset works. This would be more work and more error-prone, but it is a viable escape hatch.

Awesome results. I think you proved the separation of high-performance parse types and compilation types was a design choice that paid off.

You mentioned oxidize is roughly 50% slower than hb-subset and that you have made clear trade-offs.

I'm interested in your speculation. Would you speculate the performance lost could be reclaimable without having to reduce ergonomics? Could it be better? This is probably asking for too much speculation, but it may be worth justifying ahead of the sit-down with your team if these trade-offs will need to compromise oxidize's goals. A factor, for sure.

How likely is it that equal or better performance could be achieved without having to compromise goals (e.g. safety), or operate directly on parse types (side effects)?

cmyr commented 2 years ago

> I'm interested in your speculation. Would you speculate the performance lost could be reclaimable without having to reduce ergonomics? Could it be better? This is probably asking for too much speculation, but it may be worth justifying ahead of the sit-down with your team if these trade-offs will need to compromise oxidize's goals. A factor, for sure.
>
> How likely is it that equal or better performance could be achieved without having to compromise goals (e.g. safety), or operate directly on parse types (side effects)?

I think that any implementation that involves converting from the parse to the compile types is going to be at a close-to-insurmountable disadvantage compared to something that works with the parse types directly; it's just too much of a cost. That said, if we were to discard some of the current assumptions about how the subsetter should run (that it should be a single-shot tool that takes a raw font file as input, for instance) then I do believe we could produce a tool that would have amortized per-operation runtime lower than that of the current subsetter.
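
To make the amortization argument concrete, here is a minimal sketch of the two modes. It is not the oxidize API: every name in it (`GlyphTableView`, `OwnedGlyphTable`, `subset_single_shot`, `Subsetter`) is hypothetical, and the "font" is assumed to be nothing more than a flat array of big-endian `u16` records, which is far simpler than a real table. The point is only where the parse-to-compile conversion cost is paid.

```rust
// Hypothetical types for illustration only; none of these names come from oxidize.

/// "Parse type": a zero-copy, read-only view over big-endian u16 records.
pub struct GlyphTableView<'a> {
    data: &'a [u8],
}

impl<'a> GlyphTableView<'a> {
    pub fn new(data: &'a [u8]) -> Option<Self> {
        // Every record is two bytes, so an odd length is malformed input.
        (data.len() % 2 == 0).then(|| Self { data })
    }

    pub fn len(&self) -> usize {
        self.data.len() / 2
    }

    pub fn get(&self, i: usize) -> Option<u16> {
        let b = self.data.get(i * 2..i * 2 + 2)?;
        Some(u16::from_be_bytes([b[0], b[1]]))
    }

    /// Parse -> compile conversion. Costs O(table size) on every call.
    pub fn to_owned_table(&self) -> OwnedGlyphTable {
        OwnedGlyphTable {
            records: (0..self.len()).filter_map(|i| self.get(i)).collect(),
        }
    }
}

/// "Compile type": owned data that can be edited and serialized.
#[derive(Clone, Debug, PartialEq)]
pub struct OwnedGlyphTable {
    pub records: Vec<u16>,
}

impl OwnedGlyphTable {
    /// Keep only the records at the given indices; out-of-range indices are skipped.
    pub fn subset(&self, keep: &[usize]) -> OwnedGlyphTable {
        OwnedGlyphTable {
            records: keep.iter().filter_map(|&i| self.records.get(i).copied()).collect(),
        }
    }

    pub fn to_bytes(&self) -> Vec<u8> {
        self.records.iter().flat_map(|r| r.to_be_bytes()).collect()
    }
}

/// Single-shot mode: raw bytes in, raw bytes out, conversion paid per request.
pub fn subset_single_shot(font: &[u8], keep: &[usize]) -> Option<Vec<u8>> {
    Some(GlyphTableView::new(font)?.to_owned_table().subset(keep).to_bytes())
}

/// Amortized mode: convert once up front, then serve many subset requests.
pub struct Subsetter {
    table: OwnedGlyphTable,
}

impl Subsetter {
    pub fn new(font: &[u8]) -> Option<Self> {
        Some(Self { table: GlyphTableView::new(font)?.to_owned_table() })
    }

    pub fn subset(&self, keep: &[usize]) -> Vec<u8> {
        self.table.subset(keep).to_bytes()
    }
}
```

In this toy model, `subset_single_shot` repeats the full conversion for each request, while a long-lived `Subsetter` pays it once in `new`, so the per-request cost shrinks toward the cost of `subset` plus serialization as the number of requests grows. Whether the same holds for real fonts would depend on how expensive the conversion actually is relative to subsetting.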

googlefonts / oxidize

Add subsetting experiments document #31