Compile variable features

simoncozens commented 2 years ago

This is a work in progress. The idea is this: when we are compiling a variable font from designspace, most of the time the feature files are exactly the same; the only thing which differs is the "implicit" rules created from anchors and kerning. Currently we build each font independently, and this means we compile binary features N times (where N=number of masters), and then merge them at the end. When there are overflows, compiling binary features N times is slow. Instead, this PR holds off the compilation of GSUB/GPOS tables to the end, and uses "variable" feature writers which describe how the kerning and anchors vary across the designspace. i.e. instead of:

### MutatorMathTest-LightCondensed ###
lookup kern_ltr {
    lookupflag IgnoreMarks;
    pos A J -10;
} kern_ltr;

### MutatorMathTest-BoldCondensed ###
lookup kern_ltr {
    lookupflag IgnoreMarks;
    pos A J -20;
} kern_ltr;

and compiles the features twice, it produces

lookup kern_ltr {
    lookupflag IgnoreMarks;
    pos A J (wght=0:-10 wght=1000:-20);
} kern_ltr;

and compiles the features once.

It's a lot faster.

[x] Curs feature writer is untested.
[x] There's some problem with enumerated kerns.
[x] I'm not sure I trust it yet.

madig commented 2 years ago

So to trigger this, a font project must make sure to have the same exact feature file text in every source, meaning that you have to use variable feature syntax if there are differences in some parts of the Designspace?

simoncozens commented 2 years ago

Yes, it doesn’t attempt to do a text/AST-based merge of the feature files. At least not at this stage - maybe that can come later. I’m fairly sure that this will cover a good number of font projects, and those that it doesn’t get cover will get the old style merge.

madig commented 2 years ago

In one of our fonts, this seems to drop some kerning pairs but not others (can't check if this is just the kern splitting because the font doesn't compile without this PR) 🤔 weird, drilling some more.

Also, the Rules writer is not hooked up and justr dropping it into the default feature writer list doesn't seem to do anything.

madig commented 2 years ago

So, I get rules to work when I graft the GSUB <FeatureVariations> from a previous ufo2ft and adapt the lookup numbers. This means that the rules writer should do whatever varLib is doing I guess.

Notes:

The writer outputs values in design space. There seems to be no official spec for this so I'll need to check feaLib
The writer outputs integer locations, where Designspaces can have floating point ones.

simoncozens commented 2 years ago

I disabled the rules writer because varLib also handles designspace rules during the merge and I was ending up with them being emitted twice.

madig commented 2 years ago

Weird. For my test font, no GSUB variations are emitted without the rules writer.

madig commented 2 years ago

Now we do approximately what varLib is doing with rules, though compared to a reference, the numbers and lookups are off in some places. Maybe I use overlayFeatureVariations wrong. At least with inspection by eye, stuff seems to work as expected.

There still remains a difference in kern behavior I need to investigate.

madig commented 2 years ago

Also todo: The test should include disjoint kerning pairs for different scripts, to ensure we handle disjoint kerning in different masters correctly (i.e. Armenian only in one, Greek only in the other).

madig commented 2 years ago

I disabled the rules writer because varLib also handles designspace rules during the merge and I was ending up with them being emitted twice.

Oh, I know what's going on. My test sources happen to trigger the compile-features-once-but-later code path, so varLib adds feature variations and then we overwrite them when running the feature writers. This also seems to affect kerning.

madig commented 2 years ago

Next, my test source has e.g. an r vs d class-based kern that gets lost when compiling features later for VFs. 🤔

simoncozens commented 2 years ago

You're using non-public test sources, which makes it difficult for other people to repro... I'll see what I can find.

simoncozens commented 2 years ago

Note to self: _featuresCompatible will return false if there are any ligature caret positions, which glyphsLib writes into the feature file (with per-master-file variations). Need to think how to handle these.

madig commented 2 years ago

Woops, yeah. I'll be back at work next week and can see if I can make a reproducer. For now, maybe just compile the font with identical features with 2.28.0 and then with this branch and compare kerning.

I thought I turned ligature writing off? Should be in https://github.com/googlefonts/glyphsLib/releases/tag/v6.0.7

anthrotype commented 2 years ago

yeah, the GdefFeatureWriter in ufo2ft should handle both the GlyphClassDefs (using public.openTypeCategories in the ufo lib) and the LigatureCarets (from the glyph anchors named "caret", "vcaret")

madig commented 2 years ago

So, I tried to come up with a minimal reproducer, but so far, no luck. What did seem to work was to make a union of all kerning values over all sources, putting in missing ones as zero-kerns. The specific pair I was looking at was actually "missing" such a zero-kern in one corner of the design space. Maybe that's what's missing?

madig commented 2 years ago

varLib allows "sparse kerning", i.e. if Light and bold have a value but Regular doesn't, the value at Regular is interpolated rather than set to zero. So, filling in zero for missing kern pairs seems like the wrong thing to do.

madig commented 2 years ago

In my testing with a proprietary font, it seems the kerning is there, but it is not used, as if the shaper was stopping earlier. Subsetting the font to contain just the two glyphs I want to look at makes the kern work as expected. Will try to reproduce with an OSS font.

madig commented 2 years ago

Found an open font that repros: Roboto Serif with "py".

simoncozens commented 2 years ago

Here is a very minimal reproducer: RobotoSerif-Mini.glyphs.gz

madig commented 2 years ago

(The last commit should be using sets instead of lists for members in side(1|2)Groups but I don't want to anger the determinism gods)

simoncozens commented 2 years ago

With your patch and https://github.com/fonttools/fonttools/pull/2776, I think this is good now.

It has a slight change of behaviour compared to the past in respect of "empty" kern pairs, but I think this is for the better. If an intermediate master previously did not define a kern value for a pair which had kerning in other masters, we would assume that value was zero. For example, if you had Regular=-100, Bold=-200 and no kern defined in the SemiBold, you would get a kerning value of zero. Presumably designers would have realised that they didn't mean that, and put an explicit kern in there.

~~But now we can do better than that: if there is no explicit kern for an intermediate master, it will be interpolated. This should make people's kerning processes a lot faster...~~

simoncozens commented 2 years ago

The remaining test failures are due to the new behaviour described above.

simoncozens commented 2 years ago

OK, so one more bug we found: when backfilling kerns for missing masters, we shouldn't just use zero. varLib.merge also looks at more general rules in other masters and backfills using their value. e.g. if you have

enum pos @kern.1.o w -200;
pos @kern1.o @kern2.w -30;

in the regular master and

pos @kern1.o @kern2.w -20;

in the bold, you don't want to do this:

enum pos @kern.1.o w (wght=400:-200 wght=800:0); # No equivalent rule, use zero
pos @kern1.o @kern2.w (wght=400:-30 wght=800:-20);

because that ends up stealing the kern altogether. instead you want to do this:

enum pos @kern.1.o w (wght=100:-200 wght=400:-20); # A more general rule covers this pair, use it
pos @kern1.o @kern2.w (wght=400:-30 wght=800:-20);

madig commented 2 years ago

Talking to Jany, Harry and Cosimo, we need to figure out all cases of this and what should happen in each. 🤔 Maybe it is even not possible to keep the split-kerns-by-script code and make it work with the existing merge flow? We'd need to figure out what to do if one master does not have kerning for something, just like here, because otherwise varLib complains about https://github.com/googlefonts/ufo2ft/issues/642.

simoncozens commented 2 years ago

I don’t think there’s a mystery about what needs to happen.

If there is no exact equivalent kern pair in a master, the code should look to see if there is a kern provided by another class based rule, and if so backfill using that. If not it should use zero. This is exactly what varLib does in https://github.com/fonttools/fonttools/blob/main/Lib/fontTools/varLib/merger.py#L347-L355

madig commented 2 years ago

I think I am beginning to understand. This WIP commit seems to make basic text mostly match ufo2ft 2.28.0-compiled Roboto Serif, except for the "ft" pair, which seems to have a positive kern in the 2.28.0 version.

madig commented 2 years ago

Actually, can we reuse fontTools.ufoLib.kerning.lookupKerningValue? I see some other kerning diffs in Roboto Serif like in "fi", "f," and "øj", though.

madig commented 2 years ago

With the latest commits, I can only see kerning diffs for the following two pairs on my testing page on the default location of Roboto Serif:

øj
øy

The first is a glyph-to-class kern pair, the second a glyph-to-class exception pair.

madig commented 2 years ago

One thing to do would be to think about sparse kerning (SparseKern.zip) -- The Regular does not have a kerning value for AB. varLib will ensure there is no kerning value for it and the value will be interpolated, the latest commits here will insert a zero, making the kerning interpolate to zero in Regular. Sparse kerning should be supported.

simoncozens commented 2 years ago

I see a 20 unit kern for øj and øy with this PR. However, it looks like we have managed to make the build considerably slower.

~A large number of Already defined position for pair... messages too.~ (these are also in the original)

madig commented 1 year ago

I noticed that the code doesn't seem to take into consideration sparse layers (Designspace sources with a layerName). Maybe it should.

madig commented 1 year ago

Rebased on main, now to fix the tests... currently I don't know what's going on with them.

madig commented 1 year ago

Note to self: if the sources are kerned such that we have e.g. pos @kern1.hyphen @kern2.Cyrl.De (wght=100:-20 wght=900:20);, is it reasonable to interpolate the value at wght=400 to appease the variation model instead of inserting a zero? I believe so. Still, find out what varLib does.

madig commented 1 year ago

Looking at a flood of warnings about regrouped glyphs, but the code makes me think they're bogus... Still trying to figure them out.

madig commented 1 year ago

FWIW, this produces kerning different from the merge path. I think I have to work on kerning-validator to 1. support comparing kerning between two fonts and 2. replicate the behaviour of varlib in determining the kerning value of a pair in a design space.

madig commented 1 year ago

I'm beginning to suspect that there is something else going on during compilation. I have this big font here that has seemingly identical kerning when I compile the wght-only variant, but has "missing kerning" when I compile the bigger wght+wdth+ital variant. The kern I looked at had an entry in the debug feature file, but the wght+wdth+ital TTF had no kerning for it at some locations, but did at others. Need to find an open font I can use for repros.

madig commented 1 year ago

I think I figured out the problem, at least for Roboto Serif. Compiling it shows missing kerning for the text "g,", but only at some locations and not others.

I subset the Italic Designspace (the uprights don't compile) to have just space, g and comma (using https://github.com/googlefonts/gftools/blob/main/Lib/gftools/ufomerge.py to merge it into an empty font, i.e. subset it this way) and then compiled a VF. The debug feature file showed this (edited for brevity, GRAD=0,opsz=20,wdth=100,wght=400 is the default location, the other one some extreme end):

lookup kern_Latn {
    lookupflag IgnoreMarks;
    enum pos @kern1.Latn.g.alt comma (
        # ...
        GRAD=0,opsz=20,wdth=100,wght=400:0
        # ...
        GRAD=100,opsz=8,wdth=150,wght=900:-30
    );
    pos @kern1.Latn.g.alt @kern2.Latn.period (
        # ...
        GRAD=0,opsz=20,wdth=100,wght=400:-20
        # ...
        GRAD=100,opsz=8,wdth=150,wght=900:-40
        # ...
    );
} kern_Latn;

In the sources, the default location has a kerning pair

pos @kern1.Latn.g.alt @kern2.Latn.period -20

and the extreme end at GRAD=100,opsz=8,wdth=150,wght=900 has the kerning

enum pos @kern1.Latn.g.alt comma -30
pos @kern1.Latn.g.alt @kern2.Latn.period -40

Since the kerning exception is only present in a few sources, but not the default one, the current code inserts a zero. When shaping "g,", the shaper grabs the first pair that matches, sees a zero, and stops. If it looked at the default location source in isolation, it would not find the exception and go on to the class to class kerning, resulting in the expected -20.

So... what to do about it. I think we need to handle exceptions specially by backfilling not a zero or nothing if it does not exist in all locations but the value from "higher up". In this case, the class-to-glyph pair would carry all values of the class-to-class value but overwrite some with its own?

I think the lookupKerningValue from ufoLib would do that, but I used it before and it had the same problem, so maybe I held it wrong.

Maybe I could do an extra pass after having collected all kerning from all sources, to find kerning exceptions and backfill missing locations (i.e. all class-to-glyph and glyph-to-class kernings would inherit values from the class-to-class parent, then all glyph-to-glyph pairs would inherit from their class-to-glyph and glyph-to-class parents) 🤔

madig commented 1 year ago

I believe this script does what needs to be done: https://gist.github.com/madig/76567a9650de639bbff51ce010783790, both for variable feature compilation and the old merge-to-VF flow, and instance interpolation, unless I'm missing something @Hoolean. If so, I'd need to work this into kerning-validator as a model.

Hoolean commented 1 year ago

I believe this script does what needs to be done: https://gist.github.com/madig/76567a9650de639bbff51ce010783790, both for variable feature compilation and the old merge-to-VF flow, and instance interpolation, unless I'm missing something @Hoolean. If so, I'd need to work this into kerning-validator as a model.

The script's union kerning step largely maintains fontmake's variation model but we need to cover an additional case:

✅ For sources with kerning and/or GPOS features, the current behaviour is correct
:x: For other sources, the current behaviour will insert 0-value kerns at every pair, where the values should be interpolated at this source's position in the final font instead

If our goal is to make every kerning value explicit at the source-level, the latter case may be difficult to handle without re-implementing feature interpolation.

EDIT: We can treat kerning interpolation outside of sparse layers as a bug moving forwards, see comment directly below.

Hoolean commented 1 year ago

My above comment can be largely discounted after we revisited googlefonts/fontmake#998. Having discussed with @anthrotype, we only need to maintain interpolated kerning for sparse layers, and can drop the edge case of interpolated kerning for UFO sources that have an empty or absent kerning.plist and GPOS features:

The behaviour likely originates from a bug with how we invoke varLib;
The scenario is highly-specific and awkward to achieve (most sources have at least some GPOS features, e.g. even if implicitly from anchors);
Automatic testing of a corpus of fonts found no sources using this behaviour; and
Interpolation can be easily replicated by inserting explicit values, which is used in all other scenarios.

Sparse layers only contribute glyph outlines, thus neither contribute kerning nor participate in kerning interpolation; any kerning at their location will be interpolated at runtime from other kerning-contributing sources. All other sources have a kerning dictionary associated with them, and should participate in the kerning interpolation even if their kerning.plist is absent or empty; they should implicitly contribute zeros for the union of kerning pairs across sources.

This simplifies what we expect for kerning in the final font to:

Sparse layers: fully interpolated
All other layers: no interpolation

Hoolean commented 1 year ago

Back on the code! The last commit I pushed adopts parts of @madig's script, inserting values for the union of all pairs at every source location.

Before this commit we did not insert values at source locations where:

The source did not have an exactly matching entry (same specificity) for the kerning pair; and/or
A class-class pair had a value of 0.

As OpenType's variation model treats the absence of pairs differently to the DS+UFO kerning model, this meant that such pairs respectively:

Had interpolated values where they should have fallen back to the value of more general kerning entries; and/or
Had interpolated values where they should have had a value of 0.

This commit corrects this, shrinking the difference between our merge and variable-first feature compilation pipelines. There are now no obvious visual diffs between merge-based and variable-first kerning in Roboto Serif 🚀

Next steps

[ ] Roboto Serif still has a small file-size diff; we should see if we can grok it in TTX form
[ ] Update kerning-validator to test the kerning at all interesting locations in a VF, to validate our new output
[ ] Extract glyph/group 'backfill' code to an independent function, for use in other places
- e.g. fontmake also needs to handle implicit values for instantiation, so we could reuse our backfill code there to ensure the approaches are consistent
[ ] Consider which parts of the group alignment in Nikolaus' script we wish to include
[ ] Add tests and tidy

m4rc1e commented 1 year ago

I've just hit the dreaded https://github.com/googlefonts/fontmake/issues/894 issue with Merriweather VF so I don't mind getting through @Hoolean's outstanding tasks. This solution does seem better than the previous attempts so I don't mind investing some time into this.

Hoolean commented 1 year ago

@m4rc1e Hi Marc, happy to chat in more depth if this would be helpful for that :)

This PR – when completed – should indeed be able to avoid that fontMake issue, although it would depend on the feature files being identical across Merriweather's sources, probably relying on using variable syntax in some places :+1:

simoncozens commented 1 year ago

I actually need this functionality for a project now, so I've rebased this PR. However, I still see a few problems:

Mark anchors positioning (_top etc.) is not varied. (This is due to GlyphsLib having its own mark feature writer, which of course is currently not variable!)
~Something is wrong with cursive positioning. (I'll also try to fix this.)~ (Fixed
~Integration tests are failing and giving significantly fewer variation regions for me locally, although I notice they're succeeding on the CI?!~ This was due to me using a new version of fontTools. Integration tests will break in the future when we upgrade the dependencies, but that's not the fault of this PR!

anthrotype commented 1 year ago

maybe you and CI are using different fonttoolses?

anthrotype commented 1 year ago

CI is pinned at https://github.com/googlefonts/ufo2ft/blob/efb4658890ba645292a9c6bc055c91afe225c4a1/requirements.txt#L1

simoncozens commented 1 year ago

OK, those issues are fixed, but I found a new one: we've lost the public.openTypeCategories information and now marks are missing from my GDEF table.

anthrotype commented 1 year ago

we've lost the public.openTypeCategories information

is that stored at the single-master level? we should probably try fetch it from the base master or preferably from the DS.lib because that should really be a global kind of data that applies to all masters

simoncozens commented 1 year ago

Yeah, we can try both. OK, this is working for my purposes now. I'm not sure what the state of the kern debate is.

simoncozens commented 1 year ago

Something weird happening with (ligature) mark positioning, although the numbers look right for me in the variable feature file...

googlefonts / ufo2ft

Compile variable features #635

Next steps