googlefonts / fontc

Where in we pursue oxidizing (context: https://github.com/googlefonts/oxidize) fontmake.
Apache License 2.0
76 stars 13 forks source link

repeated builds are not deterministic #647

Closed anthrotype closed 8 months ago

anthrotype commented 9 months ago

ok the head.modified changes every time (and checksums would also change accoridngly), but I would not expect other things to differ when I compare ttx dumps of font files that were freshly built from the same source file using the same copy of fontc.

It turns out there are some differences in things like GDEF and GPOS (as noted in https://github.com/googlefonts/fontc/pull/644#discussion_r1425331084, this one possible regression introduced with #642), as well as in GSUB (in the FeatureParamsStylisticSet nameIDs).

Also when doing ttx -l to show the table directory, the gvar table seems to have different checksums each time round, even though the length is the same.. But diffing its ttx repesentations gives no differences, so it might be the bytes themselves are serialized differently each time, for some reasons.

anthrotype commented 9 months ago

forgot to mention - I was able to notice these diffs on repeated builds of GoogleSans.designspace, I didn't try other fonts. I could not detect them, for example, when building Oswald.glyphs

cmyr commented 9 months ago

In the past I have seen things like this arise when using collections that provide a non-deterministic iteration order (e.g. hashmaps) so that would be one thing I would look into first. I have done some work in the past in this regard (for instance #561) and it's very possible there are other places where we've missed this.

anthrotype commented 9 months ago

I think here it's also to do with the fact that we split some tasks in parallel threads that may complete at different times and then we gather their result in some other collection where the order does matter

cmyr commented 9 months ago

yes that is totally possible as well.

rsheeter commented 9 months ago

Good catch. Can we reproduce in a test with our test files? - that should make it easy to fix and keep fixed.

anthrotype commented 9 months ago

I don't think our test files contain enough kerning pairs to trigger this..

rsheeter commented 9 months ago

Let's fix this after we land https://github.com/googlefonts/fontc/pull/644. We can add a test file with lots of kerns or a knob to help it trigger with fewer kerns if necessary.

rsheeter commented 9 months ago

In IM @anthrotype noted "the GPOS vary in size between 106684 bytes and 4384362, an order of magnitude different.."; that's mildly alarming

rsheeter commented 9 months ago

Reproduction:

$ rm *.ttxl && for i in {0..9}; do rm -rf build/ && cargo run ../googlesans/source/GoogleSans/GoogleSans.designspace && ttx -l build/font.ttf > run$i.ttxl; done
$ grep -H GPOS run?.ttxl
run0.ttxl:    GPOS  0xE96ECCD8   4384230     18876
run1.ttxl:    GPOS  0x45C3186D   4384236     18880
run2.ttxl:    GPOS  0xF17589DD   4384230     18876
run3.ttxl:    GPOS  0xB16C1453    106684     15152
run4.ttxl:    GPOS  0xA721CEE8   4384236     18876
run5.ttxl:    GPOS  0x1E2BB2BF   4384230     18880
run6.ttxl:    GPOS  0xB8849B8C   4384236     18876
run7.ttxl:    GPOS  0xB16C1453    106684     15152
run8.ttxl:    GPOS  0x2B12F05E   4384194     18876
run9.ttxl:    GPOS  0x0E7DA5D9   4384188     18876
rsheeter commented 9 months ago

https://github.com/googlefonts/fontc/issues/647#issuecomment-1854748451 repro is fixed, new repro:

f=../googlesans/source/GoogleSans/GoogleSans.designspace; \
export SOURCE_DATE_EPOCH=$(date +%s); \
cargo build --release && rm -rf build/ \
&& target/release/fontc $f && cp build/font.ttf build/first.ttf \
&& target/release/fontc $f && cp build/font.ttf build/second.ttf \
&& diff -u <(fonttools ttx -o - -l build/first.ttf | tail -n +2) <(fonttools ttx -o - -l build/second.ttf | tail -n +2)

# GSUB and gvar report same length and offset but different contents

$ diff -u <(fonttools ttx -o - -t GSUB build/first.ttf | tail -n +2) <(fonttools ttx -o - -t GSUB build/font.ttf | tail -n +2)
# oh no, we're assigning inconsistent name ids, e.g.
-            <UINameID value="263"/>  <!-- Text Medium -->
+            <UINameID value="260"/>  <!-- Medium -->
rsheeter commented 9 months ago

The UINameID diff is fixed, table is now the same length, same offset, but binary different run to run. Steps to reproduce:

f=../googlesans/source/GoogleSans/GoogleSans.designspace; \
export SOURCE_DATE_EPOCH=$(date +%s); \
cargo build --release && rm -rf build/ \
&& target/release/fontc $f && cp build/font.ttf build/first.ttf \
&& target/release/fontc $f && cp build/font.ttf build/second.ttf \
&& diff -u <(fonttools ttx -o - -l build/first.ttf | tail -n +2) <(fonttools ttx -o - -l build/second.ttf | tail -n +2)

To dig further I have a fonttools hack. To view the code see https://github.com/fonttools/fonttools/compare/main...rsheeter:fonttools:main. To use it:

# Example assumes your currend directory is the root of fontc

# onetime setup
$ (cd .. && git clone git@github.com:rsheeter/fonttools.git fonttools_rs)
$ python3 -m venv venv
$ source venv/bin/activate
$ pip install -e ../fonttools_rs/
$ pip install absl-py

# Run the steps to reproduce (start of this comment). Should report GSUB mismatch, checksums differ, same length and offset
$ python ../fonttools_rs/Snippets/compare_table.py build/first.ttf build/second.ttf --noprint_eq
# should print a bunch of things with different offsets, abbreviated example follows
font["GSUB"].table.LookupList.Lookup[1383].SubTable[0].ExtSubTable.SubstLookupRecord[0].offset Value 86398 != 114938
font["GSUB"].table.LookupList.Lookup[1391].SubTable[1].ExtSubTable.SubstLookupRecord[0].offset Value 86838 != 98408
font["GSUB"].table.LookupList.Lookup[1395].SubTable[0].ExtSubTable.ChainSubRuleSet[0].ChainSubRule[2].SubstLookupRecord[0].offset Value 94840 != 113000
font["GSUB"].table.LookupList.Lookup[1395].SubTable[0].ExtSubTable.ChainSubRuleSet[0].ChainSubRule[3].SubstLookupRecord[0].offset Value 94854 != 113014
font["GSUB"].table.LookupList.Lookup[1395].SubTable[0].ExtSubTable.ChainSubRuleSet[1].ChainSubRule[0].SubstLookupRecord[0].offset Value 94784 != 112944
font["GSUB"].table.LookupList.Lookup[1395].SubTable[0].ExtSubTable.ChainSubRuleSet[1].ChainSubRule[1].SubstLookupRecord[0].offset Value 94798 != 112958
font["GSUB"].table.LookupList.Lookup[1408].SubTable[0].ExtSubTable.ChainSubClassSet[1].ChainSubClassRule[22].SubstLookupRecord[0].offset Value 79246 != 87712
rsheeter commented 9 months ago

Passing to @cmyr to dig into packing

anthrotype commented 9 months ago

https://github.com/googlefonts/fontations/pull/738 fixed indeterminism in GSUB table, however the gvar table is still changing each time fontc is run on GS (same length, different bytes/checksum) see #656

anthrotype commented 9 months ago

to clarify my previous comment, only the gvar table bytes are different, its high-level content as dumped with ttx is the same at each re-run.

anthrotype commented 8 months ago

looks like GDEF and GPOS still are not deterministic... :(

https://github.com/googlefonts/fontc/actions/runs/7501084544/job/20421031213?pr=674#step:12:32

they appeared to have changed length (and thus checksum) between the first and second build run

anthrotype commented 8 months ago

oh dear.. I can't reproduce locally. No idea what is going on :(

anthrotype commented 8 months ago

baging my head, still can't repro, but I can say that main also fails with the same issue, not just after https://github.com/googlefonts/fontc/pull/674 so the clippy fixes are unrelated to the GS repeatable build fail.

I just re-run the CI jobs that had previously passed on main after #673 got merged and now they failed: https://github.com/googlefonts/fontc/actions/runs/7501018748/job/20428708156

anthrotype commented 8 months ago

This is the shell script I'm using to attempt to reproduce locally what the CI is doing, I tried on both my glinux workstation and on my macbook pro, both are using the latest rust toolchain 1.75.0, I did cargo update, even cargo clean, I still can't reproduce, this script always ends without error and the files are identical. WTH is going on?!

#!/bin/bash

# clean up
rm -rf first-roman.ttf second-roman.ttf first-italic.ttf second-italic.ttf
rm -rf build

echo "Installing fontc"
(cd fontc && cargo install --path .)
which fontc

echo "Building the first roman"
time fontc ../googlesans/source/GoogleSans/GoogleSans.designspace
mv build/font.ttf ./first-roman.ttf

echo "Building the first italic"
time fontc ../googlesans/source/GoogleSans/GoogleSans-Italic.designspace
mv build/font.ttf ./first-italic.ttf

echo "Building the second roman"
time fontc ../googlesans/source/GoogleSans/GoogleSans.designspace
mv build/font.ttf ./second-roman.ttf

echo "Building the second italic"
time fontc ../googlesans/source/GoogleSans/GoogleSans-Italic.designspace
mv build/font.ttf ./second-italic.ttf

echo "diff {first,second}-roman.ttf"
diff -u <(ttx -l first-roman.ttf | tail -n +2) <(ttx -l second-roman.ttf | tail -n +2)
cmp first-roman.ttf second-roman.ttf

if [ $? -ne 0 ]; then
  echo "error: non repeatable build :("
fi

echo "diff {first,second}-italic.ttf"
diff -u <(ttx -l first-italic.ttf | tail -n +2) <(ttx -l second-italic.ttf | tail -n +2)
cmp first-italic.ttf second-italic.ttf

if [ $? -ne 0 ]; then
  echo "error: non repeatable build :("
  exit 1
else
  echo "success! :)"
  exit 0
fi
anthrotype commented 8 months ago

i'm going crazy chasing this heisenbug 🤯

In #676 I added a shell script (so that I can try to run the same commands locally and on CI) that builds a font twice and diff the ttx -l and cmp the binary fonts but now I can't seem to reproduce the failure any more on the CI even (locally, I never managed to repro).

anthrotype commented 8 months ago

and to add insult to injury, my repeatable-builds.sh script is now failing with Oswald, complains the gvar bytes changed!

$ resources/scripts/repeatable-builds.sh ../OswaldFont/sources/Oswald.glyphs
$ which fontc
/usr/local/google/home/clupo/.cargo/bin/fontc
$ fontc ../OswaldFont/sources/Oswald.glyphs -o build/Oswald.ttf
$ fontc ../OswaldFont/sources/Oswald.glyphs -o build/Oswald#1.ttf
$ ttcmp build/Oswald.ttf build/Oswald#1.ttf
build/Oswald.ttf build/Oswald#1.ttf differ: byte 177, line 3
--- /dev/fd/63  2024-01-12 16:03:31.330910933 +0000
+++ /dev/fd/62  2024-01-12 16:03:31.330910933 +0000
@@ -10,7 +10,7 @@
     cmap  0x8D57590D      6212     38860
     fvar  0x8D346992        84     45072
     glyf  0xDEF86C47     43674     45156
-    gvar  0x50120D8C     71016     88832
+    gvar  0xD0128D8C     71016     88832
     head  0x1ACDE23F        54    159848
     hhea  0x08AE0978        36    159904
     hmtx  0xFFDF6CD1      3532    159940
Error: build/Oswald.ttf and build/Oswald#1.ttf are different

But only sometimes.. 😢

$ resources/scripts/repeatable-builds.sh ../OswaldFont/sources/Oswald.glyphs
Success: build/Oswald.ttf and build/Oswald#1.ttf are identical
rsheeter commented 8 months ago

I can reproduce GS build instability locally as follows:

rm -f *.ttxl threads*.svg && for i in {0..4}; do rm -rf build/ && cargo run ../googlesans/source/GoogleSans/GoogleSans.designspace --emit-timing && cp build/threads.svg ./threads_$i.svg && ttx -l build/font.ttf > run$i.ttxl && grep GPOS *.ttxl; done

Looking at the threads_#.svg for a failure (small GPOS) shows we are running gather BE kerning before we generate it:

0ms (0.00%) Be(GatherBeKerning) run at 7527ms done at 7528ms

195ms (1.81%) Be(KernFragment(0)) run at 7690ms done at 7886ms
...etc...
anthrotype commented 8 months ago

I managed to isolate the gvar indeterminism, it's the shared tuples array that may end up being sorted differently each time because we are using a HashMap to count how many times a given tuple is shared among glyph tuple variations, and then use sort_unstable_by_key to take the max, but the sorting method does not preserve the order of equal elements (and HashMap also is inherently unsorted so iteration order isn't guaranteed either). If I use IndexMap::sort_by_key the order is stable and builds are reproducible again. I'll send a PR shortly.