adobe-type-tools / afdko

Adobe Font Development Kit for OpenType
https://adobe-type-tools.github.io/afdko/
Other
1.06k stars 167 forks source link

[tx] glyphs rendered by -pdf (and -ps) are incorrect #1760

Open wezm opened 2 weeks ago

wezm commented 2 weeks ago

This started happening in the 4.0 release I think. It was previously fine, now the outlines of the glyphs in many fonts are incorrect. E.g.

tx -pdf tests/data/fonts/sourcecodepro/SourceCodePro-Regular.ttf > example.pdf
Screenshot from 2024-11-12 10-13-16

Info

Host: Arch Linux x86_64 afdko: 4.0.2 built from source via AUR

$ tx -v
Versions:
    tx        1.3.0
    absfont   1.0.54
    ctutil    2.0.3
    dynarr    2.0.4
    sfntread  1.0.7
    t2cstr    1.0.23
    cffread   2.1.3
    cffwrite  1.0.56
    sfntwrite 1.0.6
    pstoken   2.0.11
    t1cstr    1.0.21
    t1read    1.0.45
    ttread    1.0.22
    cfembed   2.0.25
    pdfwrite  1.0.7
    svread    1.0.8
    t1write   1.0.35
    svwrite   1.1.11
    uforead   1.3.1
    ufowrite  1.1.0
    varread   1.0.8

example.pdf

skef commented 2 weeks ago

I suspect this is something I've seen before and relates to compiler optimization. If you're in a position to you might be able to confirm this by building e.g. tx and trying that out. If you have the build tools installed tis might work (from the top directory of the sources in the develop branch of this repo):

cmake -B build -GNinja -DCMAKE_BUILD_TYPE=Debug
cd build
ninja

The compiled executables would then be in build/bin.

Unfortunately when I saw this I looked at the code and didn't find any self-evident undefined behavior so it's not clear what the fix would be.

wezm commented 2 weeks ago

You're right, a debug build generates the right output:

Screenshot from 2024-11-12 10-43-06

Interestingly a Release build built with clang (18.1.8) is also fine. Release build built with gcc (14.2.1) and -fsanitize=undefined yields no sanitizer errors but produces the bad output.

Additionally building with gcc (13.3.0) the output is right in a release build... maybe it's a gcc issue?

CC=gcc-13 CXX=g++-13 cmake -B build -GNinja -DCMAKE_BUILD_TYPE=Release
skef commented 2 weeks ago

Well, on the one hand blaming one's compiler is almost always a recipe for getting egg on one's face. But on the other hand it sure seems like almost everything is pointing in that direction. The test suite doesn't pass when this problem gets compiled in so we have a pretty good idea of when it happens, and so far I've only seen it on Arch (and therefore on the bleeding edge of gcc releases).

When I first saw this and looked into the code I tentatively decided it was probably a compiler problem (because there really didn't seem to be anything in the ballpark of UB) that would quickly go away because someone would run into some form of it in a more flashy context and it would quickly get fixed. But it's been months now, so I guess we need to do something about it. I guess try to make a small repeatable case.

Still, very weird to run into this in this way. It makes me wonder what I'm missing.

skef commented 2 weeks ago

Would you be willing to add a note to AUR pointing to this issue and suggesting a different build strategy for the time being? (Presumably either debug or a different compiler ...)

skef commented 2 weeks ago

Might be worth seeing if -fno-strict-aliasing eliminates the problem.

wezm commented 2 weeks ago

Would you be willing to add a note to AUR pointing to this issue and suggesting a different build strategy for the time being? (Presumably either debug or a different compiler ...)

No problem, done: https://aur.archlinux.org/packages/afdko#comment-998167

wezm commented 2 weeks ago

Might be worth seeing if -fno-strict-aliasing eliminates the problem.

A build as follows still produced the incorrect output:

CFLAGS=-fno-strict-aliasing CXXFLAGS=-fno-strict-aliasing cmake -B build -GNinja -DCMAKE_BUILD_TYPE=Release
skef commented 2 weeks ago

Thank you for that note and the check! That's interesting -- so it's not the most obvious example from the gcc issue queue. I guess isolating a reproducible case is the way to go.

skef commented 2 weeks ago

Here's a small reproducible case. reprocase.zip

skef commented 2 weeks ago

New GCC bugzilla accounts are restricted right now so I'm waiting on an email request.

skef commented 2 weeks ago

See also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117546 . Looks like the cause of the problem has already been fixed but the fix is still percolating outward.

wezm commented 2 weeks ago

Thanks for investigating!