ldc-developers / ldc

The LLVM-based D Compiler.
http://wiki.dlang.org/LDC
Other
1.2k stars 260 forks source link

Regression: 1.29 fails to compile gir-to-d while 1.28 succeeded - undefined reference to `_D6object__T8opEqualsTxC3gtd11GirFunctionQnTxQwZQBkFxQBexQBiZb' #4000

Open ximion opened 2 years ago

ximion commented 2 years ago

Hi!

After a bunch of unsuccessful attempts at trying to solve this in gir-to-d and finding no issue, I am pretty sure that this is some kind of compiler issue (maybe in the frontend even, but I unfortunately don't have the time at the moment to figure that out, which is also the reason for not providing a smaller testcase - sorry for that!). To reproduce the issue, try to compile gir-to-d with LDC 1.29.x:

git clone https://github.com/gtkd-developers/gir-to-d.git
cd gir-to-d
mkdir b && cd b
DC=/path/to/ldc-1.29/ldc2 meson .. && ninja

This results in a linker error while 1.28.x built this successfully:

ldc2  -of=girtod girtod.p/source_girtod.d.o girtod.p/source_gtd_DefReader.d.o girtod.p/source_gtd_GlibTypes.d.o girtod.p/source_gtd_GirAlias.d.o girtod.p/source_gtd_GirConstant.d.o girtod.p/source_gtd_GirEnum.d.o girtod.p/source_gtd_GirField.d.o girtod.p/source_gtd_GirFunction.d.o girtod.p/source_gtd_GirPackage.d.o girtod.p/source_gtd_GirStruct.d.o girtod.p/source_gtd_GirType.d.o girtod.p/source_gtd_GirVersion.d.o girtod.p/source_gtd_GirWrapper.d.o girtod.p/source_gtd_IndentedStringBuilder.d.o girtod.p/source_gtd_Log.d.o girtod.p/source_gtd_LinkedHasMap.d.o girtod.p/source_gtd_WrapException.d.o girtod.p/source_gtd_XMLReader.d.o -L=--allow-shlib-undefined -link-defaultlib-shared
/usr/bin/ld: girtod.p/source_gtd_GirPackage.d.o: in function `_D3gtd12LinkedHasMap__T13LinkedHashMapTAyaTCQBq11GirFunctionQnZQBo4Node11__xopEqualsMxFKxSQDkQDj__TQCyTQCmTQCmZQDkQBwZb':
/tmp/gir-to-d/b/../source/gtd/LinkedHasMap.d:27: undefined reference to `_D6object__T8opEqualsTxC3gtd11GirFunctionQnTxQwZQBkFxQBexQBiZb'
collect2: error: ld returned 1 exit status
Error: /usr/bin/cc failed with status: 1
ninja: build stopped: subcommand failed.

That function declaration exists though, but somehow it is stripped from the generated object files. This was tested on Debian 12 and first reported here: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1011739 (and kind of now caused the removal of all LDC-dependent D code from Debian, which is a bit annoying (but not critical, we can get everything back for the final release :-) ))

kinke commented 2 years ago

Have you tried a dub build instead of meson, seeing that the repo has a regular dub.json? Dub's all-at-once builds (per dub package) are waaay less likely to hit all sorts of missing symbol errors with meson per-module compilations, caused by attribute inference or template emission issues in the frontend. And should yield a faster optimized binary for executable targetType, due to implicit cross-module optimizations across the whole project (compiled to single object file).

ximion commented 2 years ago

It looks like it builds with dub, but we can not use dub for building distribution packages because it integrates terribly with distro packaging (and not at all with mixed-code projects or projects that need to install artifacts to system locations). Meson has a unity-build feature, but that's not implemented for the D backend (and splitbuilds should at least build...). Btw, this build was done without any optimizations or extra flags, the build command-line is e.g.: ldc2 -I=girtod.p -I=. -I=.. -I=../source -enable-color -wi -g -d-debug -J/tmp/gir-to-d/b -makedeps=girtod.p/source_gtd_LinkedHasMap.d.o.deps -of=girtod.p/source_gtd_LinkedHasMap.d.o -c ../source/gtd/LinkedHasMap.d (same issue happens on optimized builds)

JohanEngelen commented 2 years ago

It would help a lot if you can dustmite this to a reasonable size. The way to do it: make a shell script that reproduces the issue (several LDC invokes for the individual files, and then a final invoke to link the executable). The idea is to make that script succeed if the issue is reproduced, and fail if it is not. You can use grep to check that the error message is there during linking. Then use dustmite to minimize it automatically (may take a long time, several hours or days). https://github.com/CyberShadow/DustMite/wiki#tutorial

ximion commented 2 years ago

Sorry, I am really drowning in work so I only got to creating a minimized test case now. Please find the dustmite result below. To compile, make sure meson is installed (via your package manager, any Meson version should work fine here) and run DC=ldc2 quickbuild.sh from the source directory (or just build the thing as you would normally compile with meson). I hope this helps :-) gtd-issue4000.reduced.zip

kalev commented 2 years ago

I've been holding back LDC updates in Fedora because of this issue, for what it's worth.

kinke commented 2 years ago

Well fighting with missing symbols is unfortunately a constant pain with D, especially with separate compilation. Most of the time, -allinst now suffices to fix those. These are (old and tricky) frontend bugs, not LDC specific; in fact, I see much more missing symbols with DMD than LDC (probably caused by different TypeInfo emission strategies). So I'm afraid this is very likely not going to be fixed anytime soon.

kalev commented 2 years ago

Oh, indeed, that nicely works around the build issue. Thanks, @kinke!

ximion commented 2 years ago

Last time a bug like this appeared, it was actually fixed in reasonably-ish time, and fortunately we don't seem to run into them that frequently. This bug however did cause the removal of pretty much all D code that doesn't use GDC from Debian, so if -allinst works as workaround, that would help a lot :-)

kinke commented 2 years ago

FWIW, I can confirm that this happens with DMD v2.100.1 too, and also with dub build --build-mode=singleFile, but dub build (compiling all modules in one compiler invocation) works. - I've very quickly looked for a similar meson option and found their 'unity' build option (meson --unity on) - but that's apparently not supported for D yet (tested v0.61.2). :( - Oh, mentioned above already.

If you guys cannot use dub, then reggae (edit: reggae && ninja - works fine by default too) is probably no option either, because it depends on dub and uses it as a library, incl. using/populating the dub packages cache (edit: but only for the sources, no binary artifacts) if that's what's blocking you. But their dlang support is obviously miles ahead of meson.

kalev commented 2 years ago

OK, ldc 1.30 update is done for Fedora 37 and all dependent packages rebuilt. Thanks for the help!

https://bodhi.fedoraproject.org/updates/FEDORA-2022-d2d6bccfaf

ximion commented 2 years ago

Looks like GDC 12.x suffers from the exact same issue - which does make sense with the original problem being in the frontend. Passing -fall-instantiations to GDC also works around the problem, just like with LDC.

ximion commented 2 years ago

Ooof, something is seriously wrong with the DMD frontend... Many packages using GDC just starting to fail, including Dustmite which is compiled in one go:

gdc -odustmite \
        -g -O2 -fstack-protector-strong \
        -Wl,-z,relro \
        -Wall \
        *.d
/usr/bin/ld: /tmp/ccrtJBRD.o: in function `_D3std4math9algebraic__T8nextPow2TmZQmFNaNbNiNfxmZm':
/usr/lib/gcc/x86_64-linux-gnu/12/include/d/std/math/algebraic.d:721: undefined reference to `_D3std4math9algebraic__T15powIntegralImplVEQBpQBoQBm7PowTypei1TxmZQBqFNaNbNiNfxmZxm'

Passing -fall-instantiations to GDC fixes this issue as well. I bet that LDC is also affected by this. @ibuclaw @kinke I guess these issues should be taken upstream somehow, but is there any chance that we could include some of these things in testing so bugs like this hit stable releases less often? Or do you think that all of this is a lost battle and can't be fixed? (I sure hope not...)

kinke commented 2 years ago

I can't say anything about GDC (other than the jump from v2.076 to v2.100 obviously being huge), but we haven't had any problems with dustmite bundled with LDC (FWIW, from the dlang/tools repo, not upstream dustmite). And missing symbols fortunately aren't that common with all-at-once or (reggae's) per-package builds here at work.

The dlang/tools version of dustmite is already CI-tested upstream for all relevant repos (DMD + druntime, Phobos) via Buildkite (e.g., https://buildkite.com/dlang/dmd/builds/27674#018267d8-3e58-495f-8324-85c02ee5a83f). Adding a project to be tested there is quite simple: https://github.com/dlang/ci

ibuclaw commented 2 years ago

I can't say anything about GDC (other than the jump from v2.076 to v2.100 obviously being huge), but we haven't had any problems with dustmite bundled with LDC (FWIW, from the dlang/tools repo, not upstream dustmite). And missing symbols fortunately aren't that common with all-at-once or (reggae's) per-package builds here at work.

Looks like a bug in the front-end to me, possibly from your aggressive template pruning changes.

powIntegralImpl is marked pragma(inline), so should never be pruned by the front-end.

kinke commented 2 years ago

powIntegralImpl is marked pragma(inline), so should never be pruned by the front-end.

The culling doesn't and AFAIK has never taken that into account; it's the glue layer that needs to handle it (for LDC, we need to emit those into every referencing 'object file', with extra annotations).

ibuclaw commented 2 years ago

powIntegralImpl is marked pragma(inline), so should never be pruned by the front-end.

The culling doesn't and AFAIK has never taken that into account; it's the glue layer that needs to handle it (for LDC, we need to emit those into every referencing 'object file', with extra annotations).

On a little more inspection, it looks like the reverse has happened instead.

In 2.098.x, nextPow2 was culled, since 2.099.x nextPow2 is now emitted, but it's dependency powIntegralImpl isn't.

https://d.godbolt.org/z/539TqMz9j https://d.godbolt.org/z/1PW9cab5Y

The only other user of nextPow2 is std.uni, but import that instead of std.regex and you don't see the culling behaviour.

kinke commented 2 years ago

but it's dependency powIntegralImpl isn't.

Clearly is for LDC, it's inlined into nextPow2 (even at -O0), but doesn't make it to the object file itself.

arrowd commented 1 year ago

I just bumped into this issue with LDC 1.30 and gir-to-d on FreeBSD. What's the workaround for this?