Closed trimpim closed 7 months ago
I'm no expert on linking either, yet since `__gcc_personality_v0` is part of `ld.lib.so`, shouldn't it suffice to add the whole-archive thing here:
https://github.com/genodelabs/goa/blob/2ec360cad0930998a03541d62e1c0a24a1559f0c/share/goa/lib/flags.tcl#L92
@ssumpf What's your take on this?
Each time the silver bullet whole archive is mentioned it feels worse to swallow that pill because I did not get the rationale yet.
@jschlatow f256f0124d2e761c53dfe0a7970f95b5b0f89a76 also fixes the problem for me.
This hopefully will not break #60. The libraries I get are < 2MB.
> Each time the silver bullet whole archive is mentioned it feels worse to swallow that pill because I did not get the rationale yet.
The rationale is that everything that is `.global hidden` in a library becomes `.local` after linking and is thus inaccessible to the dynamic linker. When one, for example, links `libgcc.a` without whole archive against a shared library, the static linker will find the `libgcc` symbols used by the shared library and create jump slot relocations for these symbols. Hence, when the library actually calls these symbols, the dynamic linker will not find them.
@trimpim, @jschlatow: I will have a look into this. @trimpim: Can I reproduce your scenario using https://github.com/trimpim/wasmedge-genode ? There doesn't seem to be a `pkg`.
@ssumpf I have just pushed the branch `test-231110`. This contains the `pkg` and a small `README.md`. With this you should be able to reproduce it.
To simplify your life, I suggest you also take the fix from #67 if you are using make 4.4.
> @ssumpf I have just pushed the branch `test-231110`. This contains the `pkg` and a small `README.md`. With this you should be able to reproduce it.
> To simplify your life, I suggest you also take the fix from #67 if you are using make 4.4.
@trimpim: Thanks, where do I find the `llvm` API that's set in `used_apis`?
@trimpim: Never mind, found it.
@trimpim: You need to change the `08-use_gcc_eh.patch` of `wasmedge` from
target_link_libraries(wasmedge_shared
PRIVATE
wasmedgeCAPI
+ # https://forums.developer.nvidia.com/t/undefined-reference-to-gcc-personality-v0/131127/3
+ gcc_eh
)
to
target_link_libraries(wasmedge_shared
PRIVATE
wasmedgeCAPI
+ # https://forums.developer.nvidia.com/t/undefined-reference-to-gcc-personality-v0/131127/3
+ -Wl,--whole-archive -Wl,-lgcc_eh -Wl,--no-whole-archive
)
because `__gcc_personality_v0` is part of `libgcc_eh.a` and not of our dynamic linker, which provides `__gxx_personality_v0` only. You need to do the `whole archive` thing because the symbol is `global .hidden`, as described for `libgcc` above.
In the meantime I will try to clean up the `libgcc` and `whole-archive` chaos a little and make sure that things do not break for you.
@jschlatow: The issue for `wasmedge` can be fixed by adjusting a patch in the project. Otherwise, I tried to clean up the `libgcc.a` and `whole-archive` problem with commit feea13c.
> because `__gcc_personality_v0` is part of `libgcc_eh.a` and not of our dynamic linker, which provides `__gxx_personality_v0` only. You need to do the `whole archive` thing because the symbol is `global .hidden`, as described for `libgcc` above.
As we use libgcc_eh (and libsupc++) in a creative way in cxx.mk it is literally part of the linker but suffers from the global-hidden condition too (if I understand correctly). Would it make sense to review symbols/ld and the linking of ld.lib.so to provide more symbols of the runtime that are currently missing to solve the issues we address here?
@ssumpf thanks for the fix.
With this change and your patch build and run of wasmedge works for me.
@jschlatow: ce9b91b tries to improve upon f219ab3 by removing the unnecessary detection of `libgcc` from the `qmake` support and moving `libgcc` to `ldlibs_common` for all builds.
@ssumpf Unfortunately, ce9b91b breaks _examples/hellorust.
Autoconf apparently also suffers (experienced while attempting to port gforth on Linux/ARM64 on my MNT-Reform).
With the common, the basic compile test fails because all symbols of libgcc end up in the binary twice. This is probably because `ldlibs_common` is passed to configure as both `LDLIBS` and `LIBS`. When just specifying `-lgcc`, this is no problem because one lib can appear any number of times using the `-l` argument w/o causing multiple symbol definitions. But the `whole-archive` option seems to force the linker to squeeze all symbols of the lib into the binary. If specified twice, the symbols are added twice, ending up with a "double defined symbols" error.
I sense that the wrapping of -lgcc in a `whole-archive` block is not what we generally want.
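The difference between repeating `-lgcc` and repeating a `whole-archive` block can be illustrated with a toy model. This is only a sketch of the general archive rule, not GNU ld's actual implementation; the `link` function, the `DuplicateSymbol` exception, and the two-member `LIBGCC` stand-in are hypothetical names made up for this example.

```python
# Toy model of static-linker archive handling (illustration only, not GNU ld).

class DuplicateSymbol(Exception):
    """Raised when two linked-in objects define the same symbol."""


def link(inputs):
    """Process inputs left to right.

    Each input is ("obj", (defs, undefs)), ("lib", members) or
    ("whole", members), where members is a list of (defs, undefs) pairs.
    Returns the sets of defined and still-undefined symbols.
    """
    defined, undefined = set(), set()

    def add_object(defs, undefs):
        clash = defined & set(defs)
        if clash:
            raise DuplicateSymbol(", ".join(sorted(clash)))
        defined.update(defs)
        undefined.difference_update(defs)
        undefined.update(set(undefs) - defined)

    for kind, payload in inputs:
        if kind == "obj":
            add_object(*payload)
        elif kind == "whole":
            # whole-archive: every member is linked in, needed or not
            for defs, undefs in payload:
                add_object(defs, undefs)
        elif kind == "lib":
            # plain -l: pull only members that satisfy a pending undefined symbol
            progress = True
            while progress:
                progress = False
                for defs, undefs in payload:
                    if undefined & set(defs):
                        add_object(defs, undefs)
                        progress = True
        else:
            raise ValueError(kind)
    return defined, undefined


# Hypothetical stand-ins for two libgcc members and a configure test program
LIBGCC = [(["__udivdi3"], []), (["__gcc_personality_v0"], [])]
MAIN = ("obj", (["main"], ["__udivdi3"]))

# -lgcc given twice: the second occurrence has nothing left to resolve
print(link([MAIN, ("lib", LIBGCC), ("lib", LIBGCC)]))

# whole-archive given twice: every member is added a second time
try:
    link([MAIN, ("whole", LIBGCC), ("whole", LIBGCC)])
except DuplicateSymbol as err:
    print("duplicate definition:", err)
```

In this model the second plain `-l` finds no pending undefined symbol and contributes nothing, whereas the second `whole-archive` block re-adds every member and trips over the first definition it meets.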
@ssumpf I also noticed that you removed the `-nostdlib` option. This is not good because without this option, a bunch of compiler heuristics kick in, which we don't want.
> @ssumpf I also noticed that you removed the `-nostdlib` option. This is not good because without this option, a bunch of compiler heuristics kick in, which we don't want.
As far as I understand it, this is covered by `-nostartfiles -nodefaultlibs -static-libgcc` in `ldlibs_common`. We could change that to `-nostdlib` and make it the same for everyone.
> Autoconf apparently also suffers (experienced while attempting to port gforth on Linux/ARM64 on my MNT-Reform).
> With the common, the basic compile test fails because all symbols of libgcc end up in the binary twice. This is probably because `ldlibs_common` is passed to configure as both `LDLIBS` and `LIBS`. When just specifying `-lgcc`, this is no problem because one lib can appear any number of times using the `-l` argument w/o causing multiple symbol definitions. But the `whole-archive` option seems to force the linker to squeeze all symbols of the lib into the binary. If specified twice, the symbols are added twice, ending up with a "double defined symbols" error.
> I sense that the wrapping of -lgcc in a `whole-archive` block is not what we generally want.
Okay, this one is new to me. In this case we want -lgcc for all binaries and the `whole-archive` for shared libraries only.
> As far as I understand it, this is covered by `-nostartfiles -nodefaultlibs -static-libgcc`
That's true - at least that was the rationale of 9dcadf757bc5f680457dc1c67ea1458cbee884d8. It seems that I missed adapting qmake.tcl in this respect. So it's good to remove this option. Could you do this in a separate commit?
I have added 39755ba to remove `-nostdlib` from `qmake.tcl` and made adjustments to use `ldlibs_common` for Qt5 apps as well (9f94761). With this, all the tests (including Rust), the Linphone-SDK (with a minor build-system check tweak for arm_v8a), my Qt5 scenarios, and wasmedge are working for me.
P.S. This also resolves the `hello_make` static constructor problem.
As we are again orbiting around whole-archive I took yesterday afternoon to get an idea of the actual situation and how several statements of the past fit into this picture.
1. GCC's `-static-libgcc` is ineffective in our configuration as `-nodefaultlibs` disables the desired libgcc magic. From the manpage: _Only the libraries you specify are passed to the linker, and options specifying linkage of the system libraries, such as -static-libgcc or -shared-libgcc, are ignored._
2. The size of libgcc is significant. In _examples/cmake_library_, _libforty_two.lib.so_ increases from 14928 to 347856 bytes with `--whole-archive`. This overhead applies to _all_ binaries incl. shared libs.
3. I could not find any _local_ or _hidden_ wizardry in the shared object link. From my investigation, it's just the ancient plain rule of linker command lines that applies here: missing symbols are resolved from the remainder of the arguments to the right (or inside `--start-group` .. `--end-group`). So, if `-lgcc` is forced to the end of the linker command line, it just works.
After the investigation I patched examples for the test ecf97e0b1ea14f2b2212b59c069a9d5cacb7d9db and sketched solutions for common flags e3f855749541d520afd526001cef0e861351a64f as well as cmake 459308b5413a53a45f4b04fdb5a7576da9afecd8. My question is now: Can we walk this road and, thus, wipe some myths and legends associated to this topic?
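The left-to-right rule described in the investigation above can be sketched with a small toy model. It is an illustration under simplifying assumptions (no `--start-group`, no shared-library specifics), not the algorithm of any real linker; `unresolved`, `SHARED_OBJ`, and the one-member `LIBGCC` stand-in are hypothetical names.

```python
# Toy left-to-right link-line scan (illustration only, not a real linker).

def unresolved(link_line):
    """Return the symbols that are still undefined after scanning link_line.

    Items are ("obj", defs, undefs) for object files and
    ("lib", [(defs, undefs), ...]) for static archives.
    """
    defined, undefined = set(), set()

    def add(defs, undefs):
        defined.update(defs)
        undefined.difference_update(defs)
        undefined.update(set(undefs) - defined)

    for item in link_line:
        if item[0] == "obj":
            add(item[1], item[2])
        else:
            # An archive only satisfies symbols that are undefined at the
            # moment it is visited, i.e. references made to its left.
            progress = True
            while progress:
                progress = False
                for defs, undefs in item[1]:
                    if undefined & set(defs):
                        add(defs, undefs)
                        progress = True
    return undefined


# Hypothetical stand-ins: one object that needs a libgcc helper
SHARED_OBJ = ("obj", ["forty_two"], ["__udivdi3"])
LIBGCC = ("lib", [(["__udivdi3"], [])])

print(unresolved([LIBGCC, SHARED_OBJ]))  # archive first: helper stays unresolved
print(unresolved([SHARED_OBJ, LIBGCC]))  # archive last: nothing left unresolved
```

With the archive in front, no reference exists yet when it is visited, so nothing is pulled and the later reference stays open. With the archive at the end, the pending reference is satisfied without any `whole-archive` wrapping.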
> P.S. This also resolves the `hello_make` static constructor problem.
Could you please tell us the nature of the problem? Which constructor was not called?
Also, 9f947611cad764ceb791ccf6283ede727e35fe53 changes flags.tcl en-passant but the commit message suggests changes (and effects) to qmake only, while all build systems are affected.
> As we are again orbiting around whole-archive I took yesterday afternoon to get an idea of the actual situation and how several statements of the past fit into this picture.
>
> 1. GCC's `-static-libgcc` is ineffective in our configuration as `-nodefaultlibs` disables the desired libgcc magic. From the manpage: _Only the libraries you specify are passed to the linker, and options specifying linkage of the system libraries, such as -static-libgcc or -shared-libgcc, are ignored._
> 2. The size of libgcc is significant. In _examples/cmake_library_, _libforty_two.lib.so_ increases from 14928 to 347856 bytes with `--whole-archive`. This overhead applies to _all_ binaries incl. shared libs.
> 3. I could not find any _local_ or _hidden_ wizardry in the shared object link. From my investigation, it's just the ancient plain rule of linker command lines that applies here: missing symbols are resolved from the remainder of the arguments to the right (or inside `--start-group` .. `--end-group`). So, if `-lgcc` is forced to the end of the linker command line, it just works.
>
> After the investigation I patched examples for the test ecf97e0 and sketched solutions for common flags e3f8557 as well as cmake 459308b. My question is now: Can we walk this road and, thus, wipe some myths and legends associated to this topic?
@chelmuth: I have tried your branch and it seems to work well in most cases. `qt5_quicktest` does not link for arm_v8a (undefined reference to `__aarch64_ldadd4_acq_rel`), and `linphone-simple` produces the same undefined reference for the library libservicecontrolplugin.lib.so. This can quickly be reproduced using my goa-projects (https://github.com/ssumpf/goa-projects - master branch). Note that this has always been a problem on arm_v8a only; I never saw it on x86.
@chelmuth: I will try to get -lgcc to the end of the linker command line for qmake next.
@chelmuth: Okay, -lgcc at the end of the linking command by hand works like a charm! Learned some ancient knowledge today ;) The only question that remains is how to convince the Qt5 build system to do so? @cproc: Do you have any suggestions?
It looks like `GENODE_QMAKE_LIBS` needs to be set as well, like in https://github.com/genodelabs/genode/blob/master/repos/libports/lib/import/import-qt5_qmake.mk.
> It looks like `GENODE_QMAKE_LIBS` needs to be set as well, like in https://github.com/genodelabs/genode/blob/master/repos/libports/lib/import/import-qt5_qmake.mk.
@cproc: Yes this does the trick :+1:
@jschlatow: The commits above (my staging branch) are hopefully the last ones regarding this issue. Thanks to our combined knowledge I am pretty happy with this solution and everything works as expected.
Thanks for the collaborative effort. I'm also happy with the result.
I merged the commits and force-pushed to staging to eliminate commit 81ced02.
When I try to run our `wasmedge` project with either 23.04 or 23.10 I get the following error:
This uses `libunwind`, `libcxx-abi` and `libcxx` from `llvm`, as well as `spdlog` and `wasmedge`.
These are all built using goa.
2ec360cad0930998a03541d62e1c0a24a1559f0c resolves the issue for me, but as I'm no expert on linking this might be far from optimal.
This might be related to #66 and the errors I encountered in https://github.com/genodelabs/genode-world/issues/342