xmake-io / xmake

🔥 A cross-platform build utility based on Lua
https://xmake.io
Apache License 2.0
9.97k stars 780 forks source link

Cannot link to system libraries due to ABI mismatch when modules are used #3855

Open DavidTruby opened 1 year ago

DavidTruby commented 1 year ago

Xmake Version

2.7.9

Operating System Version and Architecture

Gentoo Linux

Describe Bug

When modules are enabled, xmake unconditionally passes a flag switching g++ to use the old pre-C++11 abi: https://github.com/xmake-io/xmake/blob/68153d06e61342aa2be927116c9b7d4e40125945/xmake/rules/c%2B%2B/modules/modules_support/gcc.lua#L77 This causes failures to link, and also sometimes runtime failures, due to ABI mismatch as Linux distrobutions build with the new ABI. For example if calling a library function that takes a std::string from a library compiled without this flag you will either get a linker failure, or worse a runtime segfault, because std::string has a different implementation with or without the flag.

The old ABI is broken and non-standards conformant so it shouldn't really be being used anyway. If a specific ABI needs to be forced for some reason, we should flip the flag and pass -D_GLIBCXX_USE_CXX11_ABI=1 to force the new ABI instead.

Expected Behavior

Successful linkage with system libraries because both are using the new ABI

Project Configuration

xmake.lua:

add_rules("mode.debug", "mode.release")

set_languages("c++latest")

add_requires("llvm", {system = true, kind = "library"})

target("test")
    set_kind("binary")
    add_files("src/*.cpp")
    add_packages("llvm")
    set_policy("build.c++.modules", true)

src/main.cpp:

import <llvm/Support/raw_ostream.h>;
import <string>;

int main() {
  std::string s = "hello world";
  llvm::outs() << s << "\n";
}

Additional Information and Error Logs

No response

waruqi commented 1 year ago

just solve this issue. https://github.com/xmake-io/xmake/issues/2716#issuecomment-1225057760

DavidTruby commented 1 year ago

I think that fix isn't correct: that's what is introducing the -D_GLIBCXX_USE_CXX11_ABI=0 that forces the old ABI. The problem with that is e.g. my libLLVM.so on my system is built with the new ABI (as it would be on any linux distro since about 2014) so if I link to that it breaks. I think the fix should be -D_GLIBCXX_USE_CXX11_ABI=1 instead to force the new ABI

DavidTruby commented 1 year ago

I tried making that change myself and it seems to cause other issues but I'm not sure why... Forcing the old ABI will cause issues other than linkage though (for example, the old ABI is not thread safe)

DavidTruby commented 1 year ago

Actually, that was because the cache was not cleared. Setting the new ABI as I mentioned above does seem to work, at least on my system.

waruqi commented 1 year ago

I think that fix isn't correct: that's what is introducing the -D_GLIBCXX_USE_CXX11_ABI=0 that forces the old ABI. The problem with that is e.g. my libLLVM.so on my system is built with the new ABI (as it would be on any linux distro since about 2014) so if I link to that it breaks. I think the fix should be -D_GLIBCXX_USE_CXX11_ABI=1 instead to force the new ABI

But it still will break it in previous issue. https://github.com/xmake-io/xmake/issues/2716#issuecomment-1225057760

this example: https://github.com/xmake-io/xmake/files/9404087/ModuleTest.zip

ruki@2c6dd6a46935:/mnt/ModuleTest/01_Person$ rm -rf build; xmake -rv
[  0%]: generating.module.deps src/test.cpp
checking for flags (-D_GLIBCXX_USE_CXX11_ABI=1) ... ok
/usr/bin/gcc -E -x c++ -m64 -std=c++20 -D_GLIBCXX_USE_CXX11_ABI=1 src/test.cpp -o build/.gens/hello/linux/x86_64/release/rules/modules/cache/2d9b7c0b/test.cpp.i
[  0%]: generating.module.deps src/Person.cppm
/usr/bin/gcc -E -x c++ -m64 -std=c++20 -D_GLIBCXX_USE_CXX11_ABI=1 src/Person.cppm -o build/.gens/hello/linux/x86_64/release/rules/modules/cache/2d9b7c0b/Person.cppm.i
[ 12%]: compiling.headerunit.release string
/usr/bin/gcc -m64 -std=c++20 -fmodules-ts -fmodule-mapper=/tmp/.xmake999/230618/_9023844D671F411082EC3CD3F382D320 -D_GLIBCXX_USE_CXX11_ABI=1 -c -x c++-system-header string
[ 12%]: compiling.headerunit.release iostream
/usr/bin/gcc -m64 -std=c++20 -fmodules-ts -fmodule-mapper=/tmp/.xmake999/230618/_75DDC8EEC1CF493087E45CB08170DA30 -D_GLIBCXX_USE_CXX11_ABI=1 -c -x c++-system-header iostream
[ 37%]: compiling.module.release person
/usr/bin/gcc -c -x c++ -m64 -std=c++20 -fmodules-ts -fmodule-mapper=build/hello/mapper.txt -D_GLIBCXX_USE_CXX11_ABI=1 -o build/.objs/hello/linux/x86_64/release/src/Person.cppm.o src/Person.cppm
[ 62%]: compiling.release src/test.cpp
/usr/bin/gcc -c -m64 -std=c++20 -fmodules-ts -fmodule-mapper=build/hello/mapper.txt -D_GLIBCXX_USE_CXX11_ABI=1 -o build/.objs/hello/linux/x86_64/release/src/test.cpp.o src/test.cpp
[ 75%]: linking.release hello
/usr/bin/g++ -o build/linux/x86_64/release/hello build/.objs/hello/linux/x86_64/release/src/test.cpp.o build/.objs/hello/linux/x86_64/release/src/Person.cppm.o -m64
/usr/bin/ld: build/.objs/hello/linux/x86_64/release/src/test.cpp.o: in function `main':
test.cpp:(.text+0x15a): undefined reference to `Person::getLastName() const'
/usr/bin/ld: test.cpp:(.text+0x190): undefined reference to `Person::getFirstName() const'
collect2: error: ld returned 1 exit status

only -D_GLIBCXX_USE_CXX11_ABI=0 can fix it.

DavidTruby commented 1 year ago

What OS/version are you still seeing these issues on? I can take a look into it. I really should emphasise though, setting -D_GLIBCXX_USE_CXX11_ABI=0 is not a fix. Perfectly valid programs will silently fail in unexpected ways at runtime if the old ABI is forced, because the old ABI is fundamentally broken. This will especially be true if threads are involved. Failing to build at all would be better than silently introducing runtime failures or incorrect behaviour.

DavidTruby commented 1 year ago

I get the same issue when building a very simple test case without xmake just using the command line, and without using module header units for string or iostream. I think there's a bug in GCC with it specifying the wrong ABI in the module file when std::string is involved, although I'm surprised nobody would have noticed this before.

waruqi commented 1 year ago

What OS/version are you still seeing these issues on? I can take a look into it. I really should emphasise though, setting is not a fix. Perfectly valid programs will silently fail in unexpected ways at runtime if the old ABI is forced, because the old ABI is fundamentally broken. This will especially be true if threads are involved. Failing to build at all would be better than silently introducing runtime failures or incorrect behaviour.-D_GLIBCXX_USE_CXX11_ABI=0

ubuntu 22.10

Failing to build at all would be better than silently introducing runtime failures or incorrect behaviour.-D_GLIBCXX_USE_CXX11_ABI=0

But at least the previous problem was fixed and is working fine, with no further unusual feedback from users.

If set -D_GLIBCXX_USE_CXX11_ABI=1 or remove it, although it may fix your problem, it will also break other users' projects before.

Unless there is a better solution that solves both problems, I'm not going to change it for now.

DavidTruby commented 1 year ago

What I'm trying to say is that setting this flag doesn't actually fix the problem for anyone. It may allow programs built with it to compile but they'll just fail at runtime, often in subtle and difficult to debug ways. In my opinion silent runtime errors/incorrect semantics are worse than a compile error.

The issue is a bug in gcc that needs fixing on their side, but working around it by silently introducing bugs into user code doesn't seem worthwhile until the gcc bug is fixed. At the very least there should be a warning that this flag is being added, because it breaks C++ semantics.

waruqi commented 1 year ago

What I'm trying to say is that setting this flag doesn't actually fix the problem for anyone. It may allow programs built with it to compile but they'll just fail at runtime, often in subtle and difficult to debug ways. In my opinion silent runtime errors/incorrect semantics are worse than a compile error.

At least in my case, it runs fine too.

The issue is a bug in gcc that needs fixing on their side, but working around it by silently introducing bugs into user code doesn't seem worthwhile until the gcc bug is fixed. At the very least there should be a warning that this flag is being added, because it breaks C++ semantics.

Since it is the result of a bug in gcc, I will only consider setting it up and turning it off automatically based on the gcc version once it is fixed by gcc and a new release is available. Until then, I'll have to keep this setting for now.

But I've added a new policy that allows you to optionally turn on new abi.

set_languages("c++20")
set_policy("build.c++.gcc.modules.cxx11abi", true)
target("hello")
    set_kind("binary")
    add_files("src/*.cpp", "src/*.cppm")
mauro-balades commented 3 months ago

any progress on this?

DavidTruby commented 2 months ago

This should be fixed in gcc14 I believe, I haven't tested it with xmake yet though.