NixOS / nixpkgs

Nix Packages collection & NixOS
MIT License
18.08k stars 14.13k forks source link

Some `libc++abi` symbols are missing with LLVM12+'s `stdenv` on Darwin #166205

Closed rrbutani closed 7 months ago

rrbutani commented 2 years ago

Problem

When using stdenvs (or even just libc++ versions, really) from llvmPackages corresponding to LLVM 12 and newer on macOS, symbols related to exceptions (such as __cxa_allocate_exception) are not found by the linker.

This can be reproduced by compiling and linking any C++ program that makes use of exceptions, i.e.:

clang++ -xc++ - <<<"int main() { throw 0; return 5; }" -o /tmp/foo

Results in:

Undefined symbols for architecture arm64:
  "___cxa_allocate_exception", referenced from:
      _main in --bbc70b.o
  "___cxa_throw", referenced from:
      _main in --bbc70b.o
ld: symbol(s) not found for architecture arm64
clang-13: error: linker command failed with exit code 1 (use -v to see invocation)

Comparing the libc++ dylibs used by the wrapped clang++ in the LLVM 12 and 13 stdenvs (i.e. llvmPackages_12.libcxxStdenv) to that used in the LLVM 11 stdenv reveals that while LLVM 11's libc++.dylib reexports such symbols, newer libc++.dylibs do not.

When run through llvm-nm -C and grepped for __cxa_allocate_exception, here is the output for LLVM 11's libc++ dylib:

                 U ___cxa_allocate_exception
                 I ___cxa_allocate_exception (indirect for ___cxa_allocate_exception)

And LLVM 13's:

                 U ___cxa_allocate_exception

Both of these libc++ dylibs have a dynamic dependency on libc++abi which actually provides a definition for this symbol (this can be observed with otool -L <dylib path> which shows the libc++abi.dylib dep and then llvm-nm -C <libc++abi.dylib path> which shows that the symbol is defined in the libc++abi dylib) however only libc++ versions prior to LLVM 12's reexport the symbol.

This is consistent with the libc++ source code. These symbols were initially removed from the main list of symbols reexported and gated on libc++ exception support. However later this logic and the special list of symbols was removed.

Open Questions

I am not certain I understand the changes in the commit linked above; it's not clear to me why libcxxabi was changed to reexport these symbols.

This seems like an ABI break but there is no mention of this being potentially problematic on the CL and the comments added to the changelog in this commit seem to insist that it only adds some reexports to libc++.

Potential Solutions

I am not sure what the "right" solution is but anecdotally it seems like it's common to pass in -lc++abi as a linkopt on macOS.

cc-wrapper already does this on Linux.

To Reproduce

Here's a flake:

{
  inputs = {
    # nixpkgs.url = github:nixos/nixpkgs?ref=1a8754caf8eb8831b904daaf34b2ee737053b383; # commit for the 13.0.1 LLVM bump
    nixpkgs.url = github:nixos/nixpkgs/nixpkgs-unstable; # LLVM 13 is broken on Darwin on 21.11; using < instead of ^ so we get caching
    flake-utils.url = github:numtide/flake-utils;
  };

  # The issue only affects Darwin. Both because symbol rexport is only a thing
  # on mach-o platforms
  # (https://github.com/llvm/llvm-project/blob/4f13b999297140486b2faa1b5d8d7c768fb40dfb/libcxx/src/CMakeLists.txt#L202-L215)
  # and because `-lc++abi` is always added to the cc wrapper for Linux:
  # https://github.com/NixOS/nixpkgs/blob/a785ec661fadba2ded58467f951b51c34631c269/pkgs/build-support/cc-wrapper/default.nix#L382
  outputs = { self, nixpkgs, flake-utils }: flake-utils.lib.eachSystem [ "aarch64-darwin" "x86_64-darwin" ] (system:
    let
      normal = import nixpkgs { inherit system; };
      imp = func: import nixpkgs { inherit system; config.replaceStdenv = { pkgs }: (func pkgs).libcxxStdenv; };
      l_11 = imp (n: n.llvmPackages_11);
      l_12 = imp (n: n.llvmPackages_12);
      l_13 = imp (n: n.llvmPackages_13);
      l_13' = import nixpkgs {
        inherit system;
        # config.replaceStdenv = { pkgs }: pkgs.llvmPackages_13.libcxxStdenv.overrideAttrs (old: {
        #   postFixup = old.postFixup + ''
        #     echo "-lc++abi" >> $out/nix-support/libcxx-ldflags
        #   '';
        # });
        config.replaceStdenv = { pkgs }:
          let
            clang' = pkgs.llvmPackages_13.libcxxClang.overrideAttrs (old: {
              # We _should_ use an overlay or something similar here but they allegedly don't interact well with `replaceStdenv`.
              #
              # We're essentially adding this: https://github.com/NixOS/nixpkgs/blob/3491c5ea290bca5437845b6348919fcb23950af9/pkgs/build-support/cc-wrapper/default.nix#L382
              postFixup = old.postFixup + ''
                echo "-lc++abi" >> $out/nix-support/libcxx-ldflags
              '';
            });

            # https://github.com/NixOS/nixpkgs/blob/904ed45698338365f086c22ab5af167adf8bee9a/pkgs/development/compilers/llvm/13/default.nix#L241
            stdenv' = pkgs.overrideCC pkgs.stdenv clang';
          in
            stdenv';
      };

      cmd = ''clang++ -xc++ - <<<"int main() { throw 4; return 0; }" -v'';
      test = np: np.runCommandCC "test-cxx" {} ''
        ${cmd} -o $out
      '';
      hook = ''
        cd "$(mktemp -d)"
        libcxx_dir="$(${cmd} -o out -Wl,-v |& tee /dev/stderr | grep -P '\t/nix/store/.*-libcxx-.*/lib' | tail -1)"
        libcxx=$(echo "$libcxx_dir/libc++.dylib" | xargs)

        echo -e "\nlibcxx: $libcxx"
        echo "contains:"
        ${normal.llvmPackages.llvm}/bin/llvm-nm -C "$libcxx" | grep __cxa_allocate_exception
      '';
    in {

    # `nix flake check` shows the error
    checks.l11 = test l_11; # Works fine
    checks.l12 = test l_12; # Fails
    checks.l13 = test l_13; # Fails
    checks.l13' = test l_13'; # A workaround

    devShells = {
      llvm11 = l_11.mkShell { shellHook = hook; };
      llvm12 = l_12.mkShell { shellHook = hook; };
      llvm13 = l_13.mkShell { shellHook = hook; };
    };
  });
}

Running nix flake check shows the error when compiling with the LLVM 12 and 13 stdenvs; entering the dev shells with nix develop prints the path of the libc++ dylib that's used and greps it for __cxa_allocate_exception (one of the missing symbols).

The above also contains an example of a workaround that adds -lc++abi to the wrapper used in the LLVM 13 stdenv.

I think this is reproducible with any nixpkgs version after the LLVM 12.0.0 package was added but just in case, here's my flake.lock file:

Click to expand ```json { "nodes": { "flake-utils": { "locked": { "lastModified": 1648297722, "narHash": "sha256-W+qlPsiZd8F3XkzXOzAoR+mpFqzm3ekQkJNa+PIh1BQ=", "owner": "numtide", "repo": "flake-utils", "rev": "0f8662f1319ad6abf89b3380dd2722369fc51ade", "type": "github" }, "original": { "owner": "numtide", "repo": "flake-utils", "type": "github" } }, "nixpkgs": { "locked": { "lastModified": 1648219316, "narHash": "sha256-Ctij+dOi0ZZIfX5eMhgwugfvB+WZSrvVNAyAuANOsnQ=", "owner": "nixos", "repo": "nixpkgs", "rev": "30d3d79b7d3607d56546dd2a6b49e156ba0ec634", "type": "github" }, "original": { "owner": "nixos", "ref": "nixpkgs-unstable", "repo": "nixpkgs", "type": "github" } }, "root": { "inputs": { "flake-utils": "flake-utils", "nixpkgs": "nixpkgs" } } }, "root": "root", "version": 7 } ```
whitevegagabriel commented 1 year ago

Any updates on this issue?

uri-canva commented 11 months ago

Should we be doing this in the cc wrapper on darwin like we do on linux? It seems weird to require package maintainers to fix this package by package.

judofyr commented 11 months ago

I looked more into this specifically for #269340 and it turns out that the cc-wrapper script will indeed pass in -lc++abi, but only if it detects that it attempts to link C++ code. And Postgis invokes its C compiler:

cc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -Wno-deprecated-non-prototype -g -O2  -bundle -o postgis-3.so postgis_module.o lwgeom_accum.o lwgeom_union.o lwgeom_spheroid.o lwgeom_ogc.o lwgeom_functions_analytic.o lwgeom_inout.o lwgeom_functions_basic.o lwgeom_btree.o lwgeom_box.o lwgeom_box3d.o lwgeom_geos.o lwgeom_geos_prepared.o lwgeom_geos_clean.o lwgeom_geos_relatematch.o lwgeom_generate_grid.o lwgeom_export.o lwgeom_in_gml.o lwgeom_in_kml.o lwgeom_in_marc21.o lwgeom_out_marc21.o lwgeom_in_geohash.o lwgeom_in_geojson.o lwgeom_in_encoded_polyline.o lwgeom_triggers.o lwgeom_dump.o lwgeom_dumppoints.o lwgeom_functions_lrs.o lwgeom_functions_temporal.o lwgeom_rectree.o long_xact.o lwgeom_sqlmm.o lwgeom_rtree.o lwgeom_transform.o lwgeom_window.o gserialized_typmod.o gserialized_gist_2d.o gserialized_gist_nd.o gserialized_supportfn.o gserialized_spgist_2d.o gserialized_spgist_3d.o gserialized_spgist_nd.o brin_2d.o brin_nd.o brin_common.o gserialized_estimate.o geography_inout.o geography_btree.o geography_centroid.o geography_measurement.o geography_measurement_trees.o geometry_inout.o postgis_libprotobuf.o vector_tile.pb-c.o geobuf.pb-c.o mvt.o lwgeom_out_mvt.o geobuf.o lwgeom_out_geobuf.o lwgeom_out_geojson.o flatgeobuf.o lwgeom_in_flatgeobuf.o lwgeom_out_flatgeobuf.o postgis_legacy.o -lm  ../deps/wagyu/libwagyu.la -lc++ ../deps/flatgeobuf/libflatgeobuf.la -lc++ ../libpgcommon/libpgcommon.a ../liblwgeom/.libs/liblwgeom.a  -L/nix/store/r3ci57b9h44dm8bzgddj2s33gmzf0k9d-geos-3.11.2/lib -lgeos_c -L/nix/store/crykfdf8mmkll3a524l34cd80cxczqrm-proj-9.3.0/lib -lproj -L/nix/store/h55yv2msyvgggbxb093za5was5yxfi47-json-c-0.16-dev/lib -ljson-c -L/nix/store/phw2l0iwb43pd3j1hlaml46asziw7vw0-protobuf-c-unstable-2023-07-08/lib -lprotobuf-c -L/nix/store/al2fcmdcsxlx9zv5nsr4464dmrypf2bk-libxml2-2.11.5/lib -lxml2 -L/nix/store/a377d66g1rgsd4vkzxdhkza5g55fkmky-zlib-1.3/lib -lz    -lm -bundle_loader /nix/store/2364df0zry22gjqgsbdz41ypabbvsbm9-postgis-3.4.0/bin/postgres

Changing cc to clang++ will cause this to compile correctly.

Alternatively, applying the following patch also causes the cc-wrapper to detect that this is indeed a C++ linking happening:

diff --git i/pkgs/build-support/cc-wrapper/cc-wrapper.sh w/pkgs/build-support/cc-wrapper/cc-wrapper.sh
index b8d170df01..62075b2e3f 100644
--- i/pkgs/build-support/cc-wrapper/cc-wrapper.sh
+++ w/pkgs/build-support/cc-wrapper/cc-wrapper.sh
@@ -48,6 +48,7 @@ while (( "$n" < "$nParams" )); do
         -nostdlib) cxxLibrary=0 ;;
         -x*-header) dontLink=1 ;; # both `-x c-header` and `-xc-header` are accepted by clang
         -xc++*) isCxx=1 ;;        # both `-xc++` and `-x c++` are accepted by clang
+        -lc++)  isCxx=1 ;;
         -x)
             case "$p2" in
                 *-header) dontLink=1 ;;
ghost commented 9 months ago

https://github.com/NixOS/nixpkgs/blob/c79898ec62912f8ffe7ecf5edc610342649d2a8d/pkgs/servers/sql/postgresql/ext/postgis.nix#L38-L40

Hey it looks like we wound up with 33 copy-and-pastes of this code:

$ rg 'https://github.com/NixOS/nixpkgs/issues/166205' | wc -l
33

This means that when there's a bug in the copy-and-pasted code (which there is -- it breaks CI if stdenv.cc.libcxx exists but lacks a cxxabi subattribute) we need 33 separate fixes.

One of Nix's superpowers is that it provides abstraction, so we can refactor instead of copying and pasting. Please use those superpowers next time -- probably by changing ld-wrapper to add -l${stdenv.cc.libcxx.cxxabi.libName} automatically. That way, if there's a bug it can be fixed in one place instead of having to make thirty-three different changes, like these:

kirillrdy commented 9 months ago

https://github.com/NixOS/nixpkgs/blob/c79898ec62912f8ffe7ecf5edc610342649d2a8d/pkgs/servers/sql/postgresql/ext/postgis.nix#L38-L40

Hey it looks like we wound up with 33 copy-and-pastes of this code:

$ rg 'https://github.com/NixOS/nixpkgs/issues/166205' | wc -l
33

This means that when there's a bug in the copy-and-pasted code (which there is -- it breaks CI if stdenv.cc.libcxx exists but lacks a cxxabi subattribute) we need 33 separate fixes.

One of Nix's superpowers is that it provides abstraction, so we can refactor instead of copying and pasting. Please use those superpowers next time -- probably by changing ld-wrapper to add -l${stdenv.cc.libcxx.cxxabi.libName} automatically. That way, if there's a bug it can be fixed in one place instead of having to make thirty-three different changes, like these:

* [34c3989](https://github.com/NixOS/nixpkgs/commit/34c3989a0e8ce395a6b28397848ee07eade3509e)

* [895dbe8](https://github.com/NixOS/nixpkgs/commit/895dbe8aa68b373c2c033a4c627d4f5b13da7e24)

* [9d96656](https://github.com/NixOS/nixpkgs/commit/9d966564329d31c23301d2517dc999858a704b04)

* [caab415](https://github.com/NixOS/nixpkgs/commit/caab4159953f2bb254602cca50fc89f28703fed1)

* [1bb69e8](https://github.com/NixOS/nixpkgs/commit/1bb69e8dfd520cf5b513699b5043dccc35ed09f4)

* [8670f50](https://github.com/NixOS/nixpkgs/commit/8670f502d0d19b68c617dd42539df1b9605e2486)

* [ad34020](https://github.com/NixOS/nixpkgs/commit/ad340203577eb36090b1afce7ec9c4fc15dff41d)

* [eca13b3](https://github.com/NixOS/nixpkgs/commit/eca13b37fd12f76a9f261427415fbc222423f4f5)

* [0917ca9](https://github.com/NixOS/nixpkgs/commit/0917ca9988103ef0abef67c1715360b1c368d4dd)

* [587ee6e](https://github.com/NixOS/nixpkgs/commit/587ee6e31b0a0c14d8da07d516f0bb9cfe792ddf)

* ... and on, and on ...

@amjoseph-nixpkgs thats a very good find ! we should fix it, is it darwin specific ?

j-baker commented 8 months ago

Hi! This hits internal users where I work about once a week, as they add a native library and have to add this override. By and large, they are unfamiliar with the peculiarities of the linker, and the errors we get are strange as they usually reference other libraries, which is a red herring.

Is there some path to a default fix for this problem? From the above, there are a lot of workarounds being exported (which we apply internally), but there's no direct proposal for an override fix (adding the same override to ld-wrapper?).

Is it reasonable to (internally) automatically apply something which adds this linker flag to everything?

ghost commented 8 months ago

Is there some path to a default fix for this problem? From the above, there are a lot of workarounds being exported (which we apply internally), but there's no direct proposal for an override fix (adding the same override to ld-wrapper?).

the change still needs to go through the review processes.