NixOS / nixpkgs

Nix Packages collection & NixOS
MIT License
17.46k stars 13.66k forks source link

building package with bazel and clang fails due to C++ headers not found #150655

Open avdv opened 2 years ago

avdv commented 2 years ago

[...] if I add pkgs.bazel to my shell.nix and try to build my project, I get errors loading headers, e.g.

external/com_google_protobuf/src/google/protobuf/descriptor_database.h:40:10: fatal error: 'map' file not found
#include <map>

I can resolve this by adding both clang and gcc to my shell.nix. Not one or the other, but both. Which feels very odd.

And unfortunately for me, having both gcc and clang present breaks other things (building ruby native extensions).

Originally posted by @andrewhamon in https://github.com/NixOS/nixpkgs/issues/105573#issuecomment-883622043

avdv commented 2 years ago

Note, this effects bazel-watcher on Darwin (which is currently marked as broken).

I did have a deeper look into this...

Bazel tries to find the C++ compiler and uses the one called "gcc" by default (see https://github.com/bazelbuild/bazel/blob/df2f77c2a8602d1b729b6802ba2bfcac3dc54402/tools/cpp/unix_cc_configure.bzl#L297-L326). But, the bazel package in nixpkgs substitutes "gcc" with "clang" (see https://github.com/NixOS/nixpkgs/blob/2c2a09678ce2ce4125591ac4fe2f7dfaec7a609c/pkgs/development/tools/build-managers/bazel/bazel_3/default.nix#L388-L390)

In a stdenv on Darwin, "clang" is provided by XcodeDefault.xctoolchain which has clang and clang++ wrapper scripts.

Each wrapper script determines if C++ should be "activated" by checking whether it was called with a ++ suffix:

[[ "/nix/store/shvq72iflm8kz4qmdswhla64qypbhxjg-clang-7.1.0/bin/clang" = *++ ]] && isCxx=1 || isCxx=0

Later, the C++ include path is added to the compiler's arguments, but only if in C++ mode:

if [[ "$isCxx" = 1 ]]; then
    if [[ "$cxxInclude" = 1 ]]; then
        NIX_CFLAGS_COMPILE_x86_64_apple_darwin+=" $NIX_CXXSTDLIB_COMPILE_x86_64_apple_darwin"
    fi
    if [[ "$cxxLibrary" = 1 ]]; then
        NIX_CFLAGS_LINK_x86_64_apple_darwin+=" $NIX_CXXSTDLIB_LINK_x86_64_apple_darwin"
    fi
fi

Without the include path, any attempt to compile C++ sources using any of the standard C++ headers fail.

Trying to use export CC=clang++ while building indeed fixes the problem for C++ sources, but fails to compile C sources that are not prepared to be compiled as C++.

Bottom line: Bazel assumes a generic compiler tool, which it expects to select the language automatically depending on the file's extension, but the tooling in nixpkgs only provides distinct scripts for C / C++ which need to be called accordingly.

avdv commented 2 years ago

I could get bazel-watcher to build on darwin using this wrapper script as CC:

#! ${stdenv.shell}
function isCxx() {
   if [ $# -eq 0 ]; then false
   elif [ "$1" == '-xc++' ]; then true
   elif [[ -f "$1" && "$1" =~ [.](hh|H|hp|hxx|hpp|HPP|h[+]{2}|tcc|cc|cp|cxx|cpp|CPP|c[+]{2}|C)$ ]]; then true
   else isCxx "''${@:2}"; fi
}
if isCxx "$@"; then
  exec "${stdenv.cc}/bin/c++" "$@"
else
  exec "${stdenv.cc}/bin/cc" "$@"
fi

Hey @NixOS/bazel :smile: any thoughts on how we can fix this?

uri-canva commented 2 years ago

I tested with Xcode, the wrappers installed by it switch both based on the binary that is called and the extension of the file, so it makes sense for us to provide compatible wrappers with the xcodebuild derivation, even if we don't do that for the wrappers in clang and gcc.

uri-canva commented 2 years ago

I noticed the toolchain in xcodebuild doesn't even have gcc and clang: https://github.com/NixOS/nixpkgs/blob/5a422e169b2d0ee7435f1d16364d2f9f0425b23a/pkgs/development/tools/xcbuild/toolchains.nix

uri-canva commented 2 years ago

@NixOS/darwin-maintainers does anyone have any context or opinions on which of the three options we should go for? They all affect more than just bazel, and that makes sense, since the issue we're hitting with bazel here is that bazel is assuming that if you're using gcc or clang on darwin it's going to be the ones from Xcode, an assumption that a lot of other tools and upstream build definitions are likely to make.

  1. Add the logic to switch c++ or not based on the file extension in cc-wrapper.sh next to the logic that switches based on the binary name: https://github.com/NixOS/nixpkgs/blob/34168e58d1b9c269f376f3d0b5ca850851feb2a6/pkgs/build-support/cc-wrapper/cc-wrapper.sh#L28
    • Pro: next to the existing logic that handles similar things.
    • Con: it's a darwin specific thing in a codepath that is used on Linux as well.
  2. Add a new darwin specific wrapper in clang and clang-unwrapped for this.
    • Pro: isolated to darwin codepaths
  3. Add a wrapper for this in toolchains.nix: https://github.com/NixOS/nixpkgs/blob/5a422e169b2d0ee7435f1d16364d2f9f0425b23a/pkgs/development/tools/xcbuild/toolchains.nix
    • Pro: isolated to darwin codepaths
    • Pro: aligned with user expectations that xcodebuild and its toolchains are compatible with upstream Xcode provided binaries
    • Pro: allows us to easily add gcc and g++ wrappers that forward to clang and clang++ respectively.
    • Con: more complex user interface, results in three different clang variants that all behave differently.
uri-canva commented 2 years ago

My preference is 1 >>> 3 > 2.

veprbl commented 2 years ago

I just realized that I had this this problem with tensorflow in #145149.

uri-canva commented 2 years ago

Good call out! So it's possible other derivations have added their own darwin workarounds for this issue and we could remove them if we have a more compatible wrapper.

veprbl commented 2 years ago

I have another good one. In python there was also a missing C++ compiler support in distutils: https://github.com/NixOS/nixpkgs/blob/c478eaf416411a7dedf773185b6d5bfc966a80ae/pkgs/development/interpreters/python/cpython/default.nix#L244-L258 The distutils was already deprecated at that time, so the patch was never accepted upstream. We ended up putting a lot of effort into maintaining it, as numpy/cpython were relying on this. This, however, will resolve by itself with Python 3.10, where distutils is apparently removed.

Steven0351 commented 2 years ago

@avdv Just a quick note on this:

Trying to use export CC=clang++ while building indeed fixes the problem for C++ sources, but fails to compile C sources that are not prepared to be compiled as C++.

CC is supposed to be used for the C compiler, while CXX is for C++, so that may be the reason you're having issues using clang++ in that regard. You probably would have better luck with this:

export CC=clang
export CXX=clang++
avdv commented 2 years ago

CC is supposed to be used for the C compiler, while CXX is for C++

@Steven0351 every other build tool: yes, but Bazel does not care about CXX and just uses CC exclusively: https://github.com/bazelbuild/bazel/blob/c579155a3ed7c4a9e4745c11a9c9550ec4c121ab/tools/cpp/unix_cc_configure.bzl#L309-L310

def find_cc(repository_ctx, overriden_tools):
    cc = _find_generic(repository_ctx, "gcc", "CC", overriden_tools)

https://github.com/bazelbuild/bazel/blob/c579155a3ed7c4a9e4745c11a9c9550ec4c121ab/tools/cpp/unix_cc_configure.bzl#L281-L282

def _find_generic(repository_ctx, name, env_name, overriden_tools, warn = False, silent = False):
    """Find a generic C++ toolchain tool. Doesn't %-escape the result."""

(shouldn't CXX and CC be already set in stdenv by default already?)

Steven0351 commented 2 years ago

@Steven0351 every other build tool: yes, but Bazel does not care about CXX and just uses CC exclusively: https://github.com/bazelbuild/bazel/blob/c579155a3ed7c4a9e4745c11a9c9550ec4c121ab/tools/cpp/unix_cc_configure.bzl#L309-L310

🤦‍♂️ Apologies, between reading the title and not reading your original post carefully, i thought you were saying you couldn't build the nix package bazel (e.g. pkgs.bazel), not building a package with bazel

(shouldn't CXX and CC be already set in stdenv by default already?)

Yes, yes it should. I should probably refrain from attempting to be helpful at late at night 🥱

avdv commented 2 years ago

between reading the title and not reading your original post carefully, i thought you were saying you couldn't build the nix package bazel (e.g. pkgs.bazel), not building a package with bazel

Yes, that was a bit misleading, I changed the title to make this clearer.

Yes, yes it should. I should probably refrain from attempting to be helpful at late at night yawning_face

No problem, thanks for trying :slightly_smiling_face:

ijcd commented 2 years ago
  1. Add the logic to switch c++ or not based on the file extension in cc-wrapper.sh next to the logic that switches based on the binary name: https://github.com/NixOS/nixpkgs/blob/34168e58d1b9c269f376f3d0b5ca850851feb2a6/pkgs/build-support/cc-wrapper/cc-wrapper.sh#L28

This seems most pragmatic and expedient (perhaps with a guard for being on darwin). Provide a pointer back to this issue so option 2 or 3 can be implemented later if/when necessary.

YorikSar commented 2 years ago

I've ran into this issue and found following workaround for passing C++ headers to Bazel builds: I've added BAZEL_CXXOPTS to --repo_env flag in my global .bazelrc in home-manager config:

  home.file.".bazelrc".text = ''
    build --repo_env=BAZEL_CXXOPTS="-I${pkgs.libcxx.dev}/include/c++/v1"
  '';

You can achieve similar result without home-manager like this:

% nix build nixpkgs#libcxx
% export BAZEL_CXXOPTS="-I$(nix path-info nixpkgs#libcxx.dev)/include/c++/v1"

before running bazel, or like this:

% nix build nixpkgs#libcxx
% echo "build --repo_env=BAZEL_CXXOPTS=\"-I$(nix path-info nixpkgs#libcxx.dev)/include/c++/v1\"" >> ~/.bazelrc

to add it to ~/.bazelrc.

avdv commented 2 years ago

Hi @YorikSar thank you for sharing this workaround!

I was thinking about a way how we can leverage this as a device to fix this issue, but IMO the problem should not be fixed in the bazel layer, because introducing a breaking change in a bazel wrapper (say) when fiddling with --cxxopt or BAZEL_CXXOPTS is too risky.

+1 for option 1 of the choices https://github.com/NixOS/nixpkgs/issues/150655#issuecomment-998407330

toonn commented 2 years ago

Since other build tools know about this distinction it feels like a Bazel issue. If it won't be tackled upstream maybe we can create a new clang wrapper that branches on the file extension and calls the appropriate wrapper script, then just use that for Bazel?

NickCao commented 1 year ago

Hit by the same issue when trying to package workerd on linux, indeed we need a bazel specific cc-wrapper or file an upstream issue.

Artturin commented 1 year ago

@NixOS/darwin-maintainers does anyone have any context or opinions on which of the three options we should go for? They all affect more than just bazel, and that makes sense, since the issue we're hitting with bazel here is that bazel is assuming that if you're using gcc or clang on darwin it's going to be the ones from Xcode, an assumption that a lot of other tools and upstream build definitions are likely to make.

1. Add the logic to switch c++ or not based on the file extension in `cc-wrapper.sh` next to the logic that switches based on the binary name: https://github.com/NixOS/nixpkgs/blob/34168e58d1b9c269f376f3d0b5ca850851feb2a6/pkgs/build-support/cc-wrapper/cc-wrapper.sh#L28

   * Pro: next to the existing logic that handles similar things.
   * Con: it's a darwin specific thing in a codepath that is used on Linux as well.

     * Counterpoint: the binary name logic is also darwin specific and it's there too: [Some cc-wrapper changes to better support darwin and clang #6254](https://github.com/NixOS/nixpkgs/pull/6254)

In https://github.com/NixOS/nixpkgs/issues/256360 the reporter says

The clang installed by the underlying system (Arch) has no trouble compiling this file.

So if a clang on arch can compile both c and c++ files then ours should be able to too, unless there are drawbacks to detecting the mode by file extension.

toonn commented 1 year ago

So the con isn't actually valid then? Does Linux still not have the binary name distinction logic?

uri-canva commented 12 months ago

So it's a difference between NixOS and non-NixOS, not a difference between macOS and linux.

We should fix it by making clang in nixpkgs match the behaviour of other distros.

hzeller commented 12 months ago

+1

I've evaluated the issue with the compiler in #216047 which is pretty much a duplicate of this issue here.

https://github.com/NixOS/nixpkgs/issues/216047#issuecomment-1731927444

Indeed, the main reason is that clang does not behave as C++ compiler when it gets a C++ file presented. This is unlike any other Linux distribution I have seen (and also unlike the expectation bazel has). So yes, making the nix-os clang work like compiler drivers in other operating systems, auto-detecting if it deals with C or C++ is the right solution.

hzeller commented 12 months ago

Is there someone from lib.teams.llvm.members that we can ask to help getting this implemented ?

toonn commented 11 months ago

I'm looking into this. There's nothing wrong with Clang. It still detects the files are C++ and behaves accordingly. The problem is it has no idea where to find the C++ stdlib. I have a pretty simple fix for this but I'm doing some minimal testing first. Since it's a change to cc-wrapper it'll have to go through serious review and staging.

Artturin commented 11 months ago

I'm looking into this. There's nothing wrong with Clang. It still detects the files are C++ and behaves accordingly. The problem is it has no idea where to find the C++ stdlib. I have a pretty simple fix for this but I'm doing some minimal testing first. Since it's a change to cc-wrapper it'll have to go through serious review and staging.

Please submit a PR for review

nixos-discourse commented 11 months ago

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/nix-macos-monthly/12330/42

hzeller commented 11 months ago

Do you have a pointer to your patch in progress to play with @toonn ? I've seen in the aforementioned discourse discussion that you had some issues with binutils. Maybe more eyes can help ?

It would be wonderful if clang in 23.11 would be able to find the c++ headers.

nixos-discourse commented 11 months ago

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/how-to-search-nixpkgs-by-e-g-buildinputs/34507/1

nixos-discourse commented 10 months ago

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/nix-macos-monthly/12330/44

Artturin commented 9 months ago

https://github.com/NixOS/nixpkgs/pull/269694

hzeller commented 9 months ago

For easy testing, here is a one-liner to reproduce the issue at hand:

nix-shell  \
    -E '{ pkgs ? import <nixpkgs> {} }: pkgs.clang13Stdenv.mkDerivation { name="foo";}' \
    --run 'echo "#include <vector>" > /tmp/foo.cc ; clang -c /tmp/foo.cc -o /tmp/foo.o'

It will complain about not finding the c++ header.

@toonn were you able to make progress ?

hzeller commented 7 months ago

For everyone running into this: Until this is available in the NixOS clang configuration, I've worked around this in my project by adding the-xc++ flag in the .bazelrc as option to be passed to the compiler. That way bazel will add the necessary option, and from there, the now invoked clang -xc++ will find its includes.

So roughly a line like this in the .bazelrc

build --cxxopt=-xc++ --host_cxxopt=-xc++

(relevant PR https://github.com/chipsalliance/verible/pull/2104)

filmil commented 3 weeks ago

yone running into this: Until this is available in the NixOS clang configuration, I've worked around this in my project by adding the-xc++ flag in the .bazelrc as option to be passed to the compiler.

Unfortunately, this only works if you are using the compiler directly. If you are passing the compiler binary name, apparently it does not work.

To explain a bit, kythe is a code indexer that uses the cc binary to replay compilation. When it uses the cc wrapper from nix, it will not use --cxx... by default, leading to failures.

Would be nice to remove this lever permanently and allow C++ compilations to pass even if invoked as cc instead of cc++.

IIRC someone had a patch that simply adds the C++ include paths to the default system paths of cc. Are there any special downsides to this (easy) fix?

hzeller commented 2 weeks ago

Updating minimal example in comment above - this is still an issue with clang-17

nix-shell  \
    -E '{ pkgs ? import <nixpkgs> {} }: pkgs.clang17Stdenv.mkDerivation { name="foo";}' \
    --run 'echo "#include <vector>" > /tmp/foo.cc ; clang -c /tmp/foo.cc -o /tmp/foo.o'
/tmp/foo.cc:1:10: fatal error: 'vector' file not found
    1 | #include <vector>
      |          ^~~~~~~~
aaomidi commented 2 weeks ago

A workaround I’ve found is including the llvm toolchain in bazel directly. The downside of this is: 1) not nix managed 2) it adds a good 90 seconds to the initial analysis from a cold bazel server.

filmil commented 2 weeks ago

1) not nix managed

Sadly the whole point of this exercise is to have it managed by nix.

2) it adds a good 90 seconds to the initial analysis from a cold bazel server.

Ow, that seems like a lot. Are you sure you always pay this, vs only when the toolchain needs to be downloaded?