NixOS / nixpkgs

Nix Packages collection & NixOS
MIT License

bazel fails to build in clang stdenv #216047

Open hzeller opened 1 year ago

hzeller commented 1 year ago

Describe the bug

Compilation using bazel in a clang13Stdenv environment fails.

Something similar shows up on Darwin, which has clang as its default environment, but this report demonstrates with a self-contained example that it is a general problem with any isClang environment.

Steps To Reproduce

I've created a self-contained example repository with a small project that uses C and C++ files:

https://github.com/hzeller/nix-bazel-with-clang

git clone https://github.com/hzeller/nix-bazel-with-clang
cd nix-bazel-with-clang
nix-shell --command 'bazel clean ; bazel build //... && bazel-bin/main'

This example provides a shell.nix that enables clang13Stdenv:

{ pkgs ? import <nixpkgs> {} }:
let
  #used_stdenv = pkgs.stdenv;         # this works, using gcc
  used_stdenv = pkgs.clang13Stdenv;   # this is broken, using clang
in
used_stdenv.mkDerivation {
  name = "Testing c and c++ compilation";
  buildInputs = with pkgs;
    [
      bazel_4
    ];
}

Expected behavior

This should compile and run the small main program in bazel-bin/main, outputting

in C   function foo() -> 42
in C++ function bar() -> 42

== SUCCESS ==
got values 42 and 42

... this happens instead

... but what actually happens is that the compilation fails:

INFO: Analyzed 2 targets (15 packages loaded, 63 targets configured).
INFO: Found 2 targets...
ERROR: /home/testuser/src/my/bazel-with-clang/BUILD:13:10: Compiling main.cc failed: (Exit 1): clang failed: error executing command /nix/store/i973rwv7n9pq74iac8jlly9s3xlrr0bc-clang-wrapper-13.0.1/bin/clang -U_FORTIFY_SOURCE -fstack-protector -Wall -Wthread-safety -Wself-assign -Wunused-but-set-parameter -Wno-free-nonheap-object ... (remaining 25 argument(s) skipped)

Use --sandbox_debug to see verbose messages from the sandbox
main.cc:1:10: fatal error: 'iostream' file not found
#include <iostream>
         ^~~~~~~~~~
1 error generated.
INFO: Elapsed time: 0.216s, Critical Path: 0.03s
INFO: 12 processes: 11 internal, 1 linux-sandbox.
FAILED: Build did NOT complete successfully

Additional context

Symptoms of this issue have been described in other bugs, but often in the Darwin context (#150655). They were also seen in an unsuccessful pull request attempting to add Darwin support (#214797). People have sometimes applied various workarounds successfully, but this of course needs to work 'out of the box'. Darwin is not the main culprit though; it just surfaces the issue because there the stdenv is based on clang.

The problem is more in the way bazel and the compiler environment are wired up (setting flags and possibly providing the compiler wrappers needed), so I hope that with the self-contained example in the git repo mentioned above it is possible for the nix-bazel maintainers to get to the root of the problem.

Notify maintainers

CC @NixOS/bazel @aherrmann @ylecornec Also FYI @uri-canva

Metadata

Please run nix-shell -p nix-info --run "nix-info -m" and paste the result.

 - system: `"x86_64-linux"`
 - host os: `Linux 5.15.90, NixOS, 22.11 (Raccoon), 22.11.2203.285b3ff0660`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.11.1`
 - channels(root): `"nixos-22.11"`
 - nixpkgs: `/nix/var/nix/profiles/per-user/root/channels/nixos`
hzeller commented 1 year ago

Addendum: the above nix-info is with 22.11, but same behavior also with nixos-unstable.

$ nix-shell -p nix-info --run "nix-info -m" 
 - system: `"x86_64-linux"`
 - host os: `Linux 5.15.93, NixOS, 23.05 (Stoat), 23.05pre453471.e5530aba13c`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.13.2`
 - channels(root): `"nixos"`
 - nixpkgs: `/nix/var/nix/profiles/per-user/root/channels/nixos`
aherrmann commented 1 year ago

I see that the example does not set up any toolchain explicitly, so it's going to use Bazel's builtin autodetecting cc toolchain. You can look through the files under $(bazel info output_base)/external/local_config_cc to inspect the generated configuration. You can find the corresponding files from the Bazel distribution here.

In rules_nixpkgs we found that a clang based cc-toolchain requires additional compiler and linker flags to work with Bazel. You can find that configuration here.

aaronmondal commented 1 year ago

You might be interested in the toolchain we are using in the rules_ll flake which wraps Bazel in a Clang/LLVM toolchain.

hzeller commented 1 year ago

After a bit of stall on this issue here, I dug into the underlying reason and possible fix suggestions.

TL;DR

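bazel invokes the compiler under its canonical name to compile C++: gcc in stdenv and clang in clang13Stdenv.

In NixOS, gcc invoked this way compiles C++ sources, but clang fails to find the C++ standard headers; only clang++ finds them.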

What bazel uses as compiler

The problem seems to be that bazel by default uses the generic compiler driver for C++, so depending on the stdenv this is gcc or clang. Looking at cc_wrapper.sh in the local config, as suggested by @aherrmann, shows which compiler bazel uses to compile C++ code.

Here, using stdenv:

mkdir -p /tmp/foo ; touch /tmp/foo/WORKSPACE ; cd /tmp/foo 
nix-shell \
    -E '{ pkgs ? import <nixpkgs> {} }: pkgs.stdenv.mkDerivation { name="foo"; buildInputs = [ pkgs.bazel_6 ];}' \
    --run 'bazel clean --expunge ; bazel sync ; cat $(bazel info output_base)/external/local_config_cc/cc_wrapper.sh'

... the wrapper script emits that it uses plain gcc:

# ...
# Call the C++ compiler
/nix/store/75slks1wr3b3sxr5advswjzg9lvbv9jc-gcc-wrapper-12.3.0/bin/gcc "$@"

... and here, using clang13Stdenv

mkdir -p /tmp/foo ; touch /tmp/foo/WORKSPACE ; cd /tmp/foo 
nix-shell \
    -E '{ pkgs ? import <nixpkgs> {} }: pkgs.clang13Stdenv.mkDerivation { name="foo"; buildInputs = [ pkgs.bazel_6 ];}' \
    --run 'bazel clean --expunge ; bazel sync ; cat $(bazel info output_base)/external/local_config_cc/cc_wrapper.sh'

Now we see that bazel chooses plain clang

# ...
# Call the C++ compiler
/nix/store/b4y1vzkybpx8ggfrdrn47r21dh44kj0j-clang-wrapper-13.0.1/bin/clang "$@"
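To compare the two environments without reading the whole wrapper, the compiler path can be pulled out of the generated script. This is only a minimal sketch: the helper name wrapper_compiler is made up for illustration, and it assumes the wrapper ends with a line of the form shown above (a compiler path followed by "$@").

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the compiler that a generated cc_wrapper.sh
# invokes, i.e. the first word of the last line ending in "$@".
wrapper_compiler() {
  grep -E '"\$@"[[:space:]]*$' "$1" | tail -n 1 | awk '{print $1}'
}
```

Inside a workspace it could be called as `wrapper_compiler "$(bazel info output_base)/external/local_config_cc/cc_wrapper.sh"`, once in each nix-shell.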

Can these compilers deal with C++ ?

So, in both cases, bazel chose the common name of the plain compiler (gcc or clang), assuming that when given a C++ program it behaves like g++ or clang++.

This does indeed work with gcc in NixOS, but it does not work with clang.

Let's compile a super-simple c++ program including a c++ header with both gcc and clang:

With gcc

nix-shell  \
    -E '{ pkgs ? import <nixpkgs> {} }: pkgs.stdenv.mkDerivation { name="foo";}' \
    --run 'echo "#include <vector>" > /tmp/foo.cc ; gcc -c /tmp/foo.cc -o /tmp/foo.o'

This works!

... with clang

nix-shell  \
    -E '{ pkgs ? import <nixpkgs> {} }: pkgs.clang13Stdenv.mkDerivation { name="foo";}' \
    --run 'echo "#include <vector>" > /tmp/foo.cc ; clang -c /tmp/foo.cc -o /tmp/foo.o'

This emits an error:

/tmp/foo.cc:1:10: fatal error: 'vector' file not found
#include <vector>
         ^~~~~~~~
1 error generated.

The error indicates that it behaves like a C compiler and not like a C++ compiler that knows where to find the C++ headers.
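One way to check this guess is to ask each driver for the header search path it would use in C++ mode. The sketch below relies on the standard driver flags -x c++, -E and -v; the helper name cxx_include_dirs is made up for illustration. The assumption to verify is that, under clang13Stdenv, the list printed for clang lacks the C++ standard library directory that clang++ reports.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the include directories a compiler driver
# searches when forced into C++ mode.
#   -x c++  treat the input as C++
#   -E      stop after preprocessing
#   -v      print the search list (written to stderr)
cxx_include_dirs() {
  "$1" -x c++ -E -v - </dev/null 2>&1 |
    sed -n '/^#include <...> search starts here:/,/^End of search list\./p' |
    sed '1d;$d'
}

# Compare the plain driver with its C++ counterpart, if they are installed.
for cc in clang clang++; do
  if command -v "$cc" >/dev/null 2>&1; then
    echo "== $cc =="
    cxx_include_dirs "$cc"
  fi
done
```

Run inside the clang13Stdenv nix-shell, the two lists can be compared directly.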

It does work, if we use clang++ as compiler:

nix-shell  \
    -E '{ pkgs ? import <nixpkgs> {} }: pkgs.clang13Stdenv.mkDerivation { name="foo";}' \
    --run 'echo "#include <vector>" > /tmp/foo.cc ; clang++ -c /tmp/foo.cc -o /tmp/foo.o'
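The three experiments above can be folded into one probe that is convenient to rerun in different stdenvs. This is a rough sketch, not part of the reproduction repository; the function name try_cxx is made up, and it only reports whether the driver exits successfully when compiling a file that includes a C++ header.

```shell
#!/usr/bin/env bash
# Hypothetical helper: try to compile a one-line C++ file with the given
# compiler driver and report whether the driver succeeded.
try_cxx() (
  dir=$(mktemp -d)
  printf '#include <vector>\n' > "$dir/probe.cc"
  if "$1" -c "$dir/probe.cc" -o "$dir/probe.o" 2>/dev/null; then
    echo "$1: ok"
  else
    echo "$1: FAILED"
  fi
  rm -rf "$dir"
)

# Probe whichever of the drivers discussed here are on PATH.
for cc in gcc clang clang++; do
  if command -v "$cc" >/dev/null 2>&1; then
    try_cxx "$cc"
  fi
done
```

Based on the results reported above, one would expect clang to be the failing entry inside the clang13Stdenv shell.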

Conclusion, possible solutions

So all we need is bazel invoking clang++ instead of clang.

There seem to be two possible simple solutions.

I think I would prefer the first solution (fix clang to behave like in other Linux distributions and like gcc), but I suspect it will be easier to implement the second.

aaronmondal commented 1 year ago

One possibility to get a nix-backed C++ toolchain for Bazel that uses the default clang is to generate it via Bazel-remote. We've been using this approach for quite some time now and it kind of "just works".

This setup generates a cc toolchain config that uses tools from nixpkgs (like this).

This exact approach is probably not what you'd want in nixpkgs, but a similar approach might be possible. I think a nice solution could be a tool that does the same kind of toolchain autogeneration as bazel-remote but directly generates the cc toolchain configs from the nixpkgs inputs instead of using this roundabout way via container images.

uri-canva commented 1 year ago

Dupe of #150655? Sorry should have noted this when you CC'd me initially, must have missed the notification.

hzeller commented 1 year ago

Yes, looks like this and the other issue describe the same problem @uri-canva