NixOS / nixpkgs

Nix Packages collection & NixOS

Package request: `llvmPackages.stdenv` should be an actual Clang/LLVM toolchain #277564

Open aaronmondal opened 6 months ago

aaronmondal commented 6 months ago

Project description

At the moment there are various flavors of compilers and stdenvs in llvmPackages. I believe most of the existing configurations are not very useful, and the most important one is missing.

A crucial difference between Clang and GCC is that Clang lowers to LLVM IR. Emitting this IR is what enables the various, sometimes exotic, toolchains that are built on top of Clang. When Clang is used as a plain C++ compiler this is not noticeable, since the emitted object files are just regular object files.

This is not true for the use cases that build on top of Clang, though. Toolchains like ROCm and (clang-driven) CUDA work with "raw" LLVM bitcode and require LLVM tooling such as llvm-ar and friends. Unlike the "standard" GNU bintools, the LLVM variants can process bitcode, which is why they are required for these use cases.

Essentially, when you're using LLVM for anything that actually requires LLVM, chances are that you need a pure LLVM toolchain, including clang, lld and the LLVM bintools.
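To illustrate the bitcode point, here is a minimal sketch of a derivation that emits LLVM bitcode and archives it with the LLVM bintools. The derivation name, file names and output layout are made up for the example, and it assumes that `llvmPackages.clang` and `llvmPackages.llvm` put `clang`, `llvm-ar` and `llvm-nm` on the `PATH`; it has not been checked against a specific nixpkgs revision.

```nix
{ pkgs ? import <nixpkgs> { } }:

# Hypothetical demo derivation: all names below are illustrative only.
pkgs.runCommand "bitcode-archive-demo"
  {
    # Assumption: these two packages provide clang, llvm-ar and llvm-nm.
    nativeBuildInputs = [ pkgs.llvmPackages.clang pkgs.llvmPackages.llvm ];
  }
  ''
    echo "int add(int a, int b) { return a + b; }" > add.c

    # Emit LLVM bitcode instead of a regular object file.
    clang -c -emit-llvm add.c -o add.bc

    # llvm-ar builds a symbol index for bitcode members.
    llvm-ar rcs libadd.a add.bc

    mkdir -p $out
    cp libadd.a $out/
    # llvm-nm reads symbols straight from the bitcode archive.
    llvm-nm libadd.a > $out/symbols.txt
  ''
```

Swapping `llvm-ar` and `llvm-nm` for their GNU counterparts in a sketch like this is where the mismatch described above would be expected to show up, since the archive members are bitcode rather than regular object files.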

Technically, a similar argument doesn't hold for libstdc++ vs libc++, but I'd be most surprised if anyone deliberately wanted to use LLVM without libc++/libc++abi as it "allows" static linking, similar to musl vs glibc.

Proposition

  1. Make llvmPackages_x.stdenv a pure LLVM stdenv. That is, pull all dependencies exposed via that toolchain from the LLVM monorepo, including the LLVM bintools, the lld linker, the C++ standard library, the unwind library and the compiler runtime. This would give us a toolchain that has no dependency on GCC.

The following toolchain somewhat resembles this behavior, although it seems like a fairly hacky solution that still has some GCC residuals (see https://github.com/TraceMachina/nativelink/blob/main/tools/llvmStdenv.nix):

pkgs.overrideCC (
  llvmPackages.libcxxStdenv.override {
    targetPlatform.useLLVM = true;
  }
)
llvmPackages.clangUseLLVM

  2. Either remove the other stdenvs or rework them in a way that they're "modified" versions of the pure toolchain rather than being modifications of the GCC toolchain. In other words, override the pure toolchain with non-llvm dependencies rather than building it up from the GCC stdenv. For instance, something like llvmPackages.libstdcxxStdenv is much more intuitive than llvmPackages.libcxxStdenv.
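For context, a minimal sketch of how the workaround quoted above might be consumed today, here as the stdenv of a development shell. The binding name `llvmStdenv` and the choice of extra packages are assumptions for illustration; whether this evaluates unchanged depends on the nixpkgs revision in use.

```nix
{ pkgs ? import <nixpkgs> { } }:

let
  inherit (pkgs) llvmPackages;

  # The workaround from the issue text, bound to a hypothetical name.
  llvmStdenv = pkgs.overrideCC
    (llvmPackages.libcxxStdenv.override {
      targetPlatform.useLLVM = true;
    })
    llvmPackages.clangUseLLVM;
in
# Use the LLVM-based stdenv for the shell instead of the default one.
pkgs.mkShell.override { stdenv = llvmStdenv; } {
  # Assumption: lld and the LLVM bintools are added explicitly here.
  packages = [ llvmPackages.lld llvmPackages.bintools ];
}
```

Under the proposal, the whole `let` binding would ideally collapse to a plain `llvmPackages.stdenv`.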

This has been bothering me for quite a long time now and I'd be happy to work on this issue. I kept running into this while implementing rules_ll, encountered it again when implementing internal ML toolchains (TF/Torch/JAX), and am now hitting it once more with Rust toolchains in nativelink. I'd really like to bring llvmPackages into a state where getting a pure Clang/LLVM toolchain is a trivial one-liner.



nixos-discourse commented 3 months ago

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/building-packages-with-cflags-gentoo-style/6173/15

RossComputerGuy commented 2 months ago

This sounds like a great thing to have. Might have a look into it once I have things commonified inside of Nixpkgs and I have free time on my plate.