bazel-contrib / rules_cuda

Starlark implementation of bazel rules for CUDA.
https://bazel-contrib.github.io/rules_cuda/
MIT License

nvlink fatal : Unknown option '-host-ccbin' #287

Closed: grueyg closed this issue 2 weeks ago

grueyg commented 2 weeks ago

My project builds fine on my original machine (Ubuntu 18.04, GCC 11.4, NVCC 11.8), but after switching to a different machine (Ubuntu 22.04, GCC 11.4, NVCC 11.8) the build fails during device linking.

Here's the cuda_library code I'm using:

load("@rules_cuda//cuda:defs.bzl", "cuda_library")

package(default_visibility = ["//visibility:public"])

cuda_library(
    name = "phantom",
    srcs = glob(["src/*.cu", "src/host/*.cu", "src/ntt/*.cu"]),
    hdrs = glob(["include/*.h", "include/*.cuh", "include/host/*.h"]),
    includes = ["include"],
    copts = [
        "-std=c++17",
    ],
    rdc = True,
    deps = [
        "@local_cuda//:cuda_runtime",
    ],
)

Here's the build command and resulting error:

(base) userA@hr-iw4210:~/exp/myproject$ bazel build //libspu/mpc/cheetah/arith:cheetah_dot_test --@rules_cuda//cuda:enable=True
INFO: Analyzed target //libspu/mpc/cheetah/arith:cheetah_dot_test (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
ERROR: /home/userA/exp/myproject/libspu/third_party/phantom/BUILD.bazel:5:13: Device linking bazel-out/k8-fastbuild/bin/libspu/third_party/phantom/_objs/phantom/phantom_dlink.rdc.pic.o failed: (Exit 1): nvcc failed: error executing command (from target //libspu/third_party/phantom:phantom) /usr/lib/nvidia-cuda-toolkit/bin/nvcc -dlink -Xcompiler -fPIC -ccbin /usr/bin/gcc -o bazel-out/k8-fastbuild/bin/libspu/third_party/phantom/_objs/phantom/phantom_dlink.rdc.pic.o --expt-relaxed-constexpr ... (remaining 34 arguments skipped)

Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
nvlink fatal   : Unknown option '-host-ccbin'
Target //libspu/mpc/cheetah/arith:cheetah_dot_test failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 0.576s, Critical Path: 0.25s
INFO: 40 processes: 40 internal.
FAILED: Build did NOT complete successfully

I searched the rules_cuda source code and found no mention of -host-ccbin. Could you help identify where this option comes from, or suggest how to resolve the error? Any guidance is appreciated.
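One way to narrow down where the option comes from is to capture the full command line (the build output above already suggests `--verbose_failures`) and search the saved log for it. The following is a minimal sketch, not a definitive procedure: the helper `find_option` and the log file name `build.log` are hypothetical names introduced here for illustration.

```shell
#!/usr/bin/env bash
# find_option OPTION FILE
# Print every line of FILE that contains OPTION, prefixed with its line number.
# Hypothetical helper for illustration; it is not part of rules_cuda or Bazel.
find_option() {
  local option="$1" file="$2"
  # -F treats OPTION as a fixed string; -- stops grep from parsing a
  # leading '-' in OPTION as one of its own flags.
  grep -n -F -- "$option" "$file" || echo "no occurrence of $option in $file"
}

# Example usage (left commented out so sourcing this file has no side effects):
# bazel build //libspu/mpc/cheetah/arith:cheetah_dot_test \
#   --@rules_cuda//cuda:enable=True --verbose_failures 2>&1 | tee build.log
# find_option '-host-ccbin' build.log
```

The `nvlink fatal` line itself will always match. The interesting question is whether the option also shows up inside the nvcc command line that Bazel printed. If it does not, the option was presumably added after Bazel handed control to nvcc rather than by rules_cuda.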

grueyg commented 2 weeks ago

It seems that switching to NVCC 11.7 resolved the issue, although I’m still unsure about the root cause.
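Since changing the NVCC version changed the outcome, one thing worth ruling out is that nvcc and nvlink come from different toolkit installations. The sketch below prints the resolved path and version banner for each tool. It assumes both binaries accept `--version`, and `report_tool` is a hypothetical helper name. Note that the build log shows nvcc at /usr/lib/nvidia-cuda-toolkit/bin/nvcc, which may differ from whatever is first on PATH, so the helper also accepts a full path.

```shell
#!/usr/bin/env bash
# report_tool TOOL
# Print where TOOL resolves to and the tail of its version banner,
# or a note if it cannot be found. TOOL may be a bare name or a full path.
# Hypothetical helper for illustration only.
report_tool() {
  local tool="$1"
  local path
  if path="$(command -v "$tool")"; then
    echo "$tool: $path"
    # Assumption: the tool supports --version (nvcc and nvlink both do,
    # as far as I know); only the last two lines carry the release info.
    "$tool" --version 2>&1 | tail -n 2
  else
    echo "$tool: not found on PATH"
  fi
}

report_tool nvcc
report_tool nvlink
```

If the two banners report different releases, or the paths point into different install prefixes, that mismatch would be a plausible place to look next. It does not by itself confirm the root cause.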