rocm-arch / tensorflow-rocm

tensorflow-rocm AUR package

gcc: fatal error: cannot execute ‘cc1plus’: execvp: No such file or directory #56

Closed mpeschel10 closed 2 months ago

mpeschel10 commented 1 year ago

The line export GCC_HOST_COMPILER_PATH=/opt/cuda/bin/gcc in the PKGBUILD causes this error:

INFO: Found 4 targets...
ERROR: /home/mpeschel/.cache/bazel/_bazel_mpeschel/ab77753f39f15e118fceab83905745b1/external/flatbuffers/src/BUILD.bazel:9:11: Compiling src/util.cpp failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE -fstack-protector -Wall -Wunused-but-set-parameter -Wno-free-nonheap-object -fno-omit-frame-pointer ... (remaining 34 arguments skipped)
/home/mpeschel/.cache/bazel/_bazel_mpeschel/ab77753f39f15e118fceab83905745b1/execroot/org_tensorflow/external/local_config_rocm/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc:23: DeprecationWarning: 'pipes' is deprecated and slated for removal in Python 3.13
  import pipes
gcc: fatal error: cannot execute ‘cc1plus’: execvp: No such file or directory
compilation terminated.
INFO: Elapsed time: 3.295s, Critical Path: 0.07s

However, I definitely have cc1plus on my system:

[mpeschel@suzerain tensorflow-rocm]$ find /usr/lib -name cc1plus
/usr/lib/gcc/x86_64-pc-linux-gnu/13.1.1/cc1plus
/usr/lib/gcc/x86_64-pc-linux-gnu/12.3.0/cc1plus
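One way to narrow this down is to ask each compiler driver where it expects to find cc1plus, rather than searching the filesystem. The following is a minimal sketch, not part of the original report: check_cc1plus is a hypothetical helper name, and the two driver paths in the loop are simply the ones discussed in this issue. It relies on GCC's -print-prog-name option, which prints the bare program name when the driver cannot locate it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the PKGBUILD): report which cc1plus
# a given GCC driver would execute.
check_cc1plus() {
    local driver="$1"
    if ! command -v "$driver" >/dev/null 2>&1; then
        echo "$driver: not found"
        return 1
    fi
    local prog
    prog="$("$driver" -print-prog-name=cc1plus)"
    # GCC echoes the bare name back when it cannot locate the program.
    if [ "$prog" = "cc1plus" ]; then
        echo "$driver: cc1plus not located"
        return 2
    fi
    echo "$driver: $prog"
}

# Paths taken from this issue; either may be absent on other systems.
for d in /opt/cuda/bin/gcc /usr/bin/gcc-12; do
    check_cc1plus "$d" || true
done
```

If the two drivers report different results, that would point at how the driver is invoked rather than at a missing file.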

I suspect the problem is that /opt/cuda/bin/gcc is a symlink:

[mpeschel@suzerain ~]$ ls -l /opt/cuda/bin/gcc
lrwxrwxrwx 1 root root 15 Jun 17 14:10 /opt/cuda/bin/gcc -> /usr/bin/gcc-12

When I replace the line with export GCC_HOST_COMPILER_PATH=/usr/bin/gcc-12, I can continue the build.

The /opt/cuda/bin/gcc path comes from the upstream PKGBUILD. We know it works for them, since they build tensorflow-cuda for the repositories, so I have no idea what I'm missing. Does this work for anyone else? The arch4edu build swaps it out for gcc-11, so clearly some people have trouble with it. Unfortunately, the commit that introduced the change does not explain how it is supposed to work.

If nobody has a better idea, I will wait until pull #55 is closed before opening a pull request with the resolved symlink.
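The "resolved symlink" idea could be sketched roughly as follows. This is an illustration only, assuming GNU coreutils readlink is available; resolve_compiler is a hypothetical helper name, and the demo builds a throwaway symlink in a temporary directory so it does not depend on /opt/cuda being installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: follow every symlink and print the canonical path.
resolve_compiler() {
    readlink -f -- "$1"
}

# Self-contained demo: a fake "gcc" symlink pointing at a fake "gcc-12".
demo_dir="$(mktemp -d)"
touch "$demo_dir/gcc-12"
ln -s "$demo_dir/gcc-12" "$demo_dir/gcc"

resolved="$(resolve_compiler "$demo_dir/gcc")"
echo "$resolved"

rm -rf "$demo_dir"
```

In a PKGBUILD this would amount to something like export GCC_HOST_COMPILER_PATH="$(readlink -f /opt/cuda/bin/gcc)", although that still requires the cuda package to be installed at build time.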

lubosz commented 1 year ago

I suppose this is a relic from the CUDA version of this package. I am also replacing this with an older GCC version; 11 worked for me last time. If 12 works, that would be better, since it's packaged.

The /opt/cuda/bin/gcc path is definitely wrong, since the package does not depend on cuda, and it shouldn't depend on it just for a symlink.

acxz commented 1 year ago

Patched with cf344c9.

acxz commented 2 months ago

This is no longer a stopgap patch, as the fix is proper, so I am closing the issue. Note: upstream piggybacks off of NVCC_CCBIN and uses:

  export GCC_HOST_COMPILER_PATH="${NVCC_CCBIN/++/cc}"

https://gitlab.archlinux.org/archlinux/packaging/packages/tensorflow/-/blob/main/PKGBUILD?ref_type=heads#L116-118
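For readers unfamiliar with the ${var/pattern/replacement} form, the expansion above replaces the first occurrence of ++ in NVCC_CCBIN with cc, turning a g++ path into the matching gcc path. A minimal sketch, assuming for illustration that NVCC_CCBIN is set to /usr/bin/g++-12 (the actual value depends on the installed cuda package):

```shell
#!/usr/bin/env bash
# Assumed value for illustration only; the real one is set by the
# cuda package's environment and may name a different GCC version.
NVCC_CCBIN="/usr/bin/g++-12"

# Replace the first "++" with "cc": g++-12 becomes gcc-12.
GCC_HOST_COMPILER_PATH="${NVCC_CCBIN/++/cc}"

echo "$GCC_HOST_COMPILER_PATH"   # prints /usr/bin/gcc-12
```

Because the result is derived from an absolute /usr/bin path, it does not go through the /opt/cuda/bin/gcc symlink that triggered this issue.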