rocm-arch / tensorflow-rocm

tensorflow-rocm AUR package

sh: line 1: /opt/rocm/hip/bin/hipcc: No such file or directory #57

Closed lubosz closed 7 months ago

lubosz commented 1 year ago

With rocm 5.6 the build fails with the following error:

sh: line 1: /opt/rocm/hip/bin/hipcc: No such file or directory

The hip-runtime-amd package removed several symlinks in /opt/rocm/hip/bin/ that pointed to the tools in /opt/rocm/bin/, including hipcc and hipcc.pl.

$ tar -tf hip-runtime-amd-5.5.1-1-x86_64.pkg.tar.xz | grep bin
opt/rocm/bin/
opt/rocm/bin/.hipVersion
opt/rocm/bin/hipcc
opt/rocm/bin/hipcc.pl
opt/rocm/bin/hipcc_cmake_linker_helper
opt/rocm/bin/hipconfig
opt/rocm/bin/hipconfig.pl
opt/rocm/bin/hipdemangleatp
opt/rocm/bin/hipvars.pm
opt/rocm/bin/roc-obj
opt/rocm/bin/roc-obj-extract
opt/rocm/bin/roc-obj-ls
opt/rocm/hip/bin/
opt/rocm/hip/bin/.hipVersion
opt/rocm/hip/bin/hipcc
opt/rocm/hip/bin/hipcc.pl
opt/rocm/hip/bin/hipcc_cmake_linker_helper
opt/rocm/hip/bin/hipconfig
opt/rocm/hip/bin/hipconfig.pl
opt/rocm/hip/bin/hipdemangleatp
opt/rocm/hip/bin/hipvars.pm
opt/rocm/hip/bin/roc-obj
opt/rocm/hip/bin/roc-obj-extract
opt/rocm/hip/bin/roc-obj-ls
$ tar -tf hip-runtime-amd-5.6.0-1-x86_64.pkg.tar.xz | grep bin
opt/rocm/bin/
opt/rocm/bin/.hipVersion
opt/rocm/bin/hipcc
opt/rocm/bin/hipcc.bat
opt/rocm/bin/hipcc.pl
opt/rocm/bin/hipcc_cmake_linker_helper
opt/rocm/bin/hipconfig
opt/rocm/bin/hipconfig.bat
opt/rocm/bin/hipconfig.pl
opt/rocm/bin/hipdemangleatp
opt/rocm/bin/hipvars.pm
opt/rocm/bin/roc-obj
opt/rocm/bin/roc-obj-extract
opt/rocm/bin/roc-obj-ls
opt/rocm/hip/bin/
opt/rocm/hip/bin/.hipVersion
opt/rocm/hip/bin/roc-obj
opt/rocm/hip/bin/roc-obj-extract
opt/rocm/hip/bin/roc-obj-ls
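
The two listings can also be compared mechanically to see exactly which entries disappeared. This is only a minimal sketch, assuming both package archives are available locally; the function name `pkg_bin_diff` is hypothetical and not part of any packaging tool.

```shell
# Hypothetical helper: print the "bin" entries that are present in the old
# package archive ($1) but missing from the new one ($2).
pkg_bin_diff() {
    old_list=$(mktemp)
    new_list=$(mktemp)
    # LC_ALL=C keeps sort and comm in agreement about ordering.
    tar -tf "$1" | grep bin | LC_ALL=C sort > "$old_list"
    tar -tf "$2" | grep bin | LC_ALL=C sort > "$new_list"
    # -23 hides lines unique to the new list and lines common to both,
    # leaving only the entries that were dropped.
    LC_ALL=C comm -23 "$old_list" "$new_list"
    rm -f "$old_list" "$new_list"
}
```

Called as `pkg_bin_diff hip-runtime-amd-5.5.1-1-x86_64.pkg.tar.xz hip-runtime-amd-5.6.0-1-x86_64.pkg.tar.xz`, it should list the opt/rocm/hip/bin/ entries (hipcc, hipcc.pl, hipconfig and so on) that are in the 5.5.1 listing above but not in the 5.6.0 one.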

This is the commit on the hip-runtime-amd package introducing the symlink removal: https://gitlab.archlinux.org/archlinux/packaging/packages/hip-runtime-amd/-/commit/121ba843cac677528eb5409bb40c9135bfd52962

The symlink was not created by the package, but by the rocm build.

A workaround is to recreate the symlink with:

sudo ln -s /opt/rocm/bin/hipcc /opt/rocm/hip/bin/hipcc

This leads to a further error later in the build:

Can't open perl script "/opt/rocm/hip/bin//hipcc.pl": No such file or directory

This can in turn be worked around with:

sudo ln -s /opt/rocm/bin/hipcc.pl /opt/rocm/hip/bin/hipcc.pl
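
Rather than fixing one missing file at a time, all entries that vanished from /opt/rocm/hip/bin between 5.5.1 and 5.6.0 could be relinked in one pass. The following is only a sketch of that idea: the function name `relink_hip_bin` and its optional root argument are made up for illustration, the list of names is taken from the two package listings above, and it is not confirmed that the build needs every one of them.

```shell
# Hypothetical helper: recreate the legacy hip/bin entries as symlinks to
# the tools in bin. $1 is the ROCm root and defaults to /opt/rocm.
relink_hip_bin() {
    rocm_root="${1:-/opt/rocm}"
    mkdir -p "$rocm_root/hip/bin"
    # Names present in hip/bin for 5.5.1 but absent for 5.6.0.
    for name in hipcc hipcc.pl hipcc_cmake_linker_helper \
                hipconfig hipconfig.pl hipdemangleatp hipvars.pm; do
        src="$rocm_root/bin/$name"
        dst="$rocm_root/hip/bin/$name"
        # Link only if the target exists and nothing occupies the link path.
        if [ -e "$src" ] && [ ! -e "$dst" ] && [ ! -L "$dst" ]; then
            ln -s "$src" "$dst"
            echo "linked $dst -> $src"
        fi
    done
}
```

After defining the function in a root shell, `relink_hip_bin /opt/rocm` applies it to the real installation. The links are unowned files that pacman does not track, so they need to be removed by hand once the package is fixed.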

Updating tensorflow to 2.13 did not solve the issue.

Unfortunately, these workarounds do not let the build complete: I am now running into https://github.com/rocm-arch/tensorflow-rocm/issues/44, which did not happen with rocm 5.5.

lubosz commented 1 year ago

This needs to be fixed in upstream tensorflow in rocm_configure.bzl

I have a patch for the package here: https://github.com/lubosz/tensorflow-rocm/commit/9d05e13f68a879b92bdd9d3301d64f1265e72f1c

mpeschel10 commented 1 year ago

@lubosz Just FYI, I copied your patch to tensorflow-amd-git, which is now working again. Thanks a bundle!

acxz commented 10 months ago

This needs to be fixed in upstream tensorflow in rocm_configure.bzl

Do you mind submitting an upstream issue and linking it back here so that upstream knows about it? Also, if you could open an upstream PR that incorporates that patch, that would be great!

edit: upstream issue https://github.com/tensorflow/tensorflow/issues/61823

acxz commented 10 months ago

patched with https://github.com/rocm-arch/tensorflow-rocm/pull/58

acxz commented 7 months ago

Closed, as the commit is included in 2.15.0.