rocm-arch / tensorflow-rocm

tensorflow-rocm AUR package

Error while installing tensorflow-rocm #52

Closed KalilovM closed 10 months ago

KalilovM commented 1 year ago
ERROR: /home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.12.0-rocm/tensorflow/compiler/xla/pjrt/BUILD:469:11: in cc_library rule //tensorflow/compiler/xla/pjrt:pjrt_future: target '@tf_runtime//:support' is not visible from target '//tensorflow/compiler/xla/pjrt:pjrt_future'. Check the visibility declaration of the former target if you think the dependency is legitimate
ERROR: /home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.12.0-rocm/tensorflow/compiler/xla/pjrt/BUILD:469:11: Analysis of target '//tensorflow/compiler/xla/pjrt:pjrt_future' failed
ERROR: Analysis of target '//tensorflow:libtensorflow.so' failed; build aborted:
INFO: Elapsed time: 12.645s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (541 packages loaded, 23635 targets configured)
    currently loading: @arm_neon_2_x86_sse// ... (11 packages)
==> ERROR: A failure occurred in build().
    Aborting...
 -> error making: tensorflow-rocm-exit status 4
checking dependencies...

System: Linux 6.3.7-arch1-1, Python 3.11.3

mpeschel10 commented 1 year ago

Hi. Your error is as far as I ever got compiling this PKGBUILD.

Instead, I wrote my own PKGBUILD for tensorflow-rocm based on the official docker image. If you're interested, try:

git clone https://aur.archlinux.org/tensorflow-amd-git.git
cd tensorflow-amd-git
makepkg -s
sudo pacman -U tensorflow-amd-git*.pkg.tar.zst
sudo pacman -U python-tensorflow-amd-git*.pkg.tar.zst
python test.py

Please let me know how it goes; if my build works for other people, I will try to get the changes merged into this repository. Edit: Apparently acxz intended for this package to eventually be merged with the tensorflow PKGBUILD in the extra repository, so my work, which uses the tensorflow-rocm upstream, probably will not make it in.

Edit 2: Actually, I've been trying to reproduce this, and I can't anymore. Could you post the output of pacman -Qtt and any modifications you've made to the PKGBUILD?

rustatian commented 1 year ago

Hey 👋🏻 Got a similar error on a clean install:

INFO: Found applicable config definition build:dynamic_kernels in file /home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.12.0-rocm/.bazelrc: --define=dynamic_loaded_kernels=true --copt=-DAUTOLOAD_DYNAMIC_KERNELS
ERROR: /home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.12.0-rocm/tensorflow/compiler/xla/pjrt/BUILD:469:11: in cc_library rule //tensorflow/compiler/xla/pjrt:pjrt_future: target '@tf_runtime//:support' is not visible from target '//tensorflow/compiler/xla/pjrt:pjrt_future'. Check the visibility declaration of the former target if you think the dependency is legitimate
ERROR: /home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.12.0-rocm/tensorflow/compiler/xla/pjrt/BUILD:469:11: Analysis of target '//tensorflow/compiler/xla/pjrt:pjrt_future' failed

ERROR: Analysis of target '//tensorflow:libtensorflow.so' failed; build aborted:

rustatian commented 1 year ago

@mpeschel10 Tried your build, also failed (but with different error) 😢 :

FATAL: bazel crashed due to an internal error. Printing stack trace:
java.lang.ExceptionInInitializerError
    at com.google.devtools.build.lib.actions.ParameterFile.writeContent(ParameterFile.java:118)
    at com.google.devtools.build.lib.actions.ParameterFile.writeParameterFile(ParameterFile.java:111)
    at com.google.devtools.build.lib.analysis.actions.ParameterFileWriteAction$ParamFileWriter.writeOutputFile(ParameterFileWriteAction.java:170)
    at com.google.devtools.build.lib.exec.FileWriteStrategy.beginWriteOutputToFile(FileWriteStrategy.java:58)
    at com.google.devtools.build.lib.analysis.actions.FileWriteActionContext.beginWriteOutputToFile(FileWriteActionContext.java:49)
    at com.google.devtools.build.lib.analysis.actions.AbstractFileWriteAction.beginExecution(AbstractFileWriteAction.java:66)
    at com.google.devtools.build.lib.actions.Action.execute(Action.java:133)
    at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$5.execute(SkyframeActionExecutor.java:907)
    at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$ActionRunner.continueAction(SkyframeActionExecutor.java:1076)
    at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$ActionRunner.run(SkyframeActionExecutor.java:1031)
    at com.google.devtools.build.lib.skyframe.ActionExecutionState.runStateMachine(ActionExecutionState.java:152)
    at com.google.devtools.build.lib.skyframe.ActionExecutionState.getResultOrDependOnFuture(ActionExecutionState.java:91)
    at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor.executeAction(SkyframeActionExecutor.java:492)
    at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.checkCacheAndExecuteIfNeeded(ActionExecutionFunction.java:856)
    at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.computeInternal(ActionExecutionFunction.java:349)
    at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.compute(ActionExecutionFunction.java:169)
    at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:590)
    at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:382)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    at java.base/java.lang.Thread.run(Thread.java:1623)
Caused by: java.lang.reflect.InaccessibleObjectException: Unable to make java.lang.String(byte[],byte) accessible: module java.base does not "opens java.lang" to unnamed module @63be40d8
    at java.base/java.lang.reflect.AccessibleObject.throwInaccessibleObjectException(AccessibleObject.java:387)
    at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:363)
    at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:311)
    at java.base/java.lang.reflect.Constructor.checkCanSetAccessible(Constructor.java:192)
    at java.base/java.lang.reflect.Constructor.setAccessible(Constructor.java:185)
    at com.google.devtools.build.lib.unsafe.StringUnsafe.<init>(StringUnsafe.java:75)
    at com.google.devtools.build.lib.unsafe.StringUnsafe.initInstance(StringUnsafe.java:56)
    at com.google.devtools.build.lib.unsafe.StringUnsafe.<clinit>(StringUnsafe.java:37)
    ... 21 more
==> ERROR: A failure occurred in build().

mpeschel10 commented 1 year ago

@mpeschel10 Tried your build, also failed (but with different error) 😢 :

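Thank you for the feedback! I'm sorry for the trouble. I got that error when my Java version was too high; I think Bazel 5.4 relies on something that newer JDKs no longer allow (the trace shows a reflective access into java.lang that the module system rejects). Some suggestions: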

# Confirm that PATH is being set properly in PKGBUILD:
grep PKGBUILD -e 'PATH="/usr/lib/jvm/java-11-openjdk/bin:$PATH"'
# Make sure Java 11 is installed (this installs it if it is missing):
sudo pacman -S java-environment=11
# Confirm that I guessed correctly where the JVMs are on your system:
[ -d /usr/lib/jvm ] && echo jvm path exists ok || echo jvm path does NOT exist
[ -d /usr/lib/jvm/java-11-openjdk ] && echo java 11 path exists ok || echo java 11 path does NOT exist
[ -d /usr/lib/jvm/java-11-openjdk/bin ] && echo java 11 bin path exists ok || echo java 11 bin does NOT exist

If I got the jvm dir wrong, please share the output of:

which java
ls -l /usr/lib/jvm
echo $PATH

I've also pushed a change to the PKGBUILD that should print your java and javac versions at build time. When you run makepkg, confirm you see something like:

==> Starting prepare()...
openjdk 11.0.19 2023-04-18
OpenJDK Runtime Environment (build 11.0.19+7)
OpenJDK 64-Bit Server VM (build 11.0.19+7, mixed mode)
javac 11.0.19
bazel 5.4.0

As a last resort, I suspect bazel caches your jvm somehow. Try sudo rm -rf ~/.cache/bazel. This will, unfortunately, restart the build from scratch. Good luck.
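The version check above can be sketched as one small script. This is only an illustration, assuming `java --version` prints a first line shaped like the `openjdk 11.0.19 2023-04-18` sample shown earlier; the function names `java_major` and `java_is_major` are made up here and are not part of the PKGBUILD:

```shell
#!/bin/sh
# Sketch: report whether the `java` found first on PATH is major version 11.

# Print the major version from the first line of `java --version`,
# e.g. "openjdk 11.0.19 2023-04-18" -> 11. (Hypothetical helper.)
java_major() {
    # Deliberately unquoted: split the banner line into words.
    set -- $1
    # The second word is the version; keep everything before the first dot.
    printf '%s\n' "${2%%.*}"
}

# Succeed only if banner line $2 reports major version $1. (Hypothetical helper.)
java_is_major() {
    [ "$(java_major "$2")" = "$1" ]
}

if command -v java >/dev/null 2>&1; then
    banner=$(java --version 2>/dev/null | head -n 1)
    if java_is_major 11 "$banner"; then
        echo "java on PATH is version 11: ok"
    else
        echo "java on PATH is NOT version 11: $banner"
    fi
else
    echo "no java on PATH"
fi
```

If this reports a version other than 11 even though java-11-openjdk is installed, a newer JDK is probably earlier on PATH than /usr/lib/jvm/java-11-openjdk/bin.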

rustatian commented 1 year ago

Thanks for the update @mpeschel10 👍🏻

I tried the updated build, and it seems that the problem with your previous build was my up-to-date Java environment 😄.

I added /usr/lib/jvm/java-11-openjdk/bin to $PATH, removed the jdk-openjdk package, and everything seems to work fine. Unfortunately, the upstream Arch Linux tensorflow-rocm package still seems to be broken by the original problem (I also tried building it once more with Java 11 added to $PATH). It would be cool if you could merge your changes upstream. 👍🏻
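The PATH change described above can also be limited to a single build instead of the whole session. A minimal sketch, assuming the Java 11 binaries live in /usr/lib/jvm/java-11-openjdk/bin as elsewhere in this thread; `prepend_once` is a hypothetical helper name, not something the PKGBUILD provides:

```shell
#!/bin/sh
# Print a PATH-style string with directory $1 placed in front of $2.
# If $1 is already the first entry of $2, print $2 unchanged.
# (Hypothetical helper, for illustration only.)
prepend_once() {
    case ":$2:" in
        ":$1:"*) printf '%s\n' "$2" ;;          # already first: keep as-is
        *)       printf '%s\n' "$1${2:+:$2}" ;; # otherwise put $1 in front
    esac
}

# Show the PATH a Java 11 build would see (the current PATH is not modified).
prepend_once /usr/lib/jvm/java-11-openjdk/bin "$PATH"
```

To use it for one build only, run `PATH=$(prepend_once /usr/lib/jvm/java-11-openjdk/bin "$PATH") makepkg -s`. The assignment prefix applies to that one makepkg invocation, so the rest of the shell session keeps its usual Java.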

acxz commented 10 months ago

I'm going to close this issue as the conversation has drifted from the original issue (nothing wrong with that! It's bound to happen when something is broken for a long time).

@KalilovM the PKGBUILD has been updated, please try to build again and report back any errors as new issues (if they are not already listed in the currently open issues).