rocm-arch / tensorflow-rocm

tensorflow-rocm AUR package

Error while installing tensorflow-rocm #52

Closed KalilovM closed 1 year ago

KalilovM commented 1 year ago
ERROR: /home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.12.0-rocm/tensorflow/compiler/xla/pjrt/BUILD:469:11: in cc_library rule //tensorflow/compiler/xla/pjrt:pjrt_future: target '@tf_runtime//:support' is not visible from target '//tensorflow/compiler/xla/pjrt:pjrt_future'. Check the visibility declaration of the former target if you think the dependency is legitimate
ERROR: /home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.12.0-rocm/tensorflow/compiler/xla/pjrt/BUILD:469:11: Analysis of target '//tensorflow/compiler/xla/pjrt:pjrt_future' failed
ERROR: Analysis of target '//tensorflow:libtensorflow.so' failed; build aborted:
INFO: Elapsed time: 12.645s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (541 packages loaded, 23635 targets configured)
    currently loading: @arm_neon_2_x86_sse// ... (11 packages)
==> ERROR: A failure occurred in build().
    Aborting...
 -> error making: tensorflow-rocm-exit status 4
checking dependencies...

Kernel 6.3.7-arch1-1, Python 3.11.3

mpeschel10 commented 1 year ago

Hi. Your error is as far as I ever got compiling this PKGBUILD.

Instead, I wrote my own PKGBUILD for tensorflow-rocm based on the official docker image. If you're interested, try:

git clone https://aur.archlinux.org/tensorflow-amd-git.git
cd tensorflow-amd-git
makepkg -s
sudo pacman -U tensorflow-amd-git*.pkg.tar.zst
sudo pacman -U python-tensorflow-amd-git*.pkg.tar.zst
python test.py

Please let me know how it goes; if my build works for other people, I will try to get the changes merged into this repository.

Edit: Apparently acxz intends for this PKGBUILD to eventually be merged with the tensorflow PKGBUILD in the extra repository, so my work, which builds from the tensorflow-rocm upstream, probably will not make it in.

Edit 2: Actually, I've been trying to reproduce this, and I can't anymore. Could you post the output of pacman -Qtt and any modifications you've made to the PKGBUILD?
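
For anyone asked for the same details, here is a minimal sketch of collecting them into one file. It assumes the package was built through yay, so the PKGBUILD checkout is the git repository under ~/.cache/yay/tensorflow-rocm (the path that appears in the logs above). The function name collect_build_info and the report file name are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: gather the requested information into a single report file.
# Assumption: the AUR checkout lives in ~/.cache/yay/tensorflow-rocm.

collect_build_info() {
    local pkgdir="$1" out="$2"
    {
        echo "== pacman -Qtt =="
        if command -v pacman >/dev/null 2>&1; then
            pacman -Qtt || echo "(no packages listed)"
        else
            echo "pacman not found"
        fi
        echo "== local PKGBUILD modifications =="
        if [ -d "$pkgdir/.git" ] && command -v git >/dev/null 2>&1; then
            # Shows uncommitted edits to the PKGBUILD relative to the AUR version.
            git -C "$pkgdir" diff -- PKGBUILD || true
        else
            echo "no git checkout at $pkgdir"
        fi
    } > "$out" 2>&1
}

collect_build_info "${HOME}/.cache/yay/tensorflow-rocm" "${TMPDIR:-/tmp}/build-info.txt"
```

The resulting file can be pasted into a comment as-is. An empty diff section means the PKGBUILD was not modified locally.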

rustatian commented 1 year ago

Hey 👋🏻 Got a similar error on a clean install:

INFO: Found applicable config definition build:dynamic_kernels in file /home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.12.0-rocm/.bazelrc: --define=dynamic_loaded_kernels=true --copt=-DAUTOLOAD_DYNAMIC_KERNELS
ERROR: /home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.12.0-rocm/tensorflow/compiler/xla/pjrt/BUILD:469:11: in cc_library rule //tensorflow/compiler/xla/pjrt:pjrt_future: target '@tf_runtime//:support' is not visible from target '//tensorflow/compiler/xla/pjrt:pjrt_future'. Check the visibility declaration of the former target if you think the dependency is legitimate
ERROR: /home/user/.cache/yay/tensorflow-rocm/src/tensorflow-2.12.0-rocm/tensorflow/compiler/xla/pjrt/BUILD:469:11: Analysis of target '//tensorflow/compiler/xla/pjrt:pjrt_future' failed
ERROR: Analysis of target '//tensorflow:libtensorflow.so' failed; build aborted:

rustatian commented 1 year ago

@mpeschel10 Tried your build, also failed (but with different error) 😢 :

FATAL: bazel crashed due to an internal error. Printing stack trace:
java.lang.ExceptionInInitializerError
    at com.google.devtools.build.lib.actions.ParameterFile.writeContent(ParameterFile.java:118)
    at com.google.devtools.build.lib.actions.ParameterFile.writeParameterFile(ParameterFile.java:111)
    at com.google.devtools.build.lib.analysis.actions.ParameterFileWriteAction$ParamFileWriter.writeOutputFile(ParameterFileWriteAction.java:170)
    at com.google.devtools.build.lib.exec.FileWriteStrategy.beginWriteOutputToFile(FileWriteStrategy.java:58)
    at com.google.devtools.build.lib.analysis.actions.FileWriteActionContext.beginWriteOutputToFile(FileWriteActionContext.java:49)
    at com.google.devtools.build.lib.analysis.actions.AbstractFileWriteAction.beginExecution(AbstractFileWriteAction.java:66)
    at com.google.devtools.build.lib.actions.Action.execute(Action.java:133)
    at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$5.execute(SkyframeActionExecutor.java:907)
    at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$ActionRunner.continueAction(SkyframeActionExecutor.java:1076)
    at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$ActionRunner.run(SkyframeActionExecutor.java:1031)
    at com.google.devtools.build.lib.skyframe.ActionExecutionState.runStateMachine(ActionExecutionState.java:152)
    at com.google.devtools.build.lib.skyframe.ActionExecutionState.getResultOrDependOnFuture(ActionExecutionState.java:91)
    at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor.executeAction(SkyframeActionExecutor.java:492)
    at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.checkCacheAndExecuteIfNeeded(ActionExecutionFunction.java:856)
    at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.computeInternal(ActionExecutionFunction.java:349)
    at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.compute(ActionExecutionFunction.java:169)
    at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:590)
    at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:382)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    at java.base/java.lang.Thread.run(Thread.java:1623)
Caused by: java.lang.reflect.InaccessibleObjectException: Unable to make java.lang.String(byte[],byte) accessible: module java.base does not "opens java.lang" to unnamed module @63be40d8
    at java.base/java.lang.reflect.AccessibleObject.throwInaccessibleObjectException(AccessibleObject.java:387)
    at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:363)
    at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:311)
    at java.base/java.lang.reflect.Constructor.checkCanSetAccessible(Constructor.java:192)
    at java.base/java.lang.reflect.Constructor.setAccessible(Constructor.java:185)
    at com.google.devtools.build.lib.unsafe.StringUnsafe.<init>(StringUnsafe.java:75)
    at com.google.devtools.build.lib.unsafe.StringUnsafe.initInstance(StringUnsafe.java:56)
    at com.google.devtools.build.lib.unsafe.StringUnsafe.<clinit>(StringUnsafe.java:37)
    ... 21 more
==> ERROR: A failure occurred in build().

mpeschel10 commented 1 year ago

> @mpeschel10 Tried your build, also failed (but with different error) 😢 :

Thank you for the feedback! I'm sorry for the trouble. I got that error when my Java version was too new; I think bazel 5.4 relies on reflective access to JDK internals that newer Java releases block (the InaccessibleObjectException in your trace). Some suggestions:

# Confirm that PATH is being set properly in PKGBUILD:
grep -F -e 'PATH="/usr/lib/jvm/java-11-openjdk/bin:$PATH"' PKGBUILD
# Make sure java 11 is installed (this installs it if it is missing):
sudo pacman -S java-environment=11
# Confirm that I guessed correctly for where the jvms are in your system:
[ -d /usr/lib/jvm ] && echo jvm path exists ok || echo jvm path does NOT exist
[ -d /usr/lib/jvm/java-11-openjdk ] && echo java 11 path exists ok || echo java 11 path does NOT exist
[ -d /usr/lib/jvm/java-11-openjdk/bin ] && echo java 11 bin path exists ok || echo java 11 bin does NOT exist

If I got the jvm dir wrong, please share the output of:

which java
ls -l /usr/lib/jvm
echo $PATH

I've also pushed a change to the PKGBUILD that should print your java and javac versions at build time. When you run makepkg, confirm you see something like:

==> Starting prepare()...
openjdk 11.0.19 2023-04-18
OpenJDK Runtime Environment (build 11.0.19+7)
OpenJDK 64-Bit Server VM (build 11.0.19+7, mixed mode)
javac 11.0.19
bazel 5.4.0
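
If you would rather check the version automatically than read it off the log, here is a rough sketch. It assumes the version string is on the first line of the `java -version` output, as in the sample above, and the helper name java_major_from_line is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: warn if the java on PATH is not version 11.

# Extract the major version from the first line of `java -version` output.
# Handles "openjdk 11.0.19 2023-04-18" and the legacy "1.8.0_292" scheme.
java_major_from_line() {
    local ver
    ver="$(printf '%s\n' "$1" | sed -n 's/^[^0-9]*\([0-9][0-9._]*\).*/\1/p')"
    case "$ver" in
        1.*) ver="${ver#1.}" ;;   # legacy scheme: 1.8.x means Java 8
    esac
    printf '%s\n' "${ver%%[._]*}"
}

if command -v java >/dev/null 2>&1; then
    # java prints its version banner on stderr, hence the redirect.
    first_line="$(java -version 2>&1 | head -n 1)"
    major="$(java_major_from_line "$first_line")"
    if [ "$major" = "11" ]; then
        echo "java $major found, ok"
    else
        echo "java $major found, but this build expects 11" >&2
    fi
else
    echo "no java on PATH" >&2
fi
```

Run it in the same shell you run makepkg from, since the result depends on that shell's PATH.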

As a last resort, I suspect bazel caches your jvm somehow. Try sudo rm -rf ~/.cache/bazel. This will, unfortunately, restart the build from scratch. Good luck.

rustatian commented 1 year ago

Thanks for the update @mpeschel10 👍🏻

I tried the updated build, and it seems that the problem with your previous build was my up-to-date Java environment 😄.

I added /usr/lib/jvm/java-11-openjdk/bin to the $PATH, removed the jdk-openjdk package and everything seems to work fine. Unfortunately, the upstream ArchLinux tensorflow-rocm package seems to be broken due to the original problem (I also tried to build it one more time with the java-11 added to the $PATH). Would be cool if you could merge your changes upstream. 👍🏻
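
For others hitting the same problem, the PATH change can be limited to the current shell instead of being made permanent. A minimal sketch, assuming the Java 11 directory mentioned above; prepend_path is a hypothetical helper, not something from the PKGBUILD.

```shell
#!/usr/bin/env bash
# Sketch: put the Java 11 bin directory first on PATH for this shell only.

# Prepend a directory to PATH if it exists and is not already the first entry.
prepend_path() {
    local dir="$1"
    [ -d "$dir" ] || return 0            # nothing to do if the directory is missing
    case "$PATH" in
        "$dir"|"$dir:"*) ;;              # already first; leave PATH unchanged
        *) PATH="$dir:$PATH" ;;
    esac
}

prepend_path /usr/lib/jvm/java-11-openjdk/bin
export PATH

# Then build in the same shell, for example:
#   makepkg -s
```

Because the directory is placed first, its java wins over any newer JDK that is still installed. Arch also ships an archlinux-java helper for switching the system default JVM, which may be an alternative to editing PATH.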

acxz commented 1 year ago

I'm going to close this issue as the conversation has derailed from the original issue (nothing wrong with that! bound to happen when something is broken for a long time).

@KalilovM the PKGBUILD has been updated, please try to build again and report back any errors as new issues (if they are not already listed in the currently open issues).