malmaud / TensorFlow.jl

A Julia wrapper for TensorFlow

Julia 0.6 cannot see custom libtensorflow.so #340

Open colbec opened 7 years ago

colbec commented 7 years ago

To eliminate warnings about mismatched TensorFlow versions between Python and Julia, I updated my Python 3 tensorflow to the latest nightly, which reports TF version 1.5. I then rebuilt the latest TF source with bazel to take advantage of local MKL resources, which produced libtensorflow.so. I dropped this new .so into TensorFlow/deps/usr/bin per the instructions. The result is now:

julia> Pkg.test("TensorFlow")
INFO: Testing TensorFlow
ERROR: LoadError: LoadError: could not load library "/home/colin/.julia/v0.6/TensorFlow/src/../deps/usr/bin/libtensorflow"
libtensorflow_framework.so: cannot open shared object file: No such file or directory
Stacktrace:
 [1] dlopen(::String, ::UInt32) at ./libdl.jl:97
 [2] TensorFlow.Graph() at /home/colin/.julia/v0.6/TensorFlow/src/core.jl:21
 [3] include_from_node1(::String) at ./loading.jl:569
 [4] include(::String) at ./sysimg.jl:14
 [5] include_from_node1(::String) at ./loading.jl:569
 [6] include(::String) at ./sysimg.jl:14
 [7] process_options(::Base.JLOptions) at ./client.jl:305
 [8] _start() at ./client.jl:371
while loading /home/colin/.julia/v0.6/TensorFlow/test/../examples/logistic.jl, in expression starting on line 22
while loading /home/colin/.julia/v0.6/TensorFlow/test/runtests.jl, in expression starting on line 6
=============================[ ERROR: TensorFlow ]==============================

failed process: Process(`/home/colin/Downloads/julia/bin/julia -Cx86-64 -J/home/colin/Downloads/julia/lib/julia/sys.so --compile=yes --depwarn=yes --check-bounds=yes --code-coverage=none --color=yes --compilecache=yes /home/colin/.julia/v0.6/TensorFlow/test/runtests.jl`, ProcessExited(1)) [1]

================================================================================
ERROR: TensorFlow had test errors

The files:

colin@linux-k18k:~> ls /home/colin/.julia/v0.6/TensorFlow/src/../deps/usr/bin/libtensorflow*
/home/colin/.julia/v0.6/TensorFlow/src/../deps/usr/bin/libtensorflow.so
/home/colin/.julia/v0.6/TensorFlow/src/../deps/usr/bin/libtensorflow.so.old
/home/colin/.julia/v0.6/TensorFlow/src/../deps/usr/bin/libtensorflow.so.old2

julia> versioninfo()
Julia Version 0.6.0
Commit 9036443 (2017-06-19 13:05 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i5-4460  CPU @ 3.20GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, haswell)

haldai commented 7 years ago

I've just run into this problem, too. One temporary workaround: copy libtensorflow_framework.so from your Python site-packages directory (e.g. /usr/lib/python3.6/site-packages/tensorflow/) to ~/.julia/v0.6/TensorFlow/deps/usr/bin/.
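The copy step can be sketched in Julia roughly as follows. This is only an illustration: `copy_framework_lib` is a hypothetical helper, not part of TensorFlow.jl, and both directories are assumptions taken from the example paths in this comment, so adjust them to your own installation.

```julia
# Hypothetical helper: copy libtensorflow_framework.so next to libtensorflow.so.
# Both directory arguments are assumptions; adjust them to your installation.
function copy_framework_lib(src_dir, dst_dir)
    name = "libtensorflow_framework.so"
    src = joinpath(src_dir, name)
    dst = joinpath(dst_dir, name)
    isfile(src) || error("not found: $src")
    isfile(dst) && return dst   # leave an existing copy untouched
    cp(src, dst)
    return dst
end

# Example call with the paths mentioned in this comment:
# copy_framework_lib("/usr/lib/python3.6/site-packages/tensorflow",
#                    joinpath(homedir(), ".julia", "v0.6", "TensorFlow", "deps", "usr", "bin"))
```

The helper refuses to overwrite an existing file, so a previously working copy is not silently replaced.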

malmaud commented 7 years ago

So it does see your libtensorflow.so, but if that library in turn has a relative path to another dependent library, that will trigger that error. It used to be the case that libtensorflow.so was self-contained, so if that's no longer the case, the instructions will have to be updated.

A more robust way than copying files is to set the LIBTENSORFLOW environment variable to point to your libtensorflow.so. I should update the instructions to mention that.

colbec commented 7 years ago

OK thanks @haldai for this suggestion which gets my TF working again.

@malmaud I did actually try that suggestion just now of the ENV["LIBTENSORFLOW"] pointing directly at the bazel-bin/tensorflow output but for some reason it did not like that either.

malmaud commented 7 years ago

Can you post the error? I'd like to get that working.

colbec commented 7 years ago

Here is straight TF with both libtensorflow.so and libtensorflow_framework.so in deps/usr/bin:

julia> Pkg.test("TensorFlow")
INFO: Testing TensorFlow
INFO: Checkpoint files saved in /tmp/tmpZdnOW6
Current loss is 230.42.
Current loss is 228.94.
Current loss is 227.46.
Current loss is 226.00.
... lots more, but clearly working.

Here is the ENV route:

julia> ENV["LIBTENSORFLOW"]="/home/colin/Downloads/tensorflow/bazel-bin/tensorflow/"
"/home/colin/Downloads/tensorflow/bazel-bin/tensorflow/"

julia> using TensorFlow

julia> Pkg.test("TensorFlow")
INFO: Testing TensorFlow
ERROR: LoadError: LoadError: could not load library "/home/colin/Downloads/tensorflow/bazel-bin/tensorflow/"
/home/colin/Downloads/tensorflow/bazel-bin/tensorflow/.so: cannot open shared object file: No such file or directory
Stacktrace:

and the rest as in the original post. If you need anything else let me know.

colbec commented 7 years ago

A small, perhaps related issue: once an ENV["LIBTENSORFLOW"] setting is in effect, it seems hard to reset it. For example, suppose I am running correctly with the libtensorflow and libtensorflow_framework .so files loaded, then set ENV["LIBTENSORFLOW"] to a different string and find that testing TF fails. Setting ENV["LIBTENSORFLOW"] back to an empty string or nothing does not restore the original default setup. We have to restart Julia. Not a hardship, just something to be aware of.

expnn commented 7 years ago

@colbec I've just run into the same problem. I fixed it by setting the environment variable LIBTENSORFLOW. It should be set to the absolute path of the .so file itself, not the directory that contains libtensorflow.so. In your case, it should be

julia> ENV["LIBTENSORFLOW"]="/home/colin/Downloads/tensorflow/bazel-bin/tensorflow/libtensorflow.so"
"/home/colin/Downloads/tensorflow/bazel-bin/tensorflow/libtensorflow.so"

You can set this variable in one of the .bash_profile, .bashrc, or .profile files. See http://malmaud.github.io/TensorFlow.jl/latest/build_from_source.html#Step-2:-Install-the-TensorFlow-binary-1 for details.
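A quick way to catch the directory-versus-file mistake before loading the package might look like the sketch below. `libtensorflow_path_problem` is a hypothetical helper, not part of TensorFlow.jl, and the candidate path is the one from this thread. Judging by the earlier comment that a changed value only took effect after restarting Julia, the value appears to be read when the package loads, so the sketch sets it before `using TensorFlow`.

```julia
# Hypothetical helper, not part of TensorFlow.jl: checks that a candidate
# LIBTENSORFLOW value names the shared library file rather than a directory.
function libtensorflow_path_problem(path)
    if isdir(path)
        return "path is a directory; append libtensorflow.so"
    elseif !isfile(path)
        return "no file at this path"
    end
    return ""   # an empty string means no problem was found
end

# Path taken from this thread; replace it with your own build output.
candidate = "/home/colin/Downloads/tensorflow/bazel-bin/tensorflow/libtensorflow.so"
problem = libtensorflow_path_problem(candidate)
if isempty(problem)
    ENV["LIBTENSORFLOW"] = candidate   # set before `using TensorFlow`
else
    println("LIBTENSORFLOW not set: ", problem)
end
```

Passing the bare bazel-bin/tensorflow/ directory to the helper reports the directory problem, which corresponds to the "could not load library" failure shown earlier in the thread.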