eaplatanios / tensorflow_scala

TensorFlow API for the Scala Programming Language
http://platanios.org/tensorflow_scala/
Apache License 2.0
936 stars, 96 forks

Support various versions of cuda #162

Open langley opened 5 years ago

langley commented 5 years ago

I'm exploring using your package on Google's Colaboratory by leveraging almond.sh. The non-GPU version works fine, THANKS, this is awesome!

But when I try to use the linux-gpu-x86_64 version, I get the following error:

java.lang.UnsatisfiedLinkError: /tmp/tensorflow_scala_native_libraries13086553369265287406/libtensorflow_jni.so: libcublas.so.9.0: cannot open shared object file: No such file or directory

I believe this is similar to TensorFlow issue #15604.

When I checked what was installed under /usr/local, I saw:

...
lrwxrwxrwx 1 root root    9 Apr 4 20:13 cuda -> cuda-10.0
drwxr-xr-x 1 root root 4096 Apr 4 20:11 cuda-10.0
...
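The mismatch can also be checked programmatically. The following is a minimal sketch, not part of tensorflow_scala: the class `CudaLibCheck` and its methods are hypothetical names made up for illustration. It extracts the library name that the dynamic loader could not find from the `UnsatisfiedLinkError` message and lists the `cuda*` entries under a directory such as /usr/local, so the two can be compared.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of tensorflow_scala.
public class CudaLibCheck {
    // Matches "<name>.so[.version]: cannot open shared object file" in a loader message.
    private static final Pattern MISSING =
        Pattern.compile("([^\\s:/]+\\.so[.\\d]*): cannot open shared object file");

    /** Returns the shared library the dynamic loader failed to open, if the message names one. */
    public static Optional<String> missingLibrary(String message) {
        Matcher m = MISSING.matcher(message);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    /** Lists entries whose names start with "cuda" directly under root, sorted by name. */
    public static List<String> cudaInstalls(Path root) throws IOException {
        List<String> names = new ArrayList<>();
        if (!Files.isDirectory(root)) {
            return names; // nothing to report if the directory does not exist
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(root, "cuda*")) {
            for (Path entry : entries) {
                names.add(entry.getFileName().toString());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws IOException {
        // The error message reported above, copied verbatim.
        String message = "/tmp/tensorflow_scala_native_libraries13086553369265287406/"
            + "libtensorflow_jni.so: libcublas.so.9.0: cannot open shared object file: "
            + "No such file or directory";
        System.out.println("Missing library: " + missingLibrary(message).orElse("(none found)"));
        System.out.println("CUDA entries under /usr/local: " + cudaInstalls(Paths.get("/usr/local")));
    }
}
```

For the error above, `missingLibrary` yields `libcublas.so.9.0`, which points at CUDA 9.0, while the directory listing shows only CUDA 10.0 installed.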

My guess is that the "fix" for this is to compile tensorflow from source against that version of the cuda libraries. So I have a couple of questions before I embark on that.

1) Would this "compile from source" approach work?
2) Would the transition to "TF 2.0" cause problems? I don't know the details of the TensorFlow APIs and tensorflow_scala well enough yet to know whether the TF 2.0 changes will cause significant problems for tensorflow_scala.

DirkToewe commented 5 years ago

Hi @langley

  1. Yeah, recompilation from source should work. As far as I know, TF4S only depends on the C API library libtensorflow.so, so you should be able to compile TensorFlow with different versions of CUDA, or even with SYCL or ROCm.
  2. The C API in TF 2.0 is very likely backwards compatible, so TF4S should work with it as well. The TF4S API, however, is going to remain the same. Gradients in eager mode, for example, are not yet supported.

Sadly, compiling TF from source with CUDA involves quite a bit of trial and error. Here are some things I learned the hard way:

jxtps commented 4 years ago

Would it make sense for TF4S to use JavaCPP Presets for TensorFlow?

They have convenient packaging of all the binaries for several platforms (Linux, Mac, Windows), and GPU support is just a Maven include away.

I realize that the C++ TensorFlow API is different from the C API, but they have some instructions for how to create new presets.

nazarblch commented 4 years ago

JavaCPP requires a lot of manual work. In most cases it is difficult to generate suitable Java interfaces. I have used JavaCPP for the PyTorch API; you can find it here: https://github.com/nazarblch/torch-scala