gigasquid / clojure-mxnet

Clojure Package for MXNET

Running on (Arch) Linux #1

Open r0man opened 6 years ago

r0man commented 6 years ago

Hi Carin,

This looks great!

I ran into a couple of issues running the tests on Arch Linux. I guess most of them come from Arch Linux shipping more recent versions of the packages than mxnet expects.

1.) I started with the CPU jar and got this error with Arch Linux's openblas package installed:

INFO  org.apache.mxnet.util.NativeLibraryLoader: Loading libmxnet-scala.so from /lib/native/ copying to mxnet-scala
ERROR org.apache.mxnet.util.NativeLibraryLoader: Couldn't load copied link file: java.lang.UnsatisfiedLinkError: /tmp/mxnet2218065962472922030/mxnet-scala: libopenblas.so.0: cannot open shared object file: No such file or directory
ERROR MXNetJVM: Couldn't find native library mxnet-scala
Exception in thread "main" java.lang.UnsatisfiedLinkError: /tmp/mxnet2218065962472922030/mxnet-scala: libopenblas.so.0: cannot open shared object file: No such file or directory, compiling:(base.clj:4:19)
        at clojure.lang.Compiler$StaticMethodExpr.eval(Compiler.java:1733)
        at clojure.lang.Compiler$DefExpr.eval(Compiler.java:457)
        at clojure.lang.Compiler.eval(Compiler.java:7067)
        at clojure.lang.Compiler.load(Compiler.java:7514)
        at clojure.lang.RT.loadResourceScript(RT.java:379)
        at clojure.lang.RT.loadResourceScript(RT.java:370)
        at clojure.lang.RT.load(RT.java:460)
        at clojure.lang.RT.load(RT.java:426)
        at clojure.core$load$fn__6548.invoke(core.clj:6046)
        at clojure.core$load.invokeStatic(core.clj:6045)
        at clojure.core$load.doInvoke(core.clj:6029)
        at clojure.lang.RestFn.invoke(RestFn.java:408)
        at clojure.core$load_one.invokeStatic(core.clj:5848)
        at clojure.core$load_one.invoke(core.clj:5843)
        at clojure.core$load_lib$fn__6493.invoke(core.clj:5888)
        at clojure.core$load_lib.invokeStatic(core.clj:5887)
        at clojure.core$load_lib.doInvoke(core.clj:5868)
        at clojure.lang.RestFn.applyTo(RestFn.java:142)
        at clojure.core$apply.invokeStatic(core.clj:659)
        at clojure.core$load_libs.invokeStatic(core.clj:5925)
        at clojure.core$load_libs.doInvoke(core.clj:5909)
        at clojure.lang.RestFn.applyTo(RestFn.java:137)
        at clojure.core$apply.invokeStatic(core.clj:659)
        at clojure.core$require.invokeStatic(core.clj:5947)
        at clojure.core$require.doInvoke(core.clj:5947)
        at clojure.lang.RestFn.invoke(RestFn.java:551)
        at org.apache.clojure_mxnet.io$eval2976$loading__6434__auto____2977.invoke(io.clj:1)
        at org.apache.clojure_mxnet.io$eval2976.invokeStatic(io.clj:1)
        at org.apache.clojure_mxnet.io$eval2976.invoke(io.clj:1)
        at clojure.lang.Compiler.eval(Compiler.java:7062)
        at clojure.lang.Compiler.eval(Compiler.java:7051)
        at clojure.lang.Compiler.load(Compiler.java:7514)
        at clojure.lang.RT.loadResourceScript(RT.java:379)
        at clojure.lang.RT.loadResourceScript(RT.java:370)
        at clojure.lang.RT.load(RT.java:460)
        at clojure.lang.RT.load(RT.java:426)
        at clojure.core$load$fn__6548.invoke(core.clj:6046)
        at clojure.core$load.invokeStatic(core.clj:6045)
        at clojure.core$load.doInvoke(core.clj:6029)
        at clojure.lang.RestFn.invoke(RestFn.java:408)
        at clojure.core$load_one.invokeStatic(core.clj:5848)
        at clojure.core$load_one.invoke(core.clj:5843)
        at clojure.core$load_lib$fn__6493.invoke(core.clj:5888)
        at clojure.core$load_lib.invokeStatic(core.clj:5887)
        at clojure.core$load_lib.doInvoke(core.clj:5868)
        at clojure.lang.RestFn.applyTo(RestFn.java:142)
        at clojure.core$apply.invokeStatic(core.clj:659)
        at clojure.core$load_libs.invokeStatic(core.clj:5925)
        at clojure.core$load_libs.doInvoke(core.clj:5909)
        at clojure.lang.RestFn.applyTo(RestFn.java:137)
        at clojure.core$apply.invokeStatic(core.clj:659)
        at clojure.core$require.invokeStatic(core.clj:5947)
        at clojure.core$require.doInvoke(core.clj:5947)
        at clojure.lang.RestFn.invoke(RestFn.java:703)
        at org.apache.clojure_mxnet.conv_test$eval351$loading__6434__auto____352.invoke(conv_test.clj:1)
        at org.apache.clojure_mxnet.conv_test$eval351.invokeStatic(conv_test.clj:1)
        at org.apache.clojure_mxnet.conv_test$eval351.invoke(conv_test.clj:1)
        at clojure.lang.Compiler.eval(Compiler.java:7062)
        at clojure.lang.Compiler.eval(Compiler.java:7051)
        at clojure.lang.Compiler.load(Compiler.java:7514)
        at clojure.lang.RT.loadResourceScript(RT.java:379)
        at clojure.lang.RT.loadResourceScript(RT.java:370)
        at clojure.lang.RT.load(RT.java:460)
        at clojure.lang.RT.load(RT.java:426)
        at clojure.core$load$fn__6548.invoke(core.clj:6046)
        at clojure.core$load.invokeStatic(core.clj:6045)
        at clojure.core$load.doInvoke(core.clj:6029)
        at clojure.lang.RestFn.invoke(RestFn.java:408)
        at clojure.core$load_one.invokeStatic(core.clj:5848)
        at clojure.core$load_one.invoke(core.clj:5843)
        at clojure.core$load_lib$fn__6493.invoke(core.clj:5888)
        at clojure.core$load_lib.invokeStatic(core.clj:5887)
        at clojure.core$load_lib.doInvoke(core.clj:5868)
        at clojure.lang.RestFn.applyTo(RestFn.java:142)
        at clojure.core$apply.invokeStatic(core.clj:659)
        at clojure.core$load_libs.invokeStatic(core.clj:5925)
        at clojure.core$load_libs.doInvoke(core.clj:5909)
        at clojure.lang.RestFn.applyTo(RestFn.java:137)
        at clojure.core$apply.invokeStatic(core.clj:659)
        at clojure.core$require.invokeStatic(core.clj:5947)
        at clojure.core$require.doInvoke(core.clj:5947)
        at clojure.lang.RestFn.applyTo(RestFn.java:137)
        at clojure.core$apply.invokeStatic(core.clj:659)
        at clojure.core$apply.invoke(core.clj:652)
        at user$eval233.invokeStatic(form-init604851786482125357.clj:1)
        at user$eval233.invoke(form-init604851786482125357.clj:1)
        at clojure.lang.Compiler.eval(Compiler.java:7062)
        at clojure.lang.Compiler.eval(Compiler.java:7052)
        at clojure.lang.Compiler.load(Compiler.java:7514)
        at clojure.lang.Compiler.loadFile(Compiler.java:7452)
        at clojure.main$load_script.invokeStatic(main.clj:278)
        at clojure.main$init_opt.invokeStatic(main.clj:280)
        at clojure.main$init_opt.invoke(main.clj:280)
        at clojure.main$initialize.invokeStatic(main.clj:311)
        at clojure.main$null_opt.invokeStatic(main.clj:345)
        at clojure.main$null_opt.invoke(main.clj:342)
        at clojure.main$main.invokeStatic(main.clj:424)
        at clojure.main$main.doInvoke(main.clj:387)
        at clojure.lang.RestFn.applyTo(RestFn.java:137)
        at clojure.lang.Var.applyTo(Var.java:702)
        at clojure.main.main(main.java:37)
Caused by: java.lang.UnsatisfiedLinkError: /tmp/mxnet2218065962472922030/mxnet-scala: libopenblas.so.0: cannot open shared object file: No such file or directory
        at java.lang.ClassLoader$NativeLibrary.load(Native Method)
        at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
        at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
        at java.lang.Runtime.load0(Runtime.java:809)
        at java.lang.System.load(System.java:1086)
        at org.apache.mxnet.util.NativeLibraryLoader$.loadLibraryFromStream(NativeLibraryLoader.scala:140)
        at org.apache.mxnet.util.NativeLibraryLoader$.loadLibrary(NativeLibraryLoader.scala:93)
        at org.apache.mxnet.Base$.<init>(Base.scala:70)
        at org.apache.mxnet.Base$.<clinit>(Base.scala)
        at org.apache.mxnet.Base.MX_REAL_TYPE(Base.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93)
        at clojure.lang.Compiler$StaticMethodExpr.eval(Compiler.java:1726)
        ... 100 more

Installing the openblas-lapack package from AUR fixed this.

yaourt -S openblas-lapack
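
For anyone who hits a similar UnsatisfiedLinkError, here is a minimal Python sketch (not part of clojure-mxnet or MXNet) that pulls the missing library name out of the error message and checks whether the dynamic loader can open it. The helper names `missing_soname` and `loader_can_open` are made up for illustration.

```python
import ctypes
import re

# Matches the "<soname>: cannot open shared object file" part of the message.
_MISSING = re.compile(r"(\S+\.so(?:\.\d+)*): cannot open shared object file")


def missing_soname(message):
    """Return the shared-object name the loader failed to open, or None."""
    match = _MISSING.search(message)
    return match.group(1) if match else None


def loader_can_open(soname):
    """Return True if the dynamic loader can open the given library name."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        return False
    return True


# Error line copied from the log above.
error = (
    "java.lang.UnsatisfiedLinkError: /tmp/mxnet2218065962472922030/mxnet-scala: "
    "libopenblas.so.0: cannot open shared object file: No such file or directory"
)
name = missing_soname(error)
print(name, "loadable:", loader_can_open(name))
```

If the second value printed is False, the exact soname the jar was linked against is not on the loader path, even when a newer version of the same library is installed.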

2.) Next up was an issue with libcurl: the system's libcurl.so.4 did not provide the CURL_OPENSSL_3 symbol version that the native library requires.

Exception in thread "main" java.lang.UnsatisfiedLinkError: /tmp/mxnet8907768921243606817/mxnet-scala: /usr/lib/libcurl.so.4: version `CURL_OPENSSL_3' not found (required by /tmp/mxnet8907768921243606817/mxnet-scala), compiling:(base.clj:4:19)
    at clojure.lang.Compiler$StaticMethodExpr.eval(Compiler.java:1733)
    at clojure.lang.Compiler$DefExpr.eval(Compiler.java:457)
    at clojure.lang.Compiler.eval(Compiler.java:7067)
    at clojure.lang.Compiler.load(Compiler.java:7514)
    at clojure.lang.RT.loadResourceScript(RT.java:379)
    at clojure.lang.RT.loadResourceScript(RT.java:370)
    at clojure.lang.RT.load(RT.java:460)
    at clojure.lang.RT.load(RT.java:426)
    at clojure.core$load$fn__6548.invoke(core.clj:6046)
    at clojure.core$load.invokeStatic(core.clj:6045)
    at clojure.core$load.doInvoke(core.clj:6029)
    at clojure.lang.RestFn.invoke(RestFn.java:408)
    at clojure.core$load_one.invokeStatic(core.clj:5848)
    at clojure.core$load_one.invoke(core.clj:5843)
    at clojure.core$load_lib$fn__6493.invoke(core.clj:5888)
    at clojure.core$load_lib.invokeStatic(core.clj:5887)
    at clojure.core$load_lib.doInvoke(core.clj:5868)
    at clojure.lang.RestFn.applyTo(RestFn.java:142)
    at clojure.core$apply.invokeStatic(core.clj:659)
    at clojure.core$load_libs.invokeStatic(core.clj:5925)
    at clojure.core$load_libs.doInvoke(core.clj:5909)
    at clojure.lang.RestFn.applyTo(RestFn.java:137)
    at clojure.core$apply.invokeStatic(core.clj:659)
    at clojure.core$require.invokeStatic(core.clj:5947)
    at clojure.core$require.doInvoke(core.clj:5947)
    at clojure.lang.RestFn.invoke(RestFn.java:551)
    at org.apache.clojure_mxnet.io$eval2976$loading__6434__auto____2977.invoke(io.clj:1)
    at org.apache.clojure_mxnet.io$eval2976.invokeStatic(io.clj:1)
    at org.apache.clojure_mxnet.io$eval2976.invoke(io.clj:1)
    at clojure.lang.Compiler.eval(Compiler.java:7062)
    at clojure.lang.Compiler.eval(Compiler.java:7051)
    at clojure.lang.Compiler.load(Compiler.java:7514)
    at clojure.lang.RT.loadResourceScript(RT.java:379)
    at clojure.lang.RT.loadResourceScript(RT.java:370)
    at clojure.lang.RT.load(RT.java:460)
    at clojure.lang.RT.load(RT.java:426)
    at clojure.core$load$fn__6548.invoke(core.clj:6046)
    at clojure.core$load.invokeStatic(core.clj:6045)
    at clojure.core$load.doInvoke(core.clj:6029)
    at clojure.lang.RestFn.invoke(RestFn.java:408)
    at clojure.core$load_one.invokeStatic(core.clj:5848)
    at clojure.core$load_one.invoke(core.clj:5843)
    at clojure.core$load_lib$fn__6493.invoke(core.clj:5888)
    at clojure.core$load_lib.invokeStatic(core.clj:5887)
    at clojure.core$load_lib.doInvoke(core.clj:5868)
    at clojure.lang.RestFn.applyTo(RestFn.java:142)
    at clojure.core$apply.invokeStatic(core.clj:659)
    at clojure.core$load_libs.invokeStatic(core.clj:5925)
    at clojure.core$load_libs.doInvoke(core.clj:5909)
    at clojure.lang.RestFn.applyTo(RestFn.java:137)
    at clojure.core$apply.invokeStatic(core.clj:659)
    at clojure.core$require.invokeStatic(core.clj:5947)
    at clojure.core$require.doInvoke(core.clj:5947)
    at clojure.lang.RestFn.invoke(RestFn.java:703)
    at org.apache.clojure_mxnet.conv_test$eval351$loading__6434__auto____352.invoke(conv_test.clj:1)
    at org.apache.clojure_mxnet.conv_test$eval351.invokeStatic(conv_test.clj:1)
    at org.apache.clojure_mxnet.conv_test$eval351.invoke(conv_test.clj:1)
    at clojure.lang.Compiler.eval(Compiler.java:7062)
    at clojure.lang.Compiler.eval(Compiler.java:7051)
    at clojure.lang.Compiler.load(Compiler.java:7514)
    at clojure.lang.RT.loadResourceScript(RT.java:379)
    at clojure.lang.RT.loadResourceScript(RT.java:370)
    at clojure.lang.RT.load(RT.java:460)
    at clojure.lang.RT.load(RT.java:426)
    at clojure.core$load$fn__6548.invoke(core.clj:6046)
    at clojure.core$load.invokeStatic(core.clj:6045)
    at clojure.core$load.doInvoke(core.clj:6029)
    at clojure.lang.RestFn.invoke(RestFn.java:408)
    at clojure.core$load_one.invokeStatic(core.clj:5848)
    at clojure.core$load_one.invoke(core.clj:5843)
    at clojure.core$load_lib$fn__6493.invoke(core.clj:5888)
    at clojure.core$load_lib.invokeStatic(core.clj:5887)
    at clojure.core$load_lib.doInvoke(core.clj:5868)
    at clojure.lang.RestFn.applyTo(RestFn.java:142)
    at clojure.core$apply.invokeStatic(core.clj:659)
    at clojure.core$load_libs.invokeStatic(core.clj:5925)
    at clojure.core$load_libs.doInvoke(core.clj:5909)
    at clojure.lang.RestFn.applyTo(RestFn.java:137)
    at clojure.core$apply.invokeStatic(core.clj:659)
    at clojure.core$require.invokeStatic(core.clj:5947)
    at clojure.core$require.doInvoke(core.clj:5947)
    at clojure.lang.RestFn.applyTo(RestFn.java:137)
    at clojure.core$apply.invokeStatic(core.clj:659)
    at clojure.core$apply.invoke(core.clj:652)
    at user$eval233.invokeStatic(form-init1235312774279813685.clj:1)
    at user$eval233.invoke(form-init1235312774279813685.clj:1)
    at clojure.lang.Compiler.eval(Compiler.java:7062)
    at clojure.lang.Compiler.eval(Compiler.java:7052)
    at clojure.lang.Compiler.load(Compiler.java:7514)
    at clojure.lang.Compiler.loadFile(Compiler.java:7452)
    at clojure.main$load_script.invokeStatic(main.clj:278)
    at clojure.main$init_opt.invokeStatic(main.clj:280)
    at clojure.main$init_opt.invoke(main.clj:280)
    at clojure.main$initialize.invokeStatic(main.clj:311)
    at clojure.main$null_opt.invokeStatic(main.clj:345)
    at clojure.main$null_opt.invoke(main.clj:342)
    at clojure.main$main.invokeStatic(main.clj:424)
    at clojure.main$main.doInvoke(main.clj:387)
    at clojure.lang.RestFn.applyTo(RestFn.java:137)
    at clojure.lang.Var.applyTo(Var.java:702)
    at clojure.main.main(main.java:37)
Caused by: java.lang.UnsatisfiedLinkError: /tmp/mxnet8907768921243606817/mxnet-scala: /usr/lib/libcurl.so.4: version `CURL_OPENSSL_3' not found (required by /tmp/mxnet8907768921243606817/mxnet-scala)
    at java.lang.ClassLoader$NativeLibrary.load(Native Method)
    at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
    at java.lang.Runtime.load0(Runtime.java:809)
    at java.lang.System.load(System.java:1086)
    at org.apache.mxnet.util.NativeLibraryLoader$.loadLibraryFromStream(NativeLibraryLoader.scala:140)
    at org.apache.mxnet.util.NativeLibraryLoader$.loadLibrary(NativeLibraryLoader.scala:93)
    at org.apache.mxnet.Base$.<init>(Base.scala:70)
    at org.apache.mxnet.Base$.<clinit>(Base.scala)
    at org.apache.mxnet.Base.MX_REAL_TYPE(Base.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93)
    at clojure.lang.Compiler$StaticMethodExpr.eval(Compiler.java:1726)
    ... 100 more

Installing libcurl-compat and preloading libcurl.so.3 via LD_PRELOAD fixed this.

yaourt -S libcurl-compat
export LD_PRELOAD=libcurl.so.3
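
Note that `export LD_PRELOAD=...` applies to every program started from that shell afterwards. A possible alternative, sketched in Python, is to set the variable only for the test process. This assumes the tests are started with `lein test`, which the thread does not actually state; `preload_env` and `run_with_preload` are hypothetical helper names.

```python
import os
import subprocess


def preload_env(base_env, library):
    """Return a copy of base_env with library placed first in LD_PRELOAD."""
    env = dict(base_env)
    existing = env.get("LD_PRELOAD", "")
    # ld.so accepts a colon-separated list, so keep anything already preloaded.
    env["LD_PRELOAD"] = f"{library}:{existing}" if existing else library
    return env


def run_with_preload(command, library="libcurl.so.3"):
    """Run command with library preloaded for that one process only."""
    return subprocess.run(command, env=preload_env(os.environ, library))


# Assumed test command; adjust to however you run the tests:
# run_with_preload(["lein", "test"])
```

Building the environment in a separate function keeps the calling shell untouched and makes it easy to check what the child process will see.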

After this the tests passed with the CPU jar. :)

3.) Now for the GPU jar. Arch Linux ships with CUDA 9.2. Downgrading the cuda package to 9.0 made the tests pass with the GPU jar as well.

wget https://archive.archlinux.org/packages/c/cuda/cuda-9.0.176-4-x86_64.pkg.tar.xz
sudo pacman -U cuda-9.0.176-4-x86_64.pkg.tar.xz

I guess those dependency issues are mostly specific to Arch Linux, or need to be fixed in mxnet itself.

I tried playing around with your clojure-package before, but had a hard time getting mxnet installed. So I thought I'd post this here; maybe these instructions can save other people some time.

Now I'm looking forward to starting a REPL ...

Greetings, r0man.

gigasquid commented 6 years ago

Thanks so much for testing and documenting this @r0man. It's very useful and I'm sure will help other people. 😸

gigasquid commented 6 years ago

@r0man I added you in the Special Thanks section of the README. If you have any edits/changes - please let me know 😸

r0man commented 6 years ago

@gigasquid Thanks, everything good! :)

extremenelson commented 6 years ago

I use Arch Linux too and had to use libcurl-compat to get everything working. But I am using the latest CUDA version (community/cuda 9.2.148-1) and the tests are passing so far.

tribbloid commented 5 years ago

I have the same error on Manjaro with mxnet-scala (they should share the same low-level JVM API implementation). Plus, once I switched to GPU (CUDA 10.0), I got a similar error instead:

Exception in thread "main" java.lang.UnsatisfiedLinkError: /tmp/mxnet6582312094578976653/mxnet-scala: libcudart.so.9.0: cannot open shared object file: No such file or directory
    at java.lang.ClassLoader$NativeLibrary.load(Native Method)
    at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
    at java.lang.Runtime.load0(Runtime.java:809)
    at java.lang.System.load(System.java:1086)
    at org.apache.mxnet.util.NativeLibraryLoader$.loadLibraryFromStream(NativeLibraryLoader.scala:140)
    at org.apache.mxnet.util.NativeLibraryLoader$.loadLibrary(NativeLibraryLoader.scala:93)
    at org.apache.mxnet.Base$.<init>(Base.scala:70)
    at org.apache.mxnet.Base$.<clinit>(Base.scala)
    at org.apache.mxnet.NDArray$.initNDArrayModule(NDArray.scala:159)
    at org.apache.mxnet.NDArray$.<init>(NDArray.scala:39)
    at org.apache.mxnet.NDArray$.<clinit>(NDArray.scala)
...

tribbloid commented 5 years ago

I haven't tested py-mxnet, as mxnet-cu100 hasn't been released. But all other DNN libraries (pytorch, tensorflow) have no problem on the same platform.

These are my library files:

openblas:

/usr/include/f77blas.h
/usr/include/openblas_config.h
/usr/lib/cmake/openblas/OpenBLASConfig.cmake
/usr/lib/cmake/openblas/OpenBLASConfigVersion.cmake
/usr/lib/libblas.so
/usr/lib/libblas.so.3
/usr/lib/libopenblas.so
/usr/lib/libopenblas.so.3
/usr/lib/libopenblasp-r0.3.3.so
/usr/lib/pkgconfig/openblas.pc
/usr/share/licenses/openblas/LICENSE

cuda:

...

/opt/cuda/lib64/libaccinj64.so.10.0.130
/opt/cuda/lib64/libcublas.so
/opt/cuda/lib64/libcublas.so.10.0
/opt/cuda/lib64/libcublas.so.10.0.130
/opt/cuda/lib64/libcublas_static.a
/opt/cuda/lib64/libcudadevrt.a
/opt/cuda/lib64/libcudart.so
/opt/cuda/lib64/libcudart.so.10.0
/opt/cuda/lib64/libcudart.so.10.0.130
/opt/cuda/lib64/libcudart_static.a
/opt/cuda/lib64/libcufft.so
/opt/cuda/lib64/libcufft.so.10.0
/opt/cuda/lib64/libcufft.so.10.0.145
/opt/cuda/lib64/libcufft_static.a
/opt/cuda/lib64/libcufft_static_nocallback.a
...
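
Comparing these listings with the errors in this thread: the jars ask for libopenblas.so.0 and libcudart.so.9.0, while the files above are libopenblas.so.3 and libcudart.so.10.0. A small Python sketch of that comparison follows; the function names are hypothetical and the path list is just a subset copied from the listing.

```python
import os


def provides(installed_paths, required_soname):
    """Return True if a file with exactly the required soname is installed."""
    return required_soname in {os.path.basename(p) for p in installed_paths}


def installed_versions(installed_paths, required_soname):
    """List installed versioned files of the same library, whatever the version."""
    stem = required_soname.split(".so")[0] + ".so."
    names = {os.path.basename(p) for p in installed_paths}
    return sorted(n for n in names if n.startswith(stem))


# Subset of the files listed above.
installed = [
    "/usr/lib/libopenblas.so",
    "/usr/lib/libopenblas.so.3",
    "/opt/cuda/lib64/libcudart.so",
    "/opt/cuda/lib64/libcudart.so.10.0",
    "/opt/cuda/lib64/libcudart.so.10.0.130",
    "/opt/cuda/lib64/libcudart_static.a",
]

for required in ("libopenblas.so.0", "libcudart.so.9.0"):
    print(required, provides(installed, required), installed_versions(installed, required))
```

An exact-name check like this mirrors what the loader does: a newer major version of a library does not satisfy a request for an older soname.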