uncomplicate / neanderthal

Fast Clojure Matrix Library
http://neanderthal.uncomplicate.org
Eclipse Public License 1.0

Hello World run with MKL libraries and $LD_LIBRARY_PATH #57

Closed LazerJesus closed 5 years ago

LazerJesus commented 5 years ago

I copied the Hello World example, but when executing it with a REPL it throws an error.

project.clj

(defproject basic "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.10.0"]
                 [uncomplicate/neanderthal "0.22.1"]]
  :exclusions [[org.jcuda/jcuda-natives :classifier "apple-x86_64"]
               [org.jcuda/jcublas-natives :classifier "apple-x86_64"]])

core.clj

(ns basic.core
  (:use [uncomplicate.neanderthal core native]))

(dv 1 2 3)

repl:

Error printing return value (NoClassDefFoundError) at uncomplicate.neanderthal.internal.host.mkl.DoubleVectorEngine/iamax (mkl.clj:395).
Could not initialize class uncomplicate.neanderthal.internal.host.CBLAS

system:

$ lein version
Leiningen 2.9.1 on Java 11.0.2 OpenJDK 64-Bit Server VM

LazerJesus commented 5 years ago

ps. clojure newbie here ;)

blueberry commented 5 years ago

Hi FinnFrotscher. Since you haven't pasted the whole stack trace, I cannot be sure what the problem is, but it seems to me that you haven't installed Intel MKL. Please follow the getting started guide. If you are an absolute beginner, it is safer to clone the whole project than to copy and paste it.

LazerJesus commented 5 years ago

hi, thank you for the fast answer.

this is the entire log as it is pushed to the repl buffer.

Error printing return value (NoClassDefFoundError) at uncomplicate.neanderthal.internal.host.mkl.DoubleVectorEngine/iamax (mkl.clj:395).
Could not initialize class uncomplicate.neanderthal.internal.host.CBLAS

The cider-error stacktrace:

1. Caused by java.lang.NoClassDefFoundError
   Could not initialize class uncomplicate.neanderthal.internal.host.CBLAS

                   mkl.clj:  395  uncomplicate.neanderthal.internal.host.mkl.DoubleVectorEngine/iamax
                   mkl.clj:  419  uncomplicate.neanderthal.internal.host.mkl.DoubleVectorEngine/amax
              printing.clj:  103  uncomplicate.neanderthal.internal.printing/eval18726/print-vector
          buffer_block.clj:  711  uncomplicate.neanderthal.internal.host.buffer-block/eval19637/fn
              MultiFn.java:  234  clojure.lang.MultiFn/invoke
                pprint.clj:   40  cider.nrepl.pprint/pr/fn
                  AFn.java:  152  clojure.lang.AFn/applyToHelper
                  AFn.java:  144  clojure.lang.AFn/applyTo
                  core.clj:  665  clojure.core/apply
                  core.clj: 1973  clojure.core/with-bindings*
                  core.clj: 1973  clojure.core/with-bindings*
               RestFn.java:  425  clojure.lang.RestFn/invoke
                pprint.clj:   37  cider.nrepl.pprint/pr
                pprint.clj:   29  cider.nrepl.pprint/pr
                  Var.java:  393  clojure.lang.Var/invoke
                 print.clj:  224  nrepl.middleware.print/wrap-print/fn/print
                 print.clj:  148  nrepl.middleware.print/send-nonstreamed/print-key/fn
                 print.clj:  147  nrepl.middleware.print/send-nonstreamed/print-key
                  core.clj: 2742  clojure.core/map/fn/fn
             protocols.clj:   49  clojure.core.protocols/iter-reduce
             protocols.clj:   75  clojure.core.protocols/fn
             protocols.clj:   75  clojure.core.protocols/fn
             protocols.clj:   13  clojure.core.protocols/fn/G
                  core.clj: 6884  clojure.core/transduce
                  core.clj: 6870  clojure.core/transduce
                 print.clj:  156  nrepl.middleware.print/send-nonstreamed
                 print.clj:  138  nrepl.middleware.print/send-nonstreamed
                 print.clj:  174  nrepl.middleware.print/printing-transport/reify
                caught.clj:   58  nrepl.middleware.caught/caught-transport/reify
           track_state.clj:  214  cider.nrepl.middleware.track-state/make-transport/reify
    interruptible_eval.clj:  114  nrepl.middleware.interruptible-eval/evaluate/fn
                  main.clj:  419  clojure.main/repl/read-eval-print
                  main.clj:  435  clojure.main/repl/fn
                  main.clj:  435  clojure.main/repl
                  main.clj:  345  clojure.main/repl
               RestFn.java: 1523  clojure.lang.RestFn/invoke
    interruptible_eval.clj:   79  nrepl.middleware.interruptible-eval/evaluate
    interruptible_eval.clj:   55  nrepl.middleware.interruptible-eval/evaluate
    interruptible_eval.clj:  142  nrepl.middleware.interruptible-eval/interruptible-eval/fn/fn
                  AFn.java:   22  clojure.lang.AFn/run
               session.clj:  171  nrepl.middleware.session/session-exec/main-loop/fn
               session.clj:  170  nrepl.middleware.session/session-exec/main-loop
                  AFn.java:   22  clojure.lang.AFn/run
               Thread.java:  834  java.lang.Thread/run

I added :jvm-opts ["--add-opens=java.base/jdk.internal.ref=ALL-UNNAMED"] to project.clj

MKL was installed using this guide. I realise CBLAS is part of the MKL library, so a problem with my MKL installation is the most likely cause. How can I verify that?
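One way to narrow this down, sketched under assumptions: "Could not initialize class" is what the JVM reports on later attempts to use a class whose static initializer already failed, so the first failure in a fresh REPL typically carries the underlying cause (for a native library problem, often an UnsatisfiedLinkError). The helpers `root-cause` and `try-load` below are hypothetical names, not part of Neanderthal; the class name to pass in is the one from the error message.

```clojure
;; Hypothetical helpers for inspecting why a class fails to load.

(defn root-cause
  "Walks the cause chain of a Throwable and returns the innermost one."
  [^Throwable t]
  (if-let [c (.getCause t)]
    (recur c)
    t))

(defn try-load
  "Returns :loaded if the class can be loaded and initialized,
  otherwise the innermost Throwable explaining the failure."
  [^String class-name]
  (try
    (Class/forName class-name)
    :loaded
    (catch Throwable t
      (root-cause t))))

;; A class that always exists, to show the success case.
(println (try-load "java.lang.String"))
```

In a freshly started REPL for the project, calling `(try-load "uncomplicate.neanderthal.internal.host.CBLAS")` before evaluating anything else should return either `:loaded` or the first error, which is usually more informative than the NoClassDefFoundError shown by the printer.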

LazerJesus commented 5 years ago

.bashrc contains source /opt/intel/mkl/bin/mklvars.sh intel64 and

$ echo $MKLROOT
> /opt/intel/compilers_and_libraries_2019.3.199/linux/mkl
$ echo $LD_LIBRARY_PATH
> /opt/intel/compilers_and_libraries_2019.3.199[...]

blueberry commented 5 years ago

Clone the Neanderthal repository, go to the hello world subdirectory, start the REPL with lein repl (but don't forget to add these :jvm-opts), load the hello world namespace, and evaluate the example code in the REPL. If it doesn't work, MKL is not set up properly. I don't use Debian, so I can't say anything about their package; you can also download the MKL installer from Intel, which is more likely to work. To use Neanderthal, you don't even need to have MKLROOT set: it is enough to have the *.so files on the path.

LazerJesus commented 5 years ago

I figured it out. The issue was with the environment variable: LD_LIBRARY_PATH needed to include the intel64 architecture directory, because that's how the /opt/intel directory happened to be laid out. All the *.so files are located in the [...]/linux/mkl/lib/intel64 directory.

To test if the environment variable is accessible in the repl, run (System/getenv "LD_LIBRARY_PATH")
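That check only confirms that the variable reaches the JVM. As a rough sketch of a follow-up check, the snippet below lists which directories in a path-style variable actually contain a given library file. The helper names `path-entries` and `dirs-containing` are hypothetical, and `libmkl_rt.so` is an assumed file name; substitute whichever *.so file your installation provides.

```clojure
(require '[clojure.string :as str]
         '[clojure.java.io :as io])

(defn path-entries
  "Splits a path-style string on the platform path separator,
  dropping blank entries. Returns an empty collection for nil."
  [path-string]
  (if (str/blank? path-string)
    []
    (remove str/blank?
            (str/split path-string
                       (re-pattern (java.util.regex.Pattern/quote
                                    java.io.File/pathSeparator))))))

(defn dirs-containing
  "Returns the entries of path-string that contain a regular file named lib-name."
  [path-string lib-name]
  (filterv #(.isFile (io/file % lib-name))
           (path-entries path-string)))

;; Assumed library name; an empty vector means no listed directory contains it.
(println (dirs-containing (System/getenv "LD_LIBRARY_PATH") "libmkl_rt.so"))
```

An empty result with a non-empty LD_LIBRARY_PATH would point to the situation described above, where the variable is set but does not reach down to the directory holding the libraries.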

FIX Guide: Either have the env var set in bash (.bashrc or manually):

$ export LD_LIBRARY_PATH=/opt/intel/compilers_and_libraries/linux/mkl/lib/intel64:/opt/intel/lib

or set it inside project.clj:

  :plugins [[lein-with-env-vars "0.1.0"]]
  :env-vars {:LD_LIBRARY_PATH "/opt/intel/compilers_and_libraries/linux/mkl/lib/intel64:/opt/intel/lib"})

In this case, it's important to start the REPL via $ lein with-env-vars repl

I hope this saves someone a day of effort :)

kwccoin commented 5 years ago

I put my comment on the other, closed issue, and maybe I should comment on this one as well:

I still have this problem even when I use these lines:

user=> (with-release [x (dv 0.3 0.9)
  #_=>                w1 (dge 4 2 [0.3 0.6
  #_=>                             0.1 2.0
  #_=>                             0.9 3.7
  #_=>                             0.0 1.0]
  #_=>                        {:layout :row})
  #_=>                h1 (dv 4)]
  #_=>   (println (mv! w1 x h1)))
#RealBlockVector[double, n:4, offset: 0, stride:1]
[   0.63    1.83    3.60    0.90 ]

Execution error (IllegalAccessError) at uncomplicate.commons.core/eval1628$fn (core.clj:69).
class uncomplicate.commons.core$eval1628$fn__1629 (in unnamed module @0x7edf5e87) cannot access class jdk.internal.ref.Cleaner (in module java.base) because module java.base does not export jdk.internal.ref to unnamed module @0x7edf5e87

project.clj

(defproject hello-world "0.23.1"
  :license {:name "Eclipse Public License"
            :url "http://www.eclipse.org/legal/epl-v10.html"}
  :dependencies [[org.clojure/clojure "1.10.0"]
                 [uncomplicate/neanderthal "0.23.1"]]
  :exclusions [[org.jcuda/jcuda-natives :classifier "apple-x86_64"]
               [org.jcuda/jcublas-natives :classifier "apple-x86_64"]]
  :jvm-opts ^:replace ["-Dclojure.compiler.direct-linking=true" 
                       #_"--add-opens=java.base/jdk.internal.ref=ALL-UNNAMED"]
)
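A note on the :jvm-opts entry above: `#_` is the Clojure reader's discard macro, so the form that follows it is dropped when the file is read. The IllegalAccessError names the same jdk.internal.ref package as the discarded `--add-opens` option, so it may be worth trying the project with the `#_` removed. The sketch below only demonstrates the reader behaviour; the var names are hypothetical.

```clojure
;; The :jvm-opts vector as written in the project file, read by the Clojure reader.
(def opts-as-written
  (read-string
   "[\"-Dclojure.compiler.direct-linking=true\" #_\"--add-opens=java.base/jdk.internal.ref=ALL-UNNAMED\"]"))

;; The same vector with the #_ removed.
(def opts-with-flag
  (read-string
   "[\"-Dclojure.compiler.direct-linking=true\" \"--add-opens=java.base/jdk.internal.ref=ALL-UNNAMED\"]"))

(println (count opts-as-written)) ; prints 1: the --add-opens string was discarded
(println (count opts-with-flag))  ; prints 2: both options are present
```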

BTW, I tried to add the PATH environment variables to the lein environment, but it does not work:

lein with-env-vars repl
'with-env-vars' is not a task. See 'lein help'.

I suspect it is not related, as I can see that all those directories are already in the Windows system PATH (unless one must use a library path variable; if so, the with-env-vars failure above prevents me from adding them there).

(System/getenv "PATH")
"C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2019\\windows\\redist\\intel64_win\\tbb\\vc_mt;C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2019\\windows\\redist\\intel64_win\\mkl;C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2019\\windows\\redist\\intel64_win\\compiler;C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2019\\windows\\redist\\intel64_win\\tbb\\vc_mt;C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2019\\windows\\redist\\intel64_win\\mkl;C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2019\\windows\\redist\\intel64_win\\compiler;C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2019\\windows\\redist\\intel64_win\\tbb\\vc_mt;C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2019\\windows\\redist\\intel64_win\\mkl;C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2019\\windows\\redist\\intel64_win\\compiler;C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2019\\windows\\redist\\intel64_win\\tbb\\vc_mt;C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2019\\windows\\redist\\intel64_win\\mkl;C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2019\\windows\\redist\\intel64_win\\compiler;C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2019\\windows\\redist\\intel64_win\\tbb\\vc_mt;C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2019\\windows\\redist\\intel64_win\\mkl;C:\\Program Files (x86)\\IntelSWTools\\compilers_and_libraries_2019\\windows\\redist\\intel64_win\\compiler;C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v10.0\\bin; ....

After I removed the user local temp, the matrix calculation does not run at all, so it is even worse:

user=> (with-release [x (dv 0.3 0.9)
  #_=>                w1 (dge 4 2 [0.3 0.6
  #_=>                             0.1 2.0
  #_=>                             0.9 3.7
  #_=>                             0.0 1.0]
  #_=>                        {:layout :row})
  #_=>                h1 (dv 4)]
  #_=>   (println  w1 x h1))
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Execution error (IllegalAccessError) at uncomplicate.commons.core/eval1620$fn (core.clj:69).
class uncomplicate.commons.core$eval1620$fn__1621 (in unnamed module @0x52bc89ad) cannot access class jdk.internal.ref.Cleaner (in module java.base) because module java.base does not export jdk.internal.ref to unnamed module @0x52bc89ad

#RealGEMatrix[double, mxn:4x2, layout:row, offset:0]user=>

I further set both LD_LIBRARY_PATH and DYLD_LIBRARY_PATH to %PATH% and ensured that they are seen by the system using (System/getenv "LD_LIBRARY_PATH") and (System/getenv "DYLD_LIBRARY_PATH"); still not OK and ...
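For context on the variables mentioned here: LD_LIBRARY_PATH is read by the dynamic loader on Linux and DYLD_LIBRARY_PATH on macOS, while Windows searches the directories in PATH for DLLs. The sketch below reports which variable applies to the running JVM. `library-path-var` is a hypothetical helper, not part of Neanderthal or Leiningen, and treating every system that is neither Windows nor macOS as Linux-like is a simplifying assumption.

```clojure
(require '[clojure.string :as str])

(defn library-path-var
  "Name of the environment variable the OS loader consults for shared
  libraries, chosen from the value of the os.name system property."
  [os-name]
  (let [os (str/lower-case (str os-name))]
    (cond
      (or (str/includes? os "mac") (str/includes? os "darwin")) "DYLD_LIBRARY_PATH"
      (str/starts-with? os "windows") "PATH"
      :else "LD_LIBRARY_PATH"))) ; assumption: everything else behaves like Linux

(let [os       (System/getProperty "os.name")
      var-name (library-path-var os)]
  (println os "->" var-name)
  ;; Value of the relevant variable as seen by this JVM (nil if unset).
  (println (System/getenv var-name))
  ;; Directories the JVM itself searches when loading native libraries.
  (println (System/getProperty "java.library.path")))
```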