BVLC / caffe

Caffe: a fast open framework for deep learning.
http://caffe.berkeleyvision.org/

Intel MKL FATAL ERROR: Cannot load libmkl_avx.so or libmkl_def.so #3884

Closed linhj184169280 closed 8 years ago

linhj184169280 commented 8 years ago

The BLAS I chose in Makefile.config is ATLAS, and I compiled Caffe with pycaffe.

make test and make runtest both pass, but when I run "import caffe" in Python, I get "Intel MKL FATAL ERROR: Cannot load libmkl_avx.so or libmkl_def.so". What is wrong with my Caffe build?

seanbell commented 8 years ago

From https://github.com/BVLC/caffe/blob/master/CONTRIBUTING.md:

When reporting a bug, it's most helpful to provide the following information, where applicable:

  • What steps reproduce the bug?
  • Can you reproduce the bug using the latest master, compiled with the DEBUG make option?
  • What hardware and operating system/distribution are you running?
  • If the bug is a crash, provide the backtrace (usually printed by Caffe; always obtainable with gdb).

Did you compile pycaffe with MKL? If you compiled with MKL in the past, you should run make clean before recompiling.

pcgreat commented 8 years ago

Are you using Anaconda? The problem might not be related to Caffe. Try

python -c 'import sklearn.linear_model.tests.test_randomized_l1'

If that reproduces the error, the problem is related to Anaconda rather than Caffe. The latest versions of numpy and scipy use MKL by default. If you want to disable that, you can run

conda install nomkl

That solved my problem. I hope it solves yours, too. More details: https://github.com/scikit-learn/scikit-learn/issues/5046 https://www.continuum.io/blog/developer-blog/anaconda-25-release-now-mkl-optimizations
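
Before installing nomkl, it can help to confirm that the interpreter really comes from a conda installation and that MKL libraries are present in it. The following is a minimal sketch, assuming a Linux conda layout in which an environment contains a conda-meta directory and keeps shared libraries under lib/. The helper names is_conda_prefix and find_mkl_libs are illustrative and not part of conda.

```python
import glob
import os
import sys


def is_conda_prefix(prefix):
    """Return True if prefix looks like a conda environment (has conda-meta/)."""
    return os.path.isdir(os.path.join(prefix, "conda-meta"))


def find_mkl_libs(prefix):
    """List MKL shared libraries under <prefix>/lib (assumed Linux layout)."""
    pattern = os.path.join(prefix, "lib", "libmkl*.so*")
    return sorted(os.path.basename(path) for path in glob.glob(pattern))


if __name__ == "__main__":
    # sys.prefix is the installation prefix of the running interpreter.
    print("conda environment:", is_conda_prefix(sys.prefix))
    print("MKL libraries found:", find_mkl_libs(sys.prefix))
```

If the first line prints False, the interpreter is probably not an Anaconda one and the nomkl advice is unlikely to apply.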

jskDr commented 8 years ago

conda install nomkl numpy scipy scikit-learn numexpr
conda remove mkl mkl-service

The answer above should not have to be the solution. If this still happens, Anaconda should make nomkl the default as soon as possible, at least on Ubuntu. What do you guys think?

seanbell commented 8 years ago

Closing due to lack of reply from @linhj184169280, and to clean up the Issues page.

yanirj commented 8 years ago

Hi,

Just wanted to note that Anaconda 4.0.0, shipped with mkl enabled by default, has this issue. The problem is indeed with Anaconda, as it can be reproduced with the python sklearn test suggested above by @pcgreat.

The actual issue is that Anaconda's libraries are linked against MKL but not against libmkl_core.so, which leaves a symbol undefined. This can be seen by running:

$ LD_DEBUG=symbols python -c 'import sklearn.linear_model.tests.test_randomized_l1' 2>&1 | grep -i error
      2200:     /opt/anaconda/lib/python2.7/site-packages/scipy/special/../../../../libmkl_avx.so: error: symbol lookup error: undefined symbol: mkl_dft_fft_fix_twiddle_table_32f (fatal)
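
When the LD_DEBUG output is long, the undefined symbol names can be pulled out programmatically. This is a minimal sketch that only assumes the loader prints the phrase "undefined symbol: <name>" as in the line above; the function name undefined_symbols is illustrative, and the sample line is abbreviated from that output.

```python
import re

# Matches the "undefined symbol: <name>" part of a dynamic-loader error line.
_UNDEFINED = re.compile(r"undefined symbol: (\S+)")


def undefined_symbols(loader_output):
    """Return the distinct undefined symbol names, in order of first appearance."""
    seen = []
    for line in loader_output.splitlines():
        match = _UNDEFINED.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# Sample line, shortened from the LD_DEBUG output shown above.
sample = (
    "2200: /opt/anaconda/lib/libmkl_avx.so: error: symbol lookup error: "
    "undefined symbol: mkl_dft_fft_fix_twiddle_table_32f (fatal)"
)
print(undefined_symbols(sample))
```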

I didn't want to uninstall mkl, as I'd like to have the performance boost, so I found a workaround which worked for me - preload libmkl_core.so before execution.

$ python -c 'import sklearn.linear_model.tests.test_randomized_l1'
Intel MKL FATAL ERROR: Cannot load libmkl_avx.so or libmkl_def.so.
$
$ LD_PRELOAD=/opt/anaconda/lib/libmkl_core.so python -c 'import sklearn.linear_model.tests.test_randomized_l1'
$

Regards, Yanir.
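
The LD_PRELOAD workaround above can also be applied from Python when the failing import has to run in a child process. The following is a sketch under stated assumptions: the helper names preload_env and run_with_preload are hypothetical, and the library path passed in must match your own installation (the /opt/anaconda path is the one from the comment above).

```python
import os
import subprocess
import sys


def preload_env(libraries, base_env=None):
    """Return a copy of the environment with LD_PRELOAD set to the given libraries."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # New libraries go first; anything already preloaded is kept after them.
    parts = list(libraries) + ([existing] if existing else [])
    env["LD_PRELOAD"] = ":".join(parts)
    return env


def run_with_preload(code, libraries):
    """Run `python -c code` in a child process with the libraries preloaded."""
    return subprocess.run(
        [sys.executable, "-c", code],
        env=preload_env(libraries),
        capture_output=True,
        text=True,
    )
```

For example, run_with_preload("import sklearn.linear_model", ["/opt/anaconda/lib/libmkl_core.so"]) returns a CompletedProcess whose stderr can be checked for the MKL error. The variable has to be set before the process starts, which is why the sketch launches a child instead of changing os.environ in the running interpreter.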

jczaja commented 8 years ago

@yanirj To use MKL properly, you need to set up its environment with the provided script, usually something like this:

source /opt/intel/mkl/bin/mklvars.sh intel64

This is just an example: MKL may be installed in a different directory, and the argument should match the target architecture (intel64 in this case). More options are available, but I gave you the most common one. Please try it if you haven't already, and let us know whether it resolves the issue you are seeing.

Regards, Jacek

ibmua commented 8 years ago

Updating via conda install mkl solved it for me. It seems to have updated several modules including mkl, mkl-service and numpy.

ajschumacher commented 8 years ago

Thanks @jskDr! Your solution helped me!

victoriastuart commented 8 years ago

Hello: I found this thread while researching this MKL error, and summarized my answer here (related thread):

https://github.com/ContinuumIO/anaconda-issues/issues/720

TLDR:

conda install  -f  numpy

worked for me;

conda install mkl

did not. :-)

ujsyehao commented 7 years ago

I solved the problem by following this tutorial: https://docs.continuum.io/mkl-optimizations/. The commands are:

1. conda update conda
2. conda update anaconda
3. conda update mkl

melvyniandrag commented 7 years ago

I had this issue with gensim. This worked:

$ bash Anaconda-xxxxxx # script name for the fresh install
$ pip install --upgrade gensim
$ conda install mkl

Strangely, swapping the last two steps does not work.

$ bash Anaconda-xxxxxx # script name for the fresh install
$ conda install mkl
$ pip install --upgrade gensim

svanschalkwyk commented 7 years ago

[SOLVED] http://debugjournal.tumblr.com/post/98401758462/intel-mkl-dynamic-link-library-error

Mottotime commented 7 years ago

Following instructions from @victoriastuart and @ujsyehao , I updated mkl and anaconda. It removed the original error. But there was a new error:

Intel MKL FATAL ERROR: Error on loading function mkl_lapack_ps_mc3_dgetrf_small.

So I removed mkl and installed nomkl following @pcgreat and @jskDr. It works. Thank you all.

urinieto commented 7 years ago

I had the same problem, and it went away after updating Anaconda to the latest version (4.3.0 with Python 3.6).

trickmeyer commented 7 years ago

Just a heads-up for anyone else who ends up here: this error can also be a red herring at times. I recently got a similar error after inadvertently running a script from inside a mounted directory, since behind the scenes it checks the cwd and can't make sense of where things are.

cralonsov commented 7 years ago

Thanks @jskDr! I solved it using your commands, but I didn't need to remove mkl. My problem was importing scikit-image in a conda environment

iakash2604 commented 7 years ago

same here. there was no need to remove mkl. thanks @jskDr for helping me out

wgong commented 7 years ago

conda install nomkl

worked for me

Ashur59 commented 7 years ago

With either conda update mkl or conda install nomkl, I get the following message and am not sure what to do. This is from the first command:

anaconda: 4.4.0-np112py36_0 --> custom-py36_0

What are the downsides of this action if I answer "yes"? Would I need to perform updates differently later on, or run Python code differently afterwards?

hoangcuong2011 commented 7 years ago

Thanks @pcgreat !

pavelkomarov commented 6 years ago

I had this same issue using scikit-learn 0.19 and numpy 1.13.3 when running MLPRegressor (and also with a package called pyearth running an algorithm called MARS). I believe the root of the problem was that our python is part of an Anaconda install, but scikit-learn and numpy were installed via pip, and their expectations for mkl must not agree.
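
One way to test the pip-versus-conda hypothesis is to see which packages conda itself has a record of. This is a rough sketch, assuming conda stores one record per package in <prefix>/conda-meta named <name>-<version>-<build>.json; the function name conda_managed_packages is illustrative. A package that imports fine but is missing from this set was probably installed some other way, for example with pip.

```python
import os


def conda_managed_packages(prefix):
    """Return the names of packages recorded in <prefix>/conda-meta."""
    meta_dir = os.path.join(prefix, "conda-meta")
    names = set()
    if not os.path.isdir(meta_dir):
        return names
    for entry in os.listdir(meta_dir):
        if entry.endswith(".json"):
            # Records are assumed to be named <name>-<version>-<build>.json.
            parts = entry[: -len(".json")].rsplit("-", 2)
            if len(parts) == 3:
                names.add(parts[0])
    return names
```

Calling conda_managed_packages(sys.prefix) (after import sys) and checking for "numpy" and "scikit-learn" gives a first hint. It is not conclusive, since a pip install on top of a conda package leaves the conda record in place.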

Unfortunately my framework is managed by some dedicated company admins, not by me, so I haven't gotten my guy to try recompiling numpy yet. But I was able to find a workaround based on this thread. Adding

export LD_PRELOAD=/path/to/anaconda/lib/libmkl_def.so:/path/to/anaconda/lib/libmkl_avx.so:/path/to/anaconda/lib/libmkl_core.so:/path/to/anaconda/lib/libmkl_intel_lp64.so:/path/to/anaconda/lib/libmkl_intel_thread.so:/path/to/anaconda/lib/libiomp5.so

to my ~/.bashrc causes the problem to disappear. It's super hacky, and I'd be lying if I said I knew exactly what it's doing (but this is helpful), so I'm hoping a recompile of numpy is a cleaner fix. But at least it works.

Flamefire commented 6 years ago

Just some more info: Installing "nomkl" is not a solution! It simply disables MKL, falling back to very slow functions.

Trying

LD_PRELOAD=~/miniconda3/lib/libmkl_avx2.so

resulted in

libmkl_avx2.so: undefined symbol: mkl_sparse_optimize_bsr_trsm_i8

Similarly, LD_PRELOAD=~/miniconda3/lib/libmkl_sequential.so gave

libmkl_sequential.so: undefined symbol: mkl_spblas_ccsr0nd_uc__mmout_seq

Hence we need to find the libraries that export those symbols:

find ~/miniconda3/lib/ -name "libmkl*" -exec nm --print-file -D {} \; | grep mkl_sparse_optimize_bsr_trsm_i8

That got me: libmkl_intel_thread.so, libmkl_sequential.so, libmkl_tbb_thread.so, libmkl_pgi_thread.so, libmkl_gnu_thread.so

So preloading libmkl_sequential.so should solve that, but the other symbol remains. Same approach here:

find ~/miniconda3/lib/ -name "libmkl*" -exec nm --print-file -D {} \; | grep mkl_spblas_ccsr0nd_uc__mmout_seq

which gave me libmkl_core.so

TLDR: this works:

LD_PRELOAD=~/miniconda3/lib/libmkl_core.so:~/miniconda3/lib/libmkl_sequential.so

which is exactly what is written in http://debugjournal.tumblr.com/post/98401758462/intel-mkl-dynamic-link-library-error (Original: https://stackoverflow.com/a/21079900/1930508 from https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/300857#comment-1627042)

Note: libmkl_sequential might not be the best choice for performance. So you could try one of the thread libraries instead.
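
The find/nm/grep search above also matches libraries that merely reference a symbol and symbols whose names only contain the search string. The following sketch filters saved nm output more strictly. It assumes GNU nm output in the "file:address type name" form produced with the file-name prefix option, where undefined references have type U; the function name libraries_defining is illustrative.

```python
def libraries_defining(symbol, nm_output):
    """Return the files whose nm line defines `symbol` (type is not 'U')."""
    found = []
    for line in nm_output.splitlines():
        # Each line is assumed to look like "<file>:<address> <type> <name>";
        # undefined references have no address, only "U <name>".
        path, sep, rest = line.partition(":")
        fields = rest.split()
        if not sep or len(fields) < 2 or fields[-1] != symbol:
            continue
        symbol_type = fields[-2]
        if symbol_type != "U" and path not in found:
            found.append(path)
    return found
```

Feed it the text captured from the find command above (without the grep) and the symbol name reported by the loader. The libraries it returns are the candidates for LD_PRELOAD.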

manchengfenxu commented 6 years ago

@pcgreat, it worked for me. Thanks.

Flamefire commented 6 years ago

Note: conda install nomkl means: "Remove mkl and replace by a (slow) standard version"

AAI-Armughan-Shahid commented 5 years ago

The following worked for me: conda install -f numpy

TheFoxDecoder commented 4 years ago

still not fixed!! :-(

sdahdah commented 4 years ago

The workaround proposed by @Flamefire worked best for me!

SashiDareddy commented 3 years ago

I had a similar issue using Faiss, and this worked. Solution sourced from https://www.programmersought.com/article/10826550193/ and https://blog.csdn.net/qikaihuting/article/details/103526376: add the export line below to your ~/.bashrc file.

In my case (WSL2 Ubuntu on Windows) the Intel MKL libraries were installed at /home/sashi/anaconda3/lib/. Just update the line so that it points to the appropriate folder on your machine.

export LD_PRELOAD=/home/sashi/anaconda3/lib/libmkl_def.so:/home/sashi/anaconda3/lib/libmkl_avx.so:/home/sashi/anaconda3/lib/libmkl_core.so:/home/sashi/anaconda3/lib/libmkl_intel_lp64.so:/home/sashi/anaconda3/lib/libmkl_intel_thread.so:/home/sashi/anaconda3/lib/libiomp5.so
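
Because the line above hard-codes six absolute paths, a stale entry is easy to miss after an Anaconda upgrade or reinstall. This is a small sketch for checking it, assuming every LD_PRELOAD entry is an absolute path as in the line above (bare library names, which the loader searches for itself, would be reported as missing). The function name missing_preload_entries is hypothetical.

```python
import os


def missing_preload_entries(ld_preload):
    """Return the LD_PRELOAD entries that are not existing files."""
    # The loader accepts both colons and spaces as separators.
    entries = [e for e in ld_preload.replace(" ", ":").split(":") if e]
    return [e for e in entries if not os.path.isfile(e)]


if __name__ == "__main__":
    print(missing_preload_entries(os.environ.get("LD_PRELOAD", "")))
```

An empty list means every listed library exists at the given path.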

gorogm commented 3 years ago

For me downgrading mkl solved it: conda install mkl=2021.2.0 (Ubuntu 21.04, python 3.8)