Closed eLvErDe closed 4 years ago
Here's a list of related tickets/commits ending with "disabling openmp":
https://github.com/apache/incubator-mxnet/commit/85215b6176ef3612d198f590268a6595b86565fb https://github.com/apache/incubator-mxnet/issues/10011 https://github.com/apache/incubator-mxnet/issues/10230
Hello Adam, just to clarify: Are you talking about ARM? If yes, please specify which version exactly and how you are trying to compile. There have been some changes recently and we'd like to make sure we are on the same page. Best regards, Marco
Absolutely not, regular X64. Multiple variants being built with/without MKL-DNN and/or CUDA.
From what I saw, the difference between 1.0.0 and 1.1.1 is in the root CMakeLists.txt:
In 1.0.0:
if(EXISTS ${CMAKE_CURRENT_SOURCE_DIR}/openmp/CMakeLists.txt)
# Intel/llvm OpenMP: https://github.com/llvm-mirror/openmp
In 1.1.0:
if(EXISTS ${CMAKE_CURRENT_SOURCE_DIR}/3rdparty/openmp/CMakeLists.txt AND SYSTEM_ARCHITECTURE STREQUAL "x86_64" AND NOT MSVC)
# Intel/llvm OpenMP: https://github.com/llvm-mirror/openmp
The main difference is that this openmp folder DID NOT exist in the 1.0.0 source tree, so the build falls back to the system OpenMP:
else()
if(OPENMP_FOUND)
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${OpenMP_C_FLAGS}")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${OpenMP_CXX_FLAGS}")
set(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} ${OpenMP_EXE_LINKER_FLAGS}")
set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} ${OpenMP_EXE_LINKER_FLAGS}")
Regards, Adam.
@eLvErDe thanks for the question, @sandeep-krishnamurthy requesting this be labeled under Question
Also related: https://github.com/apache/incubator-mxnet/issues/8532
Actually libomp.so should not be a dependency at all. The cmake file builds the bundled version, but I have reason to believe that it is not used at all and that the one provided by the runtime environment is used instead, since building with OpenMP support is more a matter of a compiler flag than of linking a library.
@cjolivier01 said that the bundled version should be faster than libgomp, but I don't know by how much, if it is used at all.
Generally I suggest removing the libomp dependency completely and using only the version provided by the running environment.
@sandeep-krishnamurthy can we get a tag of 'build' on this please
@sandeep-krishnamurthy ping @lebeg any update on whether openmp should or shouldn't be bundled with MXNet?
Hi, Anton is currently on vacation and will be back next week.
Any update on this issue? libomp.so in 3rdparty causes a segfault in the build.ubuntu_gpu docker environment for the CI testing. The cmake build passes all unit tests for gpu but a segfault occurs at the end. Here is the stack trace.
#0 0x00007efd8c17600e in __kmp_dephash_free_entries () from /work/build/3rdparty/openmp/runtime/src/libomp.so
#1 0x00007efd8c1761a1 in __kmp_dephash_free () from /work/build/3rdparty/openmp/runtime/src/libomp.so
#2 0x00007efd8c1394f9 in __kmp_free_implicit_task () from /work/build/3rdparty/openmp/runtime/src/libomp.so
#3 0x00007efd8c121c6f in __kmp_reap_thread(kmp_info*, int) () from /work/build/3rdparty/openmp/runtime/src/libomp.so
#4 0x00007efd8c129915 in __kmp_internal_end_library () from /work/build/3rdparty/openmp/runtime/src/libomp.so
#5 0x00007efd93eb3de7 in _dl_fini () at dl-fini.c:235
#6 0x00007efd938f5ff8 in __run_exit_handlers (status=0, listp=0x7efd93c805f8 <__exit_funcs>,
run_list_atexit=run_list_atexit@entry=true) at exit.c:82
#7 0x00007efd938f6045 in __GI_exit (status=<optimized out>) at exit.c:104
#8 0x000000000060db0f in Py_Exit ()
#9 0x000000000060dbfa in ?? ()
#10 0x000000000060dc66 in PyErr_PrintEx ()
#11 0x000000000060ef29 in PyRun_SimpleFileExFlags ()
#12 0x000000000063fb26 in Py_Main ()
#13 0x00000000004cfeb1 in main ()
If I remove the dependency on libomp.so and just use -fopenmp, the segfault does not occur.
I created a PR https://github.com/apache/incubator-mxnet/pull/12160 to remove the submodule from the build.
How is this issue going now? It seems that the openmp submodule is still there. Another simple way to resolve it might be this: in https://github.com/apache/incubator-mxnet/blob/master/CMakeLists.txt#L432, we could first try to find the OpenMP package provided by the system with find_package and, if it is not found, build it from source. OpenMP is usually installed on Unix-like systems, but may not be available on Windows?
I think we should give the option to choose the openmp implementation in the CMake file.
Hey, is there any update? I have to remove 3rdparty/openmp to make it work. Also, the latest cmake sets OpenMP_FOUND instead of OPENMP_FOUND, so I need to modify mxnet/CMakeLists.txt too.
the openmp_found would be a separate issue. what is the exact issue you’re seeing? does ldd show more than one omp library in your build (ie libomp5, libomp, libgomp)? I’ve heard reports that recent builds are including multiple omp libraries.
recent changes involving mkl have made changes to omp linkages: https://github.com/apache/incubator-mxnet/commit/aa1074dc1704d3732ab205c43d48083ef8c69680#diff-af3b638bc2a3e6c650974192a53c7291
I am unable to get mxnet to build with cmake with CUDA disabled, general cmake build seems to be broken. Failing mkldnn build.
@cjolivier01 Here is my output:
ldd /usr/lib/libmxnet.so|grep omp
libomp.so => /usr/lib/libomp.so (0x00007fb5be6e0000)
libgomp.so.1 => /usr/lib/libgomp.so.1 (0x00007fb5be2af000)
The mxnet build uses 3rdparty/openmp by default. That causes a file conflict with the system's package, so I manually deleted this folder and built successfully with cmake.
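A quick way to check for the situation shown in the ldd output above, where both libomp and libgomp are linked, is to count the distinct OpenMP runtimes. This is a minimal sketch, assuming a POSIX shell with awk; the function names list_omp_runtimes and check_single_omp_runtime are hypothetical helpers and not part of MXNet:

```shell
#!/bin/sh
# List the distinct OpenMP runtimes named in `ldd` output read from stdin.
# Only the three runtimes mentioned in this thread are matched:
# libomp (LLVM), libgomp (GCC) and libiomp5 (Intel).
list_omp_runtimes() {
    awk '$1 ~ /^lib(omp|gomp|iomp5)\.so/ { sub(/\.so.*$/, "", $1); print $1 }' | sort -u
}

# Succeed (exit status 0) only if at most one OpenMP runtime is linked.
check_single_omp_runtime() {
    count=$(list_omp_runtimes | awk 'END { print NR }')
    [ "$count" -le 1 ]
}
```

It could be used as `ldd /usr/lib/libmxnet.so | check_single_omp_runtime || echo "more than one OpenMP runtime linked"`, with the library path adjusted to the local build.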
@cjolivier01 Could you please be more specific about the failing mkldnn build? We removed the iomp5 dependency in the commit, but mkldnn should still work with either gomp or the LLVM omp under 3rdparty/openmp.
/home/coolivie/src/mxnet/src/c_api/../operator/tensor/././../mkl_functions-inl.h:50:5: error: ‘vsLog2’ was not declared in this scope
vs##func(static_cast<MKL_INT>(n), src, dst); \
^~
/home/coolivie/src/mxnet/src/c_api/../operator/tensor/././../mkl_functions-inl.h:50:5: note: in definition of macro ‘MXNET_MKL_UNARY_MATH_FUNC’
vs##func(static_cast<MKL_INT>(n), src, dst); \
^~
/home/coolivie/src/mxnet/src/c_api/../operator/tensor/././../mkl_functions-inl.h:50:5: note: suggested alternative: ‘vsLog1p’
vs##func(static_cast<MKL_INT>(n), src, dst); \
^~
/home/coolivie/src/mxnet/src/c_api/../operator/tensor/././../mkl_functions-inl.h:50:5: note: in definition of macro ‘MXNET_MKL_UNARY_MATH_FUNC’
vs##func(static_cast<MKL_INT>(n), src, dst); \
^~
/home/coolivie/src/mxnet/src/c_api/../operator/tensor/././../mkl_functions-inl.h: In static member function ‘static void mxnet::op::mkl_func::log2::Vectorize(mxnet::index_t, const double*, double*)’:
/home/coolivie/src/mxnet/src/c_api/../operator/tensor/././../mkl_functions-inl.h:53:5: error: ‘vdLog2’ was not declared in this scope
vd##func(static_cast<MKL_INT>(n), src, dst); \
^~
/home/coolivie/src/mxnet/src/c_api/../operator/tensor/././../mkl_functions-inl.h:53:5: note: in definition of macro ‘MXNET_MKL_UNARY_MATH_FUNC’
vd##func(static_cast<MKL_INT>(n), src, dst); \
^~
/home/coolivie/src/mxnet/src/c_api/../operator/tensor/././../mkl_functions-inl.h:53:5: note: suggested alternative: ‘vdLog1p’
vd##func(static_cast<MKL_INT>(n), src, dst); \
^~
/home/coolivie/src/mxnet/src/c_api/../operator/tensor/././../mkl_functions-inl.h:53:5: note: in definition of macro ‘MXNET_MKL_UNARY_MATH_FUNC’
vd##func(static_cast<MKL_INT>(n), src, dst);
I think removing this in CMakeLists.txt may fix it: mxnet_option(USE_MKL_IF_AVAILABLE "Use MKL if found" OFF), but I think that also turns off MKLDNN
I don't get this error when building with Makefile, by the way
Could you please also share your cmake command line?
from mxnet/bld:
cmake -DUSE_CUDA=OFF ..
btw there's a problem with the mkldnn build. It pulls in libgomp always:
[coolivie@alien-51:~/src/mxnet/bld (master)]ldd libmxnet.so | grep omp
libomp.so => /usr/lib/x86_64-linux-gnu/libomp.so (0x00007fe8b9076000)
libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 (0x00007fe8b712f000)
[coolivie@alien-51:~/src/mxnet/bld (master)]ldd /home/coolivie/src/mxnet/bld/3rdparty/mkldnn/src/libmkldnn.so.1
linux-vdso.so.1 (0x00007ffc267f7000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007efe046c9000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007efe0432b000)
libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 (0x00007efe040fc000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007efe03ee4000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007efe03af3000)
/lib64/ld-linux-x86-64.so.2 (0x00007efe059bf000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007efe038ef000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007efe036d0000)
From CMakeCache:
//CXX compiler libraries for OpenMP parallelization
OpenMP_CXX_LIB_NAMES:STRING=gomp;pthread
I can get it to build by commenting out the USE_MKL_IF_AVAILABLE option and then explicitly setting USE_MKLDNN to ON:
#mxnet_option(USE_MKL_IF_AVAILABLE "Use MKL if found" ON)
mxnet_option(USE_MKLDNN "Build with MKL-DNN support" ON)
I am guessing if it’s not failing for you it’s because it can’t find MKL anyway. on my system, I suppose it can find it.
Thank you for the investigation, @cjolivier01. We will fix that.
Regarding omp in mkldnn, we removed iomp5 because of a license issue. We're going to support omp selection through the make/cmake command line.
@cjolivier01 According to https://github.com/apache/incubator-mxnet/blob/master/docs/python_docs/python/tutorials/performance/backend/mkldnn/mkldnn_readme.md#enable-mkl-blas, you could use mklml or other blas libs, such as atlas, openblas.
Cannot reproduce the compile errors. Attached the cmake/make log. Could you please help to check? @cjolivier01 cmake_20191104_143513.txt
btw there's a problem with the mkldnn build. It pulls in libgomp always:
@cjolivier01 It seems that even when MXNet is not built with MKL-DNN, there are two omp runtimes linked in libmxnet.so.
I build MXNet with:
cmake3 .. -DUSE_CUDA=0 -DUSE_LAPACK=0 -DUSE_MKL_IF_AVAILABLE=0 -DUSE_MKLDNN=0
And ldd prints:
[lvtao@Mlt2-clx101 build]$ ldd libmxnet.so | grep omp
libomp.so => /home/lvtao/Workspace/mxnet-official/build/3rdparty/openmp/runtime/src/libomp.so (0x00007f12e5d5d000)
libgomp.so.1 => /lib64/libgomp.so.1 (0x00007f12e510e000)
Ok, I will take a look at that. This is a recent difference because I checked about three months ago and it wasn't pulling them both in.
When will mkldnn optionally pull in libomp?
@cjolivier01 I hope the static linking PR #16731 can help with this. But I need the issue above, with two omp runtimes co-existing, to be fixed first; then I can validate the functionality and performance.
Hi @cjolivier01, may I have your update?
Busy at my day job haven’t had time to mess with it yet. I don’t think static linking is blocked by this since the behavior would probably be the same.
Yes, static linking is not blocked - in fact it was merged just now. I'm pinging just because #16805 is also asking for that. :)
I no longer see it linking gomp with static mkl:
[chriso@chriso-hades:~/src/mxnet (chriso/gomp)]ldd ./cmake-build-debug/libmxnet.so | grep omp
libomp.so => /home/chriso/src/mxnet/cmake-build-debug/3rdparty/openmp/runtime/src/libomp.so (0x00007f0ed791b000)
[chriso@chriso-hades:~/src/mxnet (chriso/gomp)]
Will try without mkl...
I do not see gomp linked without mkl:
[chriso@chriso-hades:~/src/mxnet (chriso/gomp)]ldd ./cmake-build-debug/libmxnet.so | grep omp
libomp.so => /home/chriso/src/mxnet/cmake-build-debug/3rdparty/openmp/runtime/src/libomp.so (0x00007fd52a1af000)
[chriso@chriso-hades:~/src/mxnet (chriso/gomp)]
gcc 6.5 btw
@cjolivier01 Could you please share the build command line for each? @matteosal For your information.
same as before.
In accordance with the ppmc decision, I have cleaned up this conversation.
@cjolivier01 Hi, I also encounter this issue:(
[chenxiny@mlt2-clx093 build]$ ldd libmxnet.so | grep omp
libomp.so => /home/chenxiny/incubator-mxnet/build/3rdparty/openmp/runtime/src/libomp.so (0x00007f2f22f21000)
libgomp.so.1 => /lib64/libgomp.so.1 (0x00007f2f222d2000)
libXcomposite.so.1 => /lib64/libXcomposite.so.1 (0x00007f2f1b34a000)
[chenxiny@mlt2-clx093 build]$ cat CMakeCache.txt | grep gomp
//Install libgomp and libiomp5 library aliases for backwards compatibility
// with libgomp.
OpenMP_CXX_LIB_NAMES:STRING=gomp;pthread
OpenMP_C_LIB_NAMES:STRING=gomp;pthread
//Path to the gomp library for OpenMP
OpenMP_gomp_LIBRARY:FILEPATH=/usr/lib/gcc/x86_64-redhat-linux/4.8.5/libgomp.so
FIND_PACKAGE_MESSAGE_DETAILS_OpenMP_C:INTERNAL=[-fopenmp][/usr/lib/gcc/x86_64-redhat-linux/4.8.5/libgomp.so][/usr/lib64/libpthread.so][/usr/lib/gcc/x86_64-redhat-linux/4.8.5/libgomp.so][/usr/lib64/libpthread.so][v3.1()]
FIND_PACKAGE_MESSAGE_DETAILS_OpenMP_CXX:INTERNAL=[-fopenmp][/usr/lib/gcc/x86_64-redhat-linux/4.8.5/libgomp.so][/usr/lib64/libpthread.so][/usr/lib/gcc/x86_64-redhat-linux/4.8.5/libgomp.so][/usr/lib64/libpthread.so][v3.1()]
//ADVANCED property for variable: OpenMP_gomp_LIBRARY
OpenMP_gomp_LIBRARY-ADVANCED:INTERNAL=1
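The cache query above can be narrowed to just the library names that CMake's FindOpenMP module recorded, which shows at a glance whether gomp was selected. This is a minimal sketch, assuming a POSIX shell with awk; omp_lib_names is a hypothetical helper name:

```shell
#!/bin/sh
# Print the OpenMP library names recorded in a CMake cache file, one per line.
# $1: path to the cache file (defaults to ./CMakeCache.txt).
omp_lib_names() {
    cache=${1:-CMakeCache.txt}
    # Cache entries have the form NAME:TYPE=VALUE; keep only the value,
    # then split the semicolon-separated list into lines.
    awk -F= '/^OpenMP_(C|CXX)_LIB_NAMES:/ { print $2 }' "$cache" | tr ';' '\n' | sort -u
}
```

Run from the build directory as `omp_lib_names CMakeCache.txt`; for the cache shown above it would list gomp and pthread.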
what version of gcc?
oh ok 4.8.5
What version of cmake? This may be a cmake thing.
cmake2, i'll check cmake3 later:)
what’s actual version? i’ll build on my machine
(base) [chenxiny@mlt2-clx093 ~]$ cmake --version
cmake version 2.8.12.2
@cjolivier01 typo, i actually use tao's command line in this issue:
cmake3 .. -DUSE_CUDA=0 -DUSE_LAPACK=0 -DUSE_MKL_IF_AVAILABLE=0 -DUSE_MKLDNN=0
(base) [chenxiny@mlt2-clx093 build]$ cmake3 --version
cmake3 version 3.13.1
Hello,
I've read nearly all the related tickets, which usually end up disabling openmp. That sounds like the worst idea ever.
Is it possible to have an official statement regarding this? mxnet 1.0.0 was building just fine with -DUSE_OPENMP=1 without adding this unresolved library dependency.
Do you know what has been changed? Is that really solved? By which commit, in which release?
Thanks in advance,
Best regards, Adam.