Closed apeforest closed 4 years ago
@PatricZhao @TaoLv I would appreciate it if you guys could provide input on this problem. I have recently been working on a PR, https://github.com/apache/incubator-mxnet/pull/16735, that requires me to build with MKL BLAS and MKLDNN. However, I have not found MKL a good experience to use. Please see the other issue I opened: https://github.com/apache/incubator-mxnet/issues/13881.
Since our man page encourages users to build mxnet with MKL blas, it would be better if they could use this feature seamlessly.
Have you tried the suggestion in the error message, to export KMP_DUPLICATE_LIB_OK=TRUE before running the test? This is also the interoperability problem between different OpenMP runtimes that I mentioned in https://github.com/apache/incubator-mxnet/issues/16891.
export KMP_DUPLICATE_LIB_OK=TRUE works. Closing this issue.
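For anyone who prefers not to export the variable for the whole shell session, here is a minimal Python sketch that sets it for a single child process only. The helper names `env_allowing_duplicate_openmp` and `run_with_workaround` are made up for illustration and are not part of MXNet. As the OMP error message itself warns, this setting is an unsafe workaround rather than a fix.

```python
import os
import subprocess


def env_allowing_duplicate_openmp(base_env=None):
    """Return a copy of the environment with KMP_DUPLICATE_LIB_OK=TRUE set.

    The input mapping (or os.environ) is left unmodified, so the
    workaround applies only to processes started with the returned copy.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["KMP_DUPLICATE_LIB_OK"] = "TRUE"
    return env


def run_with_workaround(command):
    """Run `command` (a list of arguments) with the workaround enabled.

    Returns the exit code of the child process.
    """
    completed = subprocess.run(command, env=env_allowing_duplicate_openmp())
    return completed.returncode
```

Usage would look like `run_with_workaround(["nosetests", "test_mkldnn.py"])`, where the command is only a placeholder for however you invoke the failing test.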
Description
I built MXNet with MKL BLAS and MKLDNN. The build was successful; however, when I ran the unit tests, I got the following core dump. Note: I am using the master branch, so there are no local changes from me.
Error Message
test_mkldnn.test_convolution ... [23:12:00] ../src/executor/graph_executor.cc:2062: Subgraph backend MKLDNN is activated.
[23:12:00] ../src/executor/../operator/../common/utils.h:472:
Storage type fallback detected:
operator = Convolution
input storage types = [row_sparse, row_sparse, row_sparse, ]
output storage types = [default, ]
params = {"num_filter" : 4, "kernel" : (3,), "stride" : 2, }
context.dev_mask = cpu
The operator with default storage type will be dispatched for execution. You're seeing this warning message because the operator above is unable to process the given ndarrays with specified storage types, context and parameter. Temporary dense ndarrays are generated in order to execute the operator. This does not affect the correctness of the programme. You can set environment variable MXNET_STORAGE_FALLBACK_LOG_VERBOSE to 0 to suppress this warning.
[23:12:00] ../src/executor/../operator/../common/utils.h:472:
Storage type fallback detected:
operator = _backward_Convolution
input storage types = [default, row_sparse, row_sparse, row_sparse, ]
output storage types = [default, default, default, ]
params = {"num_filter" : 4, "kernel" : (3,), "stride" : 2, }
context.dev_mask = cpu
The operator with default storage type will be dispatched for execution. You're seeing this warning message because the operator above is unable to process the given ndarrays with specified storage types, context and parameter. Temporary dense ndarrays are generated in order to execute the operator. This does not affect the correctness of the programme. You can set environment variable MXNET_STORAGE_FALLBACK_LOG_VERBOSE to 0 to suppress this warning.
OMP: Error #15: Initializing libiomp5.so, but found libomp.so already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
Aborted (core dumped)
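The OMP error above says that libiomp5.so and libomp.so ended up in the same process. As a rough diagnostic, the Python sketch below lists which OpenMP runtimes are mapped into a process by parsing text in the format of Linux's /proc/<pid>/maps. The function names are hypothetical, the list of runtime names (libiomp5, libomp, libgomp) is an assumption about what is typically seen, and the approach is Linux-only.

```python
import os
import re

# Assumed base names: Intel (libiomp5), LLVM (libomp), GNU (libgomp),
# optionally with a hash suffix and a version after ".so".
_OMP_NAME = re.compile(r"^(libiomp5|libomp|libgomp)(-[0-9a-f]+)?\.so(\.\d+)*$")


def find_openmp_runtimes(maps_text):
    """Return sorted, de-duplicated paths of OpenMP runtimes named in maps_text.

    maps_text is expected to follow the /proc/<pid>/maps layout:
    address perms offset dev inode pathname
    """
    found = set()
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping without a path
        path = fields[5].strip()
        if _OMP_NAME.match(os.path.basename(path)):
            found.add(path)
    return sorted(found)


def loaded_openmp_runtimes():
    """Inspect the current process (Linux only)."""
    with open("/proc/self/maps") as maps_file:
        return find_openmp_runtimes(maps_file.read())
```

More than one entry in the result would point at the situation the error message describes. Note that a library only shows up after it has been loaded, so `loaded_openmp_runtimes()` would need to be called after the relevant imports.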
To Reproduce
1.
2.
3.
Environment
We recommend using our script for collecting the diagnostic information. Run the following command and paste the outputs below: