easybuilders / easybuild-easyconfigs

A collection of easyconfig files that describe which software to build using which build options with EasyBuild.
https://easybuild.io
GNU General Public License v2.0
378 stars, 701 forks

Build problem with mkl-dnn #6830

Open verdurin opened 6 years ago

verdurin commented 6 years ago

Trying to build mkl-dnn/0.13-intel-2018a as a requirement of PyTorch:

[ 18%] Building CXX object src/CMakeFiles/mkldnn.dir/cpu/jit_avx2_convolution.cpp.o
cd /dev/shm/mkldnn/0.13/intel-2018a/easybuild_obj/src && /mgmt/modules/eb/software/icc/2018.1.163-GCC-6.4.0-2.28/compilers_and_libraries_2018.1.163/linux/bin/intel64/icpc  -DMKLDNN_DLL -DMKLDNN_DLL_EXPORTS -DUSE_CBLAS -DUSE_MKL -D__STDC_CONSTANT_MACROS -D__STDC_LIMIT_MACROS -Dmkldnn_EXPORTS -I/mgmt/modules/eb/software/imkl/2018.1.163-iimpi-2018a/mkl/include -I/dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/include -I/dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src -I/dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/common -I/dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/xbyak  -O2 -xHost -ftz -fp-speculation=safe -fp-model source -std=c++11 -fvisibility-inlines-hidden -diag-disable:15552  -Wall -Werror -Wno-unknown-pragmas -fvisibility=internal -xHOST -qopenmp -fstack-protector -fPIC -Wformat -Wformat-security -O3 -DNDEBUG -D_FORTIFY_SOURCE=2 -fPIC   -std=gnu++11 -o CMakeFiles/mkldnn.dir/cpu/jit_avx2_convolution.cpp.o -c /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_avx2_convolution.cpp
In file included from /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/cpu_barrier.hpp(22),
                 from /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/cpu_barrier.cpp(19):
/dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_generator.hpp(719): error #2218: result of call is not used
                  fwrite(code, getSize(), 1, fp);
                  ^

In file included from /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/cpu_barrier.hpp(22),
                 from /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/cpu_reducer.hpp(28),
                 from /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/cpu_reducer.cpp(24):
/dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_generator.hpp(719): error #2218: result of call is not used
                  fwrite(code, getSize(), 1, fp);
                  ^

In file included from /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/cpu_barrier.hpp(22),
                 from /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/cpu_reducer.hpp(28),
                 from /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_avx512_core_u8s8s32x_1x1_convolution.hpp(23),
                 from /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/cpu_engine.cpp(27):
/dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_generator.hpp(719): error #2218: result of call is not used
                  fwrite(code, getSize(), 1, fp);
                  ^

In file included from /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_reorder.hpp(29),
                 from /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/cpu_reorder.cpp(23):
/dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_generator.hpp(719): error #2218: result of call is not used
                  fwrite(code, getSize(), 1, fp);
                  ^
[ 18%] Building CXX object src/CMakeFiles/mkldnn.dir/cpu/jit_avx2_gemm_f32.cpp.o
cd /dev/shm/mkldnn/0.13/intel-2018a/easybuild_obj/src && /mgmt/modules/eb/software/icc/2018.1.163-GCC-6.4.0-2.28/compilers_and_libraries_2018.1.163/linux/bin/intel64/icpc  -DMKLDNN_DLL -DMKLDNN_DLL_EXPORTS -DUSE_CBLAS -DUSE_MKL -D__STDC_CONSTANT_MACROS -D__STDC_LIMIT_MACROS -Dmkldnn_EXPORTS -I/mgmt/modules/eb/software/imkl/2018.1.163-iimpi-2018a/mkl/include -I/dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/include -I/dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src -I/dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/common -I/dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/xbyak  -O2 -xHost -ftz -fp-speculation=safe -fp-model source -std=c++11 -fvisibility-inlines-hidden -diag-disable:15552  -Wall -Werror -Wno-unknown-pragmas -fvisibility=internal -xHOST -qopenmp -fstack-protector -fPIC -Wformat -Wformat-security -O3 -DNDEBUG -D_FORTIFY_SOURCE=2 -fPIC   -std=gnu++11 -o CMakeFiles/mkldnn.dir/cpu/jit_avx2_gemm_f32.cpp.o -c /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_avx2_gemm_f32.cpp
In file included from /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_avx2_gemm_f32.hpp(21),
                 from /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/gemm_convolution.hpp(23),
                 from /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/gemm_convolution.cpp(20):
/dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_generator.hpp(719): error #2218: result of call is not used
                  fwrite(code, getSize(), 1, fp);
                  ^

[ 19%] Building CXX object src/CMakeFiles/mkldnn.dir/cpu/jit_avx512_common_1x1_conv_kernel.cpp.o
cd /dev/shm/mkldnn/0.13/intel-2018a/easybuild_obj/src && /mgmt/modules/eb/software/icc/2018.1.163-GCC-6.4.0-2.28/compilers_and_libraries_2018.1.163/linux/bin/intel64/icpc  -DMKLDNN_DLL -DMKLDNN_DLL_EXPORTS -DUSE_CBLAS -DUSE_MKL -D__STDC_CONSTANT_MACROS -D__STDC_LIMIT_MACROS -Dmkldnn_EXPORTS -I/mgmt/modules/eb/software/imkl/2018.1.163-iimpi-2018a/mkl/include -I/dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/include -I/dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src -I/dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/common -I/dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/xbyak  -O2 -xHost -ftz -fp-speculation=safe -fp-model source -std=c++11 -fvisibility-inlines-hidden -diag-disable:15552  -Wall -Werror -Wno-unknown-pragmas -fvisibility=internal -xHOST -qopenmp -fstack-protector -fPIC -Wformat -Wformat-security -O3 -DNDEBUG -D_FORTIFY_SOURCE=2 -fPIC   -std=gnu++11 -o CMakeFiles/mkldnn.dir/cpu/jit_avx512_common_1x1_conv_kernel.cpp.o -c /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_avx512_common_1x1_conv_kernel.cpp
In file included from /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_avx2_1x1_conv_kernel_f32.hpp(21),
                 from /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_avx2_1x1_conv_kernel_f32.cpp(23):
/dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_generator.hpp(719): error #2218: result of call is not used
                  fwrite(code, getSize(), 1, fp);
                  ^

compilation aborted for /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/cpu_barrier.cpp (code 2)
make[2]: *** [src/CMakeFiles/mkldnn.dir/cpu/cpu_barrier.cpp.o] Error 2
make[2]: *** Waiting for unfinished jobs....
In file included from /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/cpu_barrier.hpp(22),
                 from /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/cpu_reducer.hpp(28),
                 from /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_avx2_1x1_convolution.hpp(23),
                 from /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_avx2_1x1_convolution.cpp(20):
/dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_generator.hpp(719): error #2218: result of call is not used
                  fwrite(code, getSize(), 1, fp);
                  ^

compilation aborted for /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/cpu_reducer.cpp (code 2)
make[2]: *** [src/CMakeFiles/mkldnn.dir/cpu/cpu_reducer.cpp.o] Error 2
In file included from /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/cpu_barrier.hpp(22),
                 from /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/cpu_reducer.hpp(28),
                 from /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_avx2_convolution.hpp(23),
                 from /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_avx2_convolution.cpp(20):
/dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_generator.hpp(719): error #2218: result of call is not used
                  fwrite(code, getSize(), 1, fp);
                  ^
In file included from /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_avx2_conv_kernel_f32.hpp(21),
                 from /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_avx2_conv_kernel_f32.cpp(23):
/dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_generator.hpp(719): error #2218: result of call is not used
                  fwrite(code, getSize(), 1, fp);
                  ^

compilation aborted for /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_avx2_1x1_conv_kernel_f32.cpp (code 2)
make[2]: *** [src/CMakeFiles/mkldnn.dir/cpu/jit_avx2_1x1_conv_kernel_f32.cpp.o] Error 2
compilation aborted for /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/gemm_convolution.cpp (code 2)
make[2]: *** [src/CMakeFiles/mkldnn.dir/cpu/gemm_convolution.cpp.o] Error 2
compilation aborted for /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_avx2_1x1_convolution.cpp (code 2)
make[2]: *** [src/CMakeFiles/mkldnn.dir/cpu/jit_avx2_1x1_convolution.cpp.o] Error 2
compilation aborted for /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_avx2_conv_kernel_f32.cpp (code 2)
make[2]: *** [src/CMakeFiles/mkldnn.dir/cpu/jit_avx2_conv_kernel_f32.cpp.o] Error 2
compilation aborted for /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_avx2_convolution.cpp (code 2)
make[2]: *** [src/CMakeFiles/mkldnn.dir/cpu/jit_avx2_convolution.cpp.o] Error 2
In file included from /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_avx2_gemm_f32.hpp(21),
                 from /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_avx2_gemm_f32.cpp(22):
/dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_generator.hpp(719): error #2218: result of call is not used
                  fwrite(code, getSize(), 1, fp);
                  ^

compilation aborted for /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/cpu_engine.cpp (code 2)
make[2]: *** [src/CMakeFiles/mkldnn.dir/cpu/cpu_engine.cpp.o] Error 2
In file included from /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_uni_1x1_conv_utils.hpp(24),
                 from /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_avx512_common_1x1_conv_kernel.cpp(24):
/dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_generator.hpp(719): error #2218: result of call is not used
                  fwrite(code, getSize(), 1, fp);
                  ^

compilation aborted for /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_avx2_gemm_f32.cpp (code 2)
make[2]: *** [src/CMakeFiles/mkldnn.dir/cpu/jit_avx2_gemm_f32.cpp.o] Error 2
compilation aborted for /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/jit_avx512_common_1x1_conv_kernel.cpp (code 2)
make[2]: *** [src/CMakeFiles/mkldnn.dir/cpu/jit_avx512_common_1x1_conv_kernel.cpp.o] Error 2
compilation aborted for /dev/shm/mkldnn/0.13/intel-2018a/mkl-dnn-0.13/src/cpu/cpu_reorder.cpp (code 2)
make[2]: *** [src/CMakeFiles/mkldnn.dir/cpu/cpu_reorder.cpp.o] Error 2
make[2]: Leaving directory `/dev/shm/mkldnn/0.13/intel-2018a/easybuild_obj'
make[1]: *** [src/CMakeFiles/mkldnn.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....

The build host is Ivy Bridge, hence no AVX2.

boegel commented 6 years ago

@verdurin That issue is fixed in a more recent version of mkl-dnn, see https://github.com/intel/mkl-dnn/issues/53 + https://github.com/intel/mkl-dnn/commit/ab84d723f1988c4b11877e320182aa59428839b3 .

Can you try the patch (see https://github.com/intel/mkl-dnn/commit/ab84d723f1988c4b11877e320182aa59428839b3.patch)?

verdurin commented 6 years ago

@boegel Isn't that patch already included in the 0.13 release? See https://github.com/intel/mkl-dnn/blob/v0.13/tests/gtests/gtest/src/gtest.cc#L3868-L3870

boegel commented 6 years ago

@verdurin Maybe in the tests, but I think you need a similar construction in src/cpu/jit_generator.hpp for the failing line.
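
For illustration, here is a minimal standalone sketch of the kind of construction meant here, assuming the goal is only to consume the return value of fwrite so that icpc with -Wall -Werror no longer reports error #2218. The helper name dump_bytes and its signature are hypothetical and not part of mkl-dnn; the actual change would go into the existing dump code in src/cpu/jit_generator.hpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper (not part of mkl-dnn): writes `size` bytes from `data`
// to `path` and reports whether the whole buffer was written.
bool dump_bytes(const char *path, const void *data, std::size_t size) {
    assert(data != nullptr || size == 0);

    std::FILE *fp = std::fopen(path, "wb");
    if (!fp) {
        return false;  // the caller may treat a failed dump as non-fatal
    }

    // Store the result of fwrite instead of discarding it, so that
    // "result of call is not used" diagnostics are not triggered.
    std::size_t written = std::fwrite(data, size, 1, fp);
    std::fclose(fp);

    // fwrite returns the number of complete items written (0 or 1 here);
    // an empty buffer counts as success.
    return size == 0 || written == 1;
}
```

If the result really is to be ignored, as in the upstream patch, assigning it to a local variable and casting that variable to void has the same effect on the diagnostic.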

verdurin commented 6 years ago

@boegel Yes. For reference, v0.16 builds and passes all the tests. It may be easier and cleaner to add that version instead of patching the old one, and then update the PyTorch easyconfig. Or would you prefer the less disruptive change?

boegel commented 6 years ago

@verdurin I prefer not to update existing easyconfig files (or their dependencies) when we can avoid it, since it often ends up being more painful than expected. If those easyconfig files are currently only in develop, then it's worth considering.

patbel-pwr commented 5 years ago

I had to modify my src/cpu/jit_generator.hpp slightly differently from what that patch shows. It may not be obvious, because the patch uses pfile, whereas in my case the file handle is named fp, so I'll just paste the corrected lines that work for me:


            // Failure to dump code is not fatal
            if (fp) {
                // Keep the return value so icpc does not raise error #2218
                size_t unused = fwrite(code, getSize(), 1, fp);
                (void)unused;
                fclose(fp);