apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0

error: ‘__cpuid’ was not declared in this scope #14002

Closed mahmoodn closed 5 years ago

mahmoodn commented 5 years ago

Is the latest version compatible with CUDA 10? I get the following error during the build:

[ 27%] Building CXX object src/CMakeFiles/mkldnn.dir/cpu/cpu_barrier.cpp.o
cd /home/mahmood/mx/mxnet/3rdparty/mkldnn/build/src && /usr/bin/c++  -DMKLDNN_DLL -DMKLDNN_DLL_EXPORTS -DMKLDNN_THR=MKLDNN_THR_OMP -DUSE_CBLAS -DUSE_MKL -D__STDC_CONSTANT_MACROS -D__STDC_LIMIT_MACROS -Dmkldnn_EXPORTS -I/home/mahmood/mx/mxnet/3rdparty/mkldnn/external/mklml_lnx_2019.0.1.20180928/include -I/home/mahmood/mx/mxnet/3rdparty/mkldnn/include -I/home/mahmood/mx/mxnet/3rdparty/mkldnn/src -I/home/mahmood/mx/mxnet/3rdparty/mkldnn/src/common -I/home/mahmood/mx/mxnet/3rdparty/mkldnn/src/cpu/xbyak  -fopenmp -std=c++11 -fvisibility-inlines-hidden  -Wall -Werror -Wno-unknown-pragmas -fvisibility=internal -mtune=generic -fPIC -Wformat -Wformat-security -fstack-protector-strong  -Wmissing-field-initializers  -Wno-strict-overflow  -O3 -DNDEBUG -D_FORTIFY_SOURCE=2 -fPIC   -std=gnu++11 -o CMakeFiles/mkldnn.dir/cpu/cpu_barrier.cpp.o -c /home/mahmood/mx/mxnet/3rdparty/mkldnn/src/cpu/cpu_barrier.cpp
In file included from /home/mahmood/mx/mxnet/3rdparty/mkldnn/src/cpu/cpu_isa_traits.hpp:35:0,
                 from /home/mahmood/mx/mxnet/3rdparty/mkldnn/src/cpu/jit_generator.hpp:21,
                 from /home/mahmood/mx/mxnet/3rdparty/mkldnn/src/cpu/cpu_barrier.hpp:22,
                 from /home/mahmood/mx/mxnet/3rdparty/mkldnn/src/cpu/cpu_barrier.cpp:19:
/home/mahmood/mx/mxnet/3rdparty/mkldnn/src/cpu/xbyak/xbyak_util.h: In static member function ‘static void Xbyak::util::Cpu::getCpuid(unsigned int, unsigned int*)’:
/home/mahmood/mx/mxnet/3rdparty/mkldnn/src/cpu/xbyak/xbyak_util.h:227:3: error: ‘__cpuid’ was not declared in this scope
   __cpuid(eaxIn, data[0], data[1], data[2], data[3]);
   ^~~~~~~

Some information about the system is given below:

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.1 LTS
Release:    18.04
Codename:   bionic
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/7/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 7.3.0-27ubuntu1~18.04' --with-bugurl=file:///usr/share/doc/gcc-7/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suffix=-7 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 7.3.0 (Ubuntu 7.3.0-27ubuntu1~18.04) 
$ ~/NVIDIA_CUDA-10.0_Samples/1_Utilities/deviceQuery/deviceQuery 
/home/mahmood/NVIDIA_CUDA-10.0_Samples/1_Utilities/deviceQuery/deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "Quadro M2000"
  CUDA Driver Version / Runtime Version          10.0 / 10.0
  CUDA Capability Major/Minor version number:    5.2
  Total amount of global memory:                 4041 MBytes (4236902400 bytes)
  ( 6) Multiprocessors, (128) CUDA Cores/MP:     768 CUDA Cores
  GPU Max Clock rate:                            1162 MHz (1.16 GHz)
  Memory Clock rate:                             3303 Mhz
  Memory Bus Width:                              128-bit
  L2 Cache Size:                                 786432 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Compute Preemption:            No
  Supports Cooperative Kernel Launch:            No
  Supports MultiDevice Co-op Kernel Launch:      No
  Device PCI Domain ID / Bus ID / location ID:   0 / 38 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.0, CUDA Runtime Version = 10.0, NumDevs = 1
Result = PASS

Any guess?

mxnet-label-bot commented 5 years ago

Hey, this is the MXNet Label Bot. Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it. Here are my recommended labels: Build

frankfliu commented 5 years ago

@mxnet-label-bot add [cuda, build]

vdantu commented 5 years ago

@mxnet-label-bot update [Build, mkldnn]

vdantu commented 5 years ago

@azai91 @mseth10 : could you guys help?

@mahmoodn : could you share your config.mk file? I tried to build in a CUDA 10 container, and the build gets past the point where you hit the error.

[ 27%] Building CXX object src/CMakeFiles/mkldnn.dir/cpu/cpu_barrier.cpp.o
cd /mxnet/3rdparty/mkldnn/build/src && /usr/bin/c++  -DMKLDNN_DLL -DMKLDNN_DLL_EXPORTS -DMKLDNN_THR=MKLDNN_THR_OMP -DUSE_CBLAS -DUSE_MKL -D__STDC_CONSTANT_MACROS -D__STDC_LIMIT_MACROS -Dmkldnn_EXPORTS -I/mxnet/3rdparty/mkldnn/build/install/include -I/mxnet/3rdparty/mkldnn/include -I/mxnet/3rdparty/mkldnn/src -I/mxnet/3rdparty/mkldnn/src/common -I/mxnet/3rdparty/mkldnn/src/cpu/xbyak  -fopenmp -std=c++11 -fvisibility-inlines-hidden  -Wall -Werror -Wno-unknown-pragmas -fvisibility=internal -mtune=generic -fPIC -Wformat -Wformat-security -fstack-protector-strong  -Wmissing-field-initializers  -Wno-strict-overflow  -O3 -DNDEBUG -D_FORTIFY_SOURCE=2 -fPIC   -std=gnu++11 -o CMakeFiles/mkldnn.dir/cpu/cpu_barrier.cpp.o -c /mxnet/3rdparty/mkldnn/src/cpu/cpu_barrier.cpp
[ 28%] Building CXX object src/CMakeFiles/mkldnn.dir/cpu/cpu_batch_normalization_utils.cpp.o
cd /mxnet/3rdparty/mkldnn/build/src && /usr/bin/c++  -DMKLDNN_DLL -DMKLDNN_DLL_EXPORTS -DMKLDNN_THR=MKLDNN_THR_OMP -DUSE_CBLAS -DUSE_MKL -D__STDC_CONSTANT_MACROS -D__STDC_LIMIT_MACROS -Dmkldnn_EXPORTS -I/mxnet/3rdparty/mkldnn/build/install/include -I/mxnet/3rdparty/mkldnn/include -I/mxnet/3rdparty/mkldnn/src -I/mxnet/3rdparty/mkldnn/src/common -I/mxnet/3rdparty/mkldnn/src/cpu/xbyak  -fopenmp -std=c++11 -fvisibility-inlines-hidden  -Wall -Werror -Wno-unknown-pragmas -fvisibility=internal -mtune=generic -fPIC -Wformat -Wformat-security -fstack-protector-strong  -Wmissing-field-initializers  -Wno-strict-overflow  -O3 -DNDEBUG -D_FORTIFY_SOURCE=2 -fPIC   -std=gnu++11 -o CMakeFiles/mkldnn.dir/cpu/cpu_batch_normalization_utils.cpp.o -c /mxnet/3rdparty/mkldnn/src/cpu/cpu_batch_normalization_utils.cpp
[ 29%] Building CXX object src/CMakeFiles/mkldnn.dir/cpu/cpu_concat.cpp.o
cd /mxnet/3rdparty/mkldnn/build/src && /usr/bin/c++  -DMKLDNN_DLL -DMKLDNN_DLL_EXPORTS -DMKLDNN_THR=MKLDNN_THR_OMP -DUSE_CBLAS -DUSE_MKL -D__STDC_CONSTANT_MACROS -D__STDC_LIMIT_MACROS -Dmkldnn_EXPORTS -I/mxnet/3rdparty/mkldnn/build/install/include -I/mxnet/3rdparty/mkldnn/include -I/mxnet/3rdparty/mkldnn/src -I/mxnet/3rdparty/mkldnn/src/common -I/mxnet/3rdparty/mkldnn/src/cpu/xbyak  -fopenmp -std=c++11 -fvisibility-inlines-hidden  -Wall -Werror -Wno-unknown-pragmas -fvisibility=internal -mtune=generic -fPIC -Wformat -Wformat-security -fstack-protector-strong  -Wmissing-field-initializers  -Wno-strict-overflow  -O3 -DNDEBUG -D_FORTIFY_SOURCE=2 -fPIC   -std=gnu++11 -o CMakeFiles/mkldnn.dir/cpu/cpu_concat.cpp.o -c /mxnet/3rdparty/mkldnn/src/cpu/cpu_concat.cpp
[ 30%] Building CXX object src/CMakeFiles/mkldnn.dir/cpu/cpu_engine.cpp.o

lsb_release -a

root@66c114dc82ec:/mxnet/3rdparty/mkldnn/build/src# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.1 LTS
Release:    18.04
Codename:   bionic

container name

nvidia/cuda:10.0-cudnn7-runtime-ubuntu18.04

cuda

cat /usr/local/cuda/version.txt
CUDA Version 10.0.130

Feel free to correct me if I am doing anything wrong.

mahmoodn commented 5 years ago

@vdantu I see some commits in the 5 days after my post. Please let me build with the latest version and I will come back.

mahmoodn commented 5 years ago

It seems that I was able to compile the latest git version. I will close the issue. The last part of the build output is:

g++ -DMSHADOW_FORCE_STREAM -Wall -Wsign-compare -O3 -DNDEBUG=1 -I/home/mahmood/mx/mxnet/3rdparty/mshadow/ -I/home/mahmood/mx/mxnet/3rdparty/dmlc-core/include -fPIC -I/home/mahmood/mx/mxnet/3rdparty/tvm/nnvm/include -I/home/mahmood/mx/mxnet/3rdparty/dlpack/include -I/home/mahmood/mx/mxnet/3rdparty/tvm/include -Iinclude -funroll-loops -Wno-unused-parameter -Wno-unknown-pragmas -Wno-unused-local-typedefs -msse3 -mf16c -I/usr/local/cuda/include -DMSHADOW_USE_CBLAS=1 -DMSHADOW_USE_MKL=0 -I/home/mahmood/mx/mxnet/3rdparty/mkldnn/build/install/include -DMSHADOW_RABIT_PS=0 -DMSHADOW_DIST_PS=0 -DMSHADOW_USE_PASCAL=0 -DMXNET_USE_MKLDNN=1 -DUSE_MKL=1 -I/home/mahmood/mx/mxnet/src/operator/nn/mkldnn/ -I/home/mahmood/mx/mxnet/3rdparty/mkldnn/build/install/include -DMXNET_USE_OPENCV=1 -I/usr/include/opencv -fopenmp -DMXNET_USE_OPERATOR_TUNING=1 -DMSHADOW_USE_CUDNN=1  -I/home/mahmood/mx/mxnet/3rdparty/cub -DMXNET_ENABLE_CUDA_RTC=1 -DMXNET_USE_NCCL=0 -DMXNET_USE_LIBJPEG_TURBO=0 -shared -o lib/libmxnet.so build/src/operator/quantization/mkldnn/mkldnn_quantized_conv.o build/src/operator/quantization/mkldnn/mkldnn_quantized_pooling.o build/src/operator/quantization/mkldnn/mkldnn_quantized_concat.o build/src/operator/subgraph/mkldnn/mkldnn_conv_property.o build/src/operator/subgraph/mkldnn/mkldnn_conv_post_quantize_property.o build/src/operator/subgraph/mkldnn/mkldnn_conv.o build/src/operator/nn/mkldnn/mkldnn_convolution.o build/src/operator/nn/mkldnn/mkldnn_concat.o build/src/operator/nn/mkldnn/mkldnn_base.o build/src/operator/nn/mkldnn/mkldnn_slice.o build/src/operator/nn/mkldnn/mkldnn_act.o build/src/operator/nn/mkldnn/mkldnn_softmax.o build/src/operator/nn/mkldnn/mkldnn_deconvolution.o build/src/operator/nn/mkldnn/mkldnn_copy.o build/src/operator/nn/mkldnn/mkldnn_fully_connected.o build/src/operator/nn/mkldnn/mkldnn_pooling.o build/src/operator/nn/mkldnn/mkldnn_sum.o build/src/operator/nn/cudnn/cudnn_algoreg.o build/src/operator/nn/cudnn/cudnn_batch_norm.o 
build/src/operator/tensor/elemwise_binary_broadcast_op_basic.o build/src/operator/tensor/elemwise_binary_op_logic.o build/src/operator/tensor/square_sum.o build/src/operator/tensor/matrix_op.o build/src/operator/tensor/init_op.o build/src/operator/tensor/elemwise_sum.o build/src/operator/tensor/la_op.o build/src/operator/tensor/histogram.o build/src/operator/tensor/broadcast_reduce_op_index.o build/src/operator/tensor/dot.o build/src/operator/tensor/elemwise_scatter_op.o build/src/operator/tensor/elemwise_unary_op_basic.o build/src/operator/tensor/elemwise_binary_broadcast_op_extended.o build/src/operator/tensor/ravel.o build/src/operator/tensor/broadcast_reduce_op_value.o build/src/operator/tensor/control_flow_op.o build/src/operator/tensor/elemwise_binary_op_basic.o build/src/operator/tensor/elemwise_binary_op_extended.o build/src/operator/tensor/indexing_op.o build/src/operator/tensor/elemwise_binary_broadcast_op_logic.o build/src/operator/tensor/diag_op.o build/src/operator/tensor/ordering_op.o build/src/operator/tensor/sparse_retain.o build/src/operator/tensor/elemwise_binary_scalar_op_extended.o build/src/operator/tensor/elemwise_binary_scalar_op_basic.o build/src/operator/tensor/elemwise_binary_scalar_op_logic.o build/src/operator/tensor/cast_storage.o build/src/operator/tensor/elemwise_binary_op.o build/src/operator/tensor/elemwise_unary_op_trig.o build/src/operator/contrib/tensorrt.o build/src/operator/contrib/multibox_target.o build/src/operator/contrib/sync_batch_norm.o build/src/operator/contrib/count_sketch.o build/src/operator/contrib/roi_align.o build/src/operator/contrib/bilinear_resize.o build/src/operator/contrib/nnz.o build/src/operator/contrib/multibox_detection.o build/src/operator/contrib/nnvm_to_onnx.o build/src/operator/contrib/deformable_psroi_pooling.o build/src/operator/contrib/dgl_graph.o build/src/operator/contrib/fft.o build/src/operator/contrib/multibox_prior.o build/src/operator/contrib/gradient_multiplier_op.o 
build/src/operator/contrib/adamw.o build/src/operator/contrib/transformer.o build/src/operator/contrib/krprod.o build/src/operator/contrib/multi_proposal.o build/src/operator/contrib/index_copy.o build/src/operator/contrib/optimizer_op.o build/src/operator/contrib/bounding_box.o build/src/operator/contrib/proposal.o build/src/operator/contrib/boolean_mask.o build/src/operator/contrib/psroi_pooling.o build/src/operator/contrib/quadratic_op.o build/src/operator/contrib/deformable_convolution.o build/src/operator/contrib/ifft.o build/src/operator/contrib/adaptive_avg_pooling.o build/src/operator/random/sample_multinomial_op.o build/src/operator/random/multisample_op.o build/src/operator/random/unique_sample_op.o build/src/operator/random/sample_op.o build/src/operator/random/shuffle_op.o build/src/operator/quantization/requantize.o build/src/operator/quantization/dequantize.o build/src/operator/quantization/quantize_graph_pass.o build/src/operator/quantization/quantized_flatten.o build/src/operator/quantization/quantized_conv.o build/src/operator/quantization/quantized_fully_connected.o build/src/operator/quantization/quantized_pooling.o build/src/operator/quantization/quantized_concat.o build/src/operator/quantization/quantize.o build/src/operator/custom/native_op.o build/src/operator/custom/ndarray_op.o build/src/operator/custom/custom.o build/src/operator/subgraph/partition_graph.o build/src/operator/subgraph/default_subgraph_property.o build/src/operator/nnpack/nnpack_util.o build/src/operator/image/image_random.o build/src/operator/image/resize.o build/src/operator/nn/softmax.o build/src/operator/nn/pooling.o build/src/operator/nn/deconvolution.o build/src/operator/nn/activation.o build/src/operator/nn/upsampling.o build/src/operator/nn/batch_norm.o build/src/operator/nn/ctc_loss.o build/src/operator/nn/fully_connected.o build/src/operator/nn/convolution.o build/src/operator/nn/layer_norm.o build/src/operator/nn/concat.o build/src/operator/nn/softmax_activation.o 
build/src/operator/nn/lrn.o build/src/operator/nn/dropout.o build/src/io/io.o build/src/io/image_aug_default.o build/src/io/iter_image_det_recordio.o build/src/io/image_io.o build/src/io/image_det_aug_default.o build/src/io/iter_csv.o build/src/io/iter_image_recordio.o build/src/io/iter_mnist.o build/src/io/iter_image_recordio_2.o build/src/io/iter_libsvm.o build/src/common/utils.o build/src/common/rtc.o build/src/nnvm/legacy_op_util.o build/src/nnvm/tvm_bridge.o build/src/nnvm/graph_editor.o build/src/nnvm/legacy_json_util.o build/src/profiler/profiler.o build/src/profiler/aggregate_stats.o build/src/profiler/vtune.o build/src/imperative/cached_op.o build/src/imperative/imperative_utils.o build/src/imperative/imperative.o build/src/ndarray/ndarray_function.o build/src/ndarray/ndarray.o build/src/operator/instance_norm.o build/src/operator/subgraph_op_common.o build/src/operator/grid_generator.o build/src/operator/pooling_v1.o build/src/operator/l2_normalization.o build/src/operator/rnn.o build/src/operator/make_loss.o build/src/operator/crop.o build/src/operator/spatial_transformer.o build/src/operator/operator.o build/src/operator/control_flow.o build/src/operator/swapaxis.o build/src/operator/convolution_v1.o build/src/operator/softmax_output.o build/src/operator/operator_util.o build/src/operator/roi_pooling.o build/src/operator/slice_channel.o build/src/operator/batch_norm_v1.o build/src/operator/loss_binary_op.o build/src/operator/regression_output.o build/src/operator/sequence_reverse.o build/src/operator/c_lapack_api.o build/src/operator/identity_attach_KL_sparse_reg.o build/src/operator/bilinear_sampler.o build/src/operator/svm_output.o build/src/operator/optimizer_op.o build/src/operator/sequence_last.o build/src/operator/cross_device_copy.o build/src/operator/correlation.o build/src/operator/pad.o build/src/operator/leaky_relu.o build/src/operator/operator_tune.o build/src/operator/sequence_mask.o build/src/engine/naive_engine.o build/src/engine/openmp.o 
build/src/engine/threaded_engine_pooled.o build/src/engine/threaded_engine.o build/src/engine/engine.o build/src/engine/threaded_engine_perdevice.o build/src/storage/storage.o build/src/c_api/c_api_executor.o build/src/c_api/c_api_symbolic.o build/src/c_api/c_api_profile.o build/src/c_api/c_api_ndarray.o build/src/c_api/c_api_test.o build/src/c_api/c_predict_api.o build/src/c_api/c_api_function.o build/src/c_api/c_api.o build/src/c_api/c_api_error.o build/src/executor/onnx_to_tensorrt.o build/src/executor/inplace_addto_detect_pass.o build/src/executor/graph_executor.o build/src/executor/trt_graph_executor.o build/src/executor/infer_graph_attr_pass.o build/src/executor/tensorrt_pass.o build/src/executor/attach_op_execs_pass.o build/src/executor/attach_op_resource_pass.o build/src/kvstore/gradient_compression.o build/src/kvstore/kvstore_utils.o build/src/kvstore/kvstore.o build/src/resource.o build/src/mxfeatures.o build/src/initialize.o /home/mahmood/mx/mxnet/3rdparty/dmlc-core/libdmlc.a build/src/operator/nn/cudnn/cudnn_batch_norm_gpu.o build/src/operator/tensor/elemwise_binary_op_basic_gpu.o build/src/operator/tensor/elemwise_binary_scalar_op_basic_gpu.o build/src/operator/tensor/elemwise_binary_scalar_op_extended_gpu.o build/src/operator/tensor/matrix_op_gpu.o build/src/operator/tensor/ordering_op_gpu.o build/src/operator/tensor/elemwise_unary_op_trig_gpu.o build/src/operator/tensor/elemwise_binary_broadcast_op_extended_gpu.o build/src/operator/tensor/diag_op_gpu.o build/src/operator/tensor/square_sum_gpu.o build/src/operator/tensor/elemwise_binary_op_extended_gpu.o build/src/operator/tensor/elemwise_sum_gpu.o build/src/operator/tensor/init_op_gpu.o build/src/operator/tensor/cast_storage_gpu.o build/src/operator/tensor/histogram_gpu.o build/src/operator/tensor/dot_gpu.o build/src/operator/tensor/elemwise_binary_scalar_op_logic_gpu.o build/src/operator/tensor/ravel_gpu.o build/src/operator/tensor/control_flow_op_gpu.o 
build/src/operator/tensor/broadcast_reduce_op_value_gpu.o build/src/operator/tensor/elemwise_binary_broadcast_op_basic_gpu.o build/src/operator/tensor/broadcast_reduce_op_index_gpu.o build/src/operator/tensor/elemwise_scatter_op_gpu.o build/src/operator/tensor/indexing_op_gpu.o build/src/operator/tensor/elemwise_binary_broadcast_op_logic_gpu.o build/src/operator/tensor/la_op_gpu.o build/src/operator/tensor/elemwise_binary_op_logic_gpu.o build/src/operator/tensor/elemwise_unary_op_basic_gpu.o build/src/operator/tensor/sparse_retain_gpu.o build/src/operator/contrib/optimizer_op_gpu.o build/src/operator/contrib/adaptive_avg_pooling_gpu.o build/src/operator/contrib/ifft_gpu.o build/src/operator/contrib/multibox_detection_gpu.o build/src/operator/contrib/index_copy_gpu.o build/src/operator/contrib/tensorrt_gpu.o build/src/operator/contrib/multibox_target_gpu.o build/src/operator/contrib/proposal_gpu.o build/src/operator/contrib/bilinear_resize_gpu.o build/src/operator/contrib/count_sketch_gpu.o build/src/operator/contrib/dgl_graph_gpu.o build/src/operator/contrib/gradient_multiplier_op_gpu.o build/src/operator/contrib/bounding_box_gpu.o build/src/operator/contrib/fft_gpu.o build/src/operator/contrib/multibox_prior_gpu.o build/src/operator/contrib/deformable_psroi_pooling_gpu.o build/src/operator/contrib/quadratic_op_gpu.o build/src/operator/contrib/transformer_gpu.o build/src/operator/contrib/multi_proposal_gpu.o build/src/operator/contrib/adamw_gpu.o build/src/operator/contrib/sync_batch_norm_gpu.o build/src/operator/contrib/psroi_pooling_gpu.o build/src/operator/contrib/deformable_convolution_gpu.o build/src/operator/contrib/roi_align_gpu.o build/src/operator/random/shuffle_op_gpu.o build/src/operator/random/sample_multinomial_op_gpu.o build/src/operator/random/multisample_op_gpu.o build/src/operator/random/sample_op_gpu.o build/src/operator/quantization/requantize_gpu.o build/src/operator/quantization/quantize_gpu.o build/src/operator/quantization/dequantize_gpu.o 
build/src/operator/quantization/quantized_conv_gpu.o build/src/operator/quantization/quantized_flatten_gpu.o build/src/operator/quantization/quantized_fully_connected_gpu.o build/src/operator/quantization/quantized_pooling_gpu.o build/src/operator/custom/native_op_gpu.o build/src/operator/image/resize_gpu.o build/src/operator/image/image_random_gpu.o build/src/operator/nn/lrn_gpu.o build/src/operator/nn/dropout_gpu.o build/src/operator/nn/softmax_activation_gpu.o build/src/operator/nn/fully_connected_gpu.o build/src/operator/nn/deconvolution_gpu.o build/src/operator/nn/pooling_gpu.o build/src/operator/nn/softmax_gpu.o build/src/operator/nn/activation_gpu.o build/src/operator/nn/ctc_loss_gpu.o build/src/operator/nn/convolution_gpu.o build/src/operator/nn/upsampling_gpu.o build/src/operator/nn/batch_norm_gpu.o build/src/operator/nn/layer_norm_gpu.o build/src/operator/nn/concat_gpu.o build/src/common/utils_gpu.o build/src/common/random_generator_gpu.o build/src/ndarray/ndarray_function_gpu.o build/src/operator/svm_output_gpu.o build/src/operator/optimizer_op_gpu.o build/src/operator/spatial_transformer_gpu.o build/src/operator/make_loss_gpu.o build/src/operator/pooling_v1_gpu.o build/src/operator/instance_norm_gpu.o build/src/operator/sequence_mask_gpu.o build/src/operator/correlation_gpu.o build/src/operator/slice_channel_gpu.o build/src/operator/rnn_gpu.o build/src/operator/crop_gpu.o build/src/operator/convolution_v1_gpu.o build/src/operator/sequence_reverse_gpu.o build/src/operator/identity_attach_KL_sparse_reg_gpu.o build/src/operator/leaky_relu_gpu.o build/src/operator/swapaxis_gpu.o build/src/operator/grid_generator_gpu.o build/src/operator/pad_gpu.o build/src/operator/bilinear_sampler_gpu.o build/src/operator/roi_pooling_gpu.o build/src/operator/batch_norm_v1_gpu.o build/src/operator/loss_binary_op_gpu.o build/src/operator/regression_output_gpu.o build/src/operator/l2_normalization_gpu.o build/src/operator/sequence_last_gpu.o 
build/src/operator/softmax_output_gpu.o build/src/kvstore/gradient_compression_gpu.o build/src/kvstore/kvstore_utils_gpu.o -pthread -lm -lcudart -lcublas -lcurand -lcusolver -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib -Wl,--as-needed -lmklml_intel -lmklml_gnu -liomp5 -L/home/mahmood/mx/mxnet/3rdparty/mkldnn/build/install/lib/ -lopenblas -fopenmp -lrt -L/home/mahmood/mx/mxnet/3rdparty/mkldnn/build/install/lib -lmkldnn -Wl,-rpath,'${ORIGIN}' -lopencv_shape -lopencv_stitching -lopencv_superres -lopencv_videostab -lopencv_aruco -lopencv_bgsegm -lopencv_bioinspired -lopencv_ccalib -lopencv_datasets -lopencv_dpm -lopencv_face -lopencv_freetype -lopencv_fuzzy -lopencv_hdf -lopencv_line_descriptor -lopencv_optflow -lopencv_video -lopencv_plot -lopencv_reg -lopencv_saliency -lopencv_stereo -lopencv_structured_light -lopencv_phase_unwrapping -lopencv_rgbd -lopencv_viz -lopencv_surface_matching -lopencv_text -lopencv_ximgproc -lopencv_calib3d -lopencv_features2d -lopencv_flann -lopencv_xobjdetect -lopencv_objdetect -lopencv_ml -lopencv_xphoto -lopencv_highgui -lopencv_videoio -lopencv_imgcodecs -lopencv_photo -lopencv_imgproc -lopencv_core -lcudnn  -lcufft -lcuda -lnvrtc -L/usr/local/cuda/lib64/stubs \
-Wl,--whole-archive /home/mahmood/mx/mxnet/3rdparty/tvm/nnvm/lib/libnnvm.a -Wl,--no-whole-archive
g++ -DMSHADOW_FORCE_STREAM -Wall -Wsign-compare -O3 -DNDEBUG=1 -I/home/mahmood/mx/mxnet/3rdparty/mshadow/ -I/home/mahmood/mx/mxnet/3rdparty/dmlc-core/include -fPIC -I/home/mahmood/mx/mxnet/3rdparty/tvm/nnvm/include -I/home/mahmood/mx/mxnet/3rdparty/dlpack/include -I/home/mahmood/mx/mxnet/3rdparty/tvm/include -Iinclude -funroll-loops -Wno-unused-parameter -Wno-unknown-pragmas -Wno-unused-local-typedefs -msse3 -mf16c -I/usr/local/cuda/include -DMSHADOW_USE_CBLAS=1 -DMSHADOW_USE_MKL=0 -I/home/mahmood/mx/mxnet/3rdparty/mkldnn/build/install/include -DMSHADOW_RABIT_PS=0 -DMSHADOW_DIST_PS=0 -DMSHADOW_USE_PASCAL=0 -DMXNET_USE_MKLDNN=1 -DUSE_MKL=1 -I/home/mahmood/mx/mxnet/src/operator/nn/mkldnn/ -I/home/mahmood/mx/mxnet/3rdparty/mkldnn/build/install/include -DMXNET_USE_OPENCV=1 -I/usr/include/opencv -fopenmp -DMXNET_USE_OPERATOR_TUNING=1 -DMSHADOW_USE_CUDNN=1  -I/home/mahmood/mx/mxnet/3rdparty/cub -DMXNET_ENABLE_CUDA_RTC=1 -DMXNET_USE_NCCL=0 -DMXNET_USE_LIBJPEG_TURBO=0 -std=c++11  -o bin/im2rec tools/im2rec.cc build/src/operator/quantization/mkldnn/mkldnn_quantized_conv.o build/src/operator/quantization/mkldnn/mkldnn_quantized_pooling.o build/src/operator/quantization/mkldnn/mkldnn_quantized_concat.o build/src/operator/subgraph/mkldnn/mkldnn_conv_property.o build/src/operator/subgraph/mkldnn/mkldnn_conv_post_quantize_property.o build/src/operator/subgraph/mkldnn/mkldnn_conv.o build/src/operator/nn/mkldnn/mkldnn_convolution.o build/src/operator/nn/mkldnn/mkldnn_concat.o build/src/operator/nn/mkldnn/mkldnn_base.o build/src/operator/nn/mkldnn/mkldnn_slice.o build/src/operator/nn/mkldnn/mkldnn_act.o build/src/operator/nn/mkldnn/mkldnn_softmax.o build/src/operator/nn/mkldnn/mkldnn_deconvolution.o build/src/operator/nn/mkldnn/mkldnn_copy.o build/src/operator/nn/mkldnn/mkldnn_fully_connected.o build/src/operator/nn/mkldnn/mkldnn_pooling.o build/src/operator/nn/mkldnn/mkldnn_sum.o build/src/operator/nn/cudnn/cudnn_algoreg.o build/src/operator/nn/cudnn/cudnn_batch_norm.o 
build/src/operator/tensor/elemwise_binary_broadcast_op_basic.o build/src/operator/tensor/elemwise_binary_op_logic.o build/src/operator/tensor/square_sum.o build/src/operator/tensor/matrix_op.o build/src/operator/tensor/init_op.o build/src/operator/tensor/elemwise_sum.o build/src/operator/tensor/la_op.o build/src/operator/tensor/histogram.o build/src/operator/tensor/broadcast_reduce_op_index.o build/src/operator/tensor/dot.o build/src/operator/tensor/elemwise_scatter_op.o build/src/operator/tensor/elemwise_unary_op_basic.o build/src/operator/tensor/elemwise_binary_broadcast_op_extended.o build/src/operator/tensor/ravel.o build/src/operator/tensor/broadcast_reduce_op_value.o build/src/operator/tensor/control_flow_op.o build/src/operator/tensor/elemwise_binary_op_basic.o build/src/operator/tensor/elemwise_binary_op_extended.o build/src/operator/tensor/indexing_op.o build/src/operator/tensor/elemwise_binary_broadcast_op_logic.o build/src/operator/tensor/diag_op.o build/src/operator/tensor/ordering_op.o build/src/operator/tensor/sparse_retain.o build/src/operator/tensor/elemwise_binary_scalar_op_extended.o build/src/operator/tensor/elemwise_binary_scalar_op_basic.o build/src/operator/tensor/elemwise_binary_scalar_op_logic.o build/src/operator/tensor/cast_storage.o build/src/operator/tensor/elemwise_binary_op.o build/src/operator/tensor/elemwise_unary_op_trig.o build/src/operator/contrib/tensorrt.o build/src/operator/contrib/multibox_target.o build/src/operator/contrib/sync_batch_norm.o build/src/operator/contrib/count_sketch.o build/src/operator/contrib/roi_align.o build/src/operator/contrib/bilinear_resize.o build/src/operator/contrib/nnz.o build/src/operator/contrib/multibox_detection.o build/src/operator/contrib/nnvm_to_onnx.o build/src/operator/contrib/deformable_psroi_pooling.o build/src/operator/contrib/dgl_graph.o build/src/operator/contrib/fft.o build/src/operator/contrib/multibox_prior.o build/src/operator/contrib/gradient_multiplier_op.o 
build/src/operator/contrib/adamw.o build/src/operator/contrib/transformer.o build/src/operator/contrib/krprod.o build/src/operator/contrib/multi_proposal.o build/src/operator/contrib/index_copy.o build/src/operator/contrib/optimizer_op.o build/src/operator/contrib/bounding_box.o build/src/operator/contrib/proposal.o build/src/operator/contrib/boolean_mask.o build/src/operator/contrib/psroi_pooling.o build/src/operator/contrib/quadratic_op.o build/src/operator/contrib/deformable_convolution.o build/src/operator/contrib/ifft.o build/src/operator/contrib/adaptive_avg_pooling.o build/src/operator/random/sample_multinomial_op.o build/src/operator/random/multisample_op.o build/src/operator/random/unique_sample_op.o build/src/operator/random/sample_op.o build/src/operator/random/shuffle_op.o build/src/operator/quantization/requantize.o build/src/operator/quantization/dequantize.o build/src/operator/quantization/quantize_graph_pass.o build/src/operator/quantization/quantized_flatten.o build/src/operator/quantization/quantized_conv.o build/src/operator/quantization/quantized_fully_connected.o build/src/operator/quantization/quantized_pooling.o build/src/operator/quantization/quantized_concat.o build/src/operator/quantization/quantize.o build/src/operator/custom/native_op.o build/src/operator/custom/ndarray_op.o build/src/operator/custom/custom.o build/src/operator/subgraph/partition_graph.o build/src/operator/subgraph/default_subgraph_property.o build/src/operator/nnpack/nnpack_util.o build/src/operator/image/image_random.o build/src/operator/image/resize.o build/src/operator/nn/softmax.o build/src/operator/nn/pooling.o build/src/operator/nn/deconvolution.o build/src/operator/nn/activation.o build/src/operator/nn/upsampling.o build/src/operator/nn/batch_norm.o build/src/operator/nn/ctc_loss.o build/src/operator/nn/fully_connected.o build/src/operator/nn/convolution.o build/src/operator/nn/layer_norm.o build/src/operator/nn/concat.o build/src/operator/nn/softmax_activation.o 
build/src/operator/nn/lrn.o build/src/operator/nn/dropout.o build/src/io/io.o build/src/io/image_aug_default.o build/src/io/iter_image_det_recordio.o build/src/io/image_io.o build/src/io/image_det_aug_default.o build/src/io/iter_csv.o build/src/io/iter_image_recordio.o build/src/io/iter_mnist.o build/src/io/iter_image_recordio_2.o build/src/io/iter_libsvm.o build/src/common/utils.o build/src/common/rtc.o build/src/nnvm/legacy_op_util.o build/src/nnvm/tvm_bridge.o build/src/nnvm/graph_editor.o build/src/nnvm/legacy_json_util.o build/src/profiler/profiler.o build/src/profiler/aggregate_stats.o build/src/profiler/vtune.o build/src/imperative/cached_op.o build/src/imperative/imperative_utils.o build/src/imperative/imperative.o build/src/ndarray/ndarray_function.o build/src/ndarray/ndarray.o build/src/operator/instance_norm.o build/src/operator/subgraph_op_common.o build/src/operator/grid_generator.o build/src/operator/pooling_v1.o build/src/operator/l2_normalization.o build/src/operator/rnn.o build/src/operator/make_loss.o build/src/operator/crop.o build/src/operator/spatial_transformer.o build/src/operator/operator.o build/src/operator/control_flow.o build/src/operator/swapaxis.o build/src/operator/convolution_v1.o build/src/operator/softmax_output.o build/src/operator/operator_util.o build/src/operator/roi_pooling.o build/src/operator/slice_channel.o build/src/operator/batch_norm_v1.o build/src/operator/loss_binary_op.o build/src/operator/regression_output.o build/src/operator/sequence_reverse.o build/src/operator/c_lapack_api.o build/src/operator/identity_attach_KL_sparse_reg.o build/src/operator/bilinear_sampler.o build/src/operator/svm_output.o build/src/operator/optimizer_op.o build/src/operator/sequence_last.o build/src/operator/cross_device_copy.o build/src/operator/correlation.o build/src/operator/pad.o build/src/operator/leaky_relu.o build/src/operator/operator_tune.o build/src/operator/sequence_mask.o build/src/engine/naive_engine.o build/src/engine/openmp.o 
build/src/engine/threaded_engine_pooled.o build/src/engine/threaded_engine.o build/src/engine/engine.o build/src/engine/threaded_engine_perdevice.o build/src/storage/storage.o build/src/c_api/c_api_executor.o build/src/c_api/c_api_symbolic.o build/src/c_api/c_api_profile.o build/src/c_api/c_api_ndarray.o build/src/c_api/c_api_test.o build/src/c_api/c_predict_api.o build/src/c_api/c_api_function.o build/src/c_api/c_api.o build/src/c_api/c_api_error.o build/src/executor/onnx_to_tensorrt.o build/src/executor/inplace_addto_detect_pass.o build/src/executor/graph_executor.o build/src/executor/trt_graph_executor.o build/src/executor/infer_graph_attr_pass.o build/src/executor/tensorrt_pass.o build/src/executor/attach_op_execs_pass.o build/src/executor/attach_op_resource_pass.o build/src/kvstore/gradient_compression.o build/src/kvstore/kvstore_utils.o build/src/kvstore/kvstore.o build/src/resource.o build/src/mxfeatures.o build/src/initialize.o /home/mahmood/mx/mxnet/3rdparty/dmlc-core/libdmlc.a /home/mahmood/mx/mxnet/3rdparty/tvm/nnvm/lib/libnnvm.a build/src/operator/nn/cudnn/cudnn_batch_norm_gpu.o build/src/operator/tensor/elemwise_binary_op_basic_gpu.o build/src/operator/tensor/elemwise_binary_scalar_op_basic_gpu.o build/src/operator/tensor/elemwise_binary_scalar_op_extended_gpu.o build/src/operator/tensor/matrix_op_gpu.o build/src/operator/tensor/ordering_op_gpu.o build/src/operator/tensor/elemwise_unary_op_trig_gpu.o build/src/operator/tensor/elemwise_binary_broadcast_op_extended_gpu.o build/src/operator/tensor/diag_op_gpu.o build/src/operator/tensor/square_sum_gpu.o build/src/operator/tensor/elemwise_binary_op_extended_gpu.o build/src/operator/tensor/elemwise_sum_gpu.o build/src/operator/tensor/init_op_gpu.o build/src/operator/tensor/cast_storage_gpu.o build/src/operator/tensor/histogram_gpu.o build/src/operator/tensor/dot_gpu.o build/src/operator/tensor/elemwise_binary_scalar_op_logic_gpu.o build/src/operator/tensor/ravel_gpu.o 
build/src/operator/tensor/control_flow_op_gpu.o build/src/operator/tensor/broadcast_reduce_op_value_gpu.o build/src/operator/tensor/elemwise_binary_broadcast_op_basic_gpu.o build/src/operator/tensor/broadcast_reduce_op_index_gpu.o build/src/operator/tensor/elemwise_scatter_op_gpu.o build/src/operator/tensor/indexing_op_gpu.o build/src/operator/tensor/elemwise_binary_broadcast_op_logic_gpu.o build/src/operator/tensor/la_op_gpu.o build/src/operator/tensor/elemwise_binary_op_logic_gpu.o build/src/operator/tensor/elemwise_unary_op_basic_gpu.o build/src/operator/tensor/sparse_retain_gpu.o build/src/operator/contrib/optimizer_op_gpu.o build/src/operator/contrib/adaptive_avg_pooling_gpu.o build/src/operator/contrib/ifft_gpu.o build/src/operator/contrib/multibox_detection_gpu.o build/src/operator/contrib/index_copy_gpu.o build/src/operator/contrib/tensorrt_gpu.o build/src/operator/contrib/multibox_target_gpu.o build/src/operator/contrib/proposal_gpu.o build/src/operator/contrib/bilinear_resize_gpu.o build/src/operator/contrib/count_sketch_gpu.o build/src/operator/contrib/dgl_graph_gpu.o build/src/operator/contrib/gradient_multiplier_op_gpu.o build/src/operator/contrib/bounding_box_gpu.o build/src/operator/contrib/fft_gpu.o build/src/operator/contrib/multibox_prior_gpu.o build/src/operator/contrib/deformable_psroi_pooling_gpu.o build/src/operator/contrib/quadratic_op_gpu.o build/src/operator/contrib/transformer_gpu.o build/src/operator/contrib/multi_proposal_gpu.o build/src/operator/contrib/adamw_gpu.o build/src/operator/contrib/sync_batch_norm_gpu.o build/src/operator/contrib/psroi_pooling_gpu.o build/src/operator/contrib/deformable_convolution_gpu.o build/src/operator/contrib/roi_align_gpu.o build/src/operator/random/shuffle_op_gpu.o build/src/operator/random/sample_multinomial_op_gpu.o build/src/operator/random/multisample_op_gpu.o build/src/operator/random/sample_op_gpu.o build/src/operator/quantization/requantize_gpu.o build/src/operator/quantization/quantize_gpu.o 
build/src/operator/quantization/dequantize_gpu.o build/src/operator/quantization/quantized_conv_gpu.o build/src/operator/quantization/quantized_flatten_gpu.o build/src/operator/quantization/quantized_fully_connected_gpu.o build/src/operator/quantization/quantized_pooling_gpu.o build/src/operator/custom/native_op_gpu.o build/src/operator/image/resize_gpu.o build/src/operator/image/image_random_gpu.o build/src/operator/nn/lrn_gpu.o build/src/operator/nn/dropout_gpu.o build/src/operator/nn/softmax_activation_gpu.o build/src/operator/nn/fully_connected_gpu.o build/src/operator/nn/deconvolution_gpu.o build/src/operator/nn/pooling_gpu.o build/src/operator/nn/softmax_gpu.o build/src/operator/nn/activation_gpu.o build/src/operator/nn/ctc_loss_gpu.o build/src/operator/nn/convolution_gpu.o build/src/operator/nn/upsampling_gpu.o build/src/operator/nn/batch_norm_gpu.o build/src/operator/nn/layer_norm_gpu.o build/src/operator/nn/concat_gpu.o build/src/common/utils_gpu.o build/src/common/random_generator_gpu.o build/src/ndarray/ndarray_function_gpu.o build/src/operator/svm_output_gpu.o build/src/operator/optimizer_op_gpu.o build/src/operator/spatial_transformer_gpu.o build/src/operator/make_loss_gpu.o build/src/operator/pooling_v1_gpu.o build/src/operator/instance_norm_gpu.o build/src/operator/sequence_mask_gpu.o build/src/operator/correlation_gpu.o build/src/operator/slice_channel_gpu.o build/src/operator/rnn_gpu.o build/src/operator/crop_gpu.o build/src/operator/convolution_v1_gpu.o build/src/operator/sequence_reverse_gpu.o build/src/operator/identity_attach_KL_sparse_reg_gpu.o build/src/operator/leaky_relu_gpu.o build/src/operator/swapaxis_gpu.o build/src/operator/grid_generator_gpu.o build/src/operator/pad_gpu.o build/src/operator/bilinear_sampler_gpu.o build/src/operator/roi_pooling_gpu.o build/src/operator/batch_norm_v1_gpu.o build/src/operator/loss_binary_op_gpu.o build/src/operator/regression_output_gpu.o build/src/operator/l2_normalization_gpu.o 
build/src/operator/sequence_last_gpu.o build/src/operator/softmax_output_gpu.o build/src/kvstore/gradient_compression_gpu.o build/src/kvstore/kvstore_utils_gpu.o -pthread -lm -lcudart -lcublas -lcurand -lcusolver -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib -Wl,--as-needed -lmklml_intel -lmklml_gnu -liomp5 -L/home/mahmood/mx/mxnet/3rdparty/mkldnn/build/install/lib/ -lopenblas -fopenmp -lrt -L/home/mahmood/mx/mxnet/3rdparty/mkldnn/build/install/lib -lmkldnn -Wl,-rpath,'${ORIGIN}' -lopencv_shape -lopencv_stitching -lopencv_superres -lopencv_videostab -lopencv_aruco -lopencv_bgsegm -lopencv_bioinspired -lopencv_ccalib -lopencv_datasets -lopencv_dpm -lopencv_face -lopencv_freetype -lopencv_fuzzy -lopencv_hdf -lopencv_line_descriptor -lopencv_optflow -lopencv_video -lopencv_plot -lopencv_reg -lopencv_saliency -lopencv_stereo -lopencv_structured_light -lopencv_phase_unwrapping -lopencv_rgbd -lopencv_viz -lopencv_surface_matching -lopencv_text -lopencv_ximgproc -lopencv_calib3d -lopencv_features2d -lopencv_flann -lopencv_xobjdetect -lopencv_objdetect -lopencv_ml -lopencv_xphoto -lopencv_highgui -lopencv_videoio -lopencv_imgcodecs -lopencv_photo -lopencv_imgproc -lopencv_core -lcudnn  -lcufft -lcuda -lnvrtc -L/usr/local/cuda/lib64/stubs