Status: Open. access2rohit opened this issue 5 years ago.
Hey, this is the MXNet Label Bot. Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it. Here are my recommended labels: Cuda, Build
I tried this on a p3.2xlarge and a p3.16xlarge using a DLAMI Base and saw the same error both times. With DEBUG turned off, the build succeeds.
I have the same issue on PPC64LE using master (c319ae57), built with CMake on Linux. I don't have this issue with 1.4.0.
Other approaches tried. All used Ubuntu 16.04 (GPU) and MXNet 1.4.x built with Make, with USE_CUDA=1 USE_CUDNN=1 USE_LAPACK=1 USE_BLAS=openblas USE_CUDA_PATH=/usr/local/cuda DEBUG=1. The combinations varied the compiler, USE_OPENCV, and the CUDA version behind /usr/local/cuda:
gcc/g++ 4.8, USE_OPENCV=1, /usr/local/cuda -> /usr/local/cuda-9.0
gcc/g++ 5.4, USE_OPENCV=1, /usr/local/cuda -> /usr/local/cuda-9.0
gcc/g++ 4.8, USE_OPENCV=1, /usr/local/cuda -> /usr/local/cuda-9.2
gcc/g++ 5.4, USE_OPENCV=1, /usr/local/cuda -> /usr/local/cuda-9.2
gcc/g++ 4.8, USE_OPENCV=0, /usr/local/cuda -> /usr/local/cuda-9.0
gcc/g++ 5.4, USE_OPENCV=0, /usr/local/cuda -> /usr/local/cuda-9.0
gcc/g++ 4.8, USE_OPENCV=0, /usr/local/cuda -> /usr/local/cuda-9.2
gcc/g++ 5.4, USE_OPENCV=0, /usr/local/cuda -> /usr/local/cuda-9.2
@mxnet-label-bot add [bug][build]
@mxnet-label-bot add [build]
I ran into the same problem.
----------Python Info----------
('Version :', '2.7.16')
('Compiler :', 'GCC 7.3.0')
('Build :', ('default', 'Mar 14 2019 21:00:58'))
('Arch :', ('64bit', ''))
------------Pip Info-----------
('Version :', '19.1.1')
('Directory :', '/home/yizhao/anaconda3/envs/python27/lib/python2.7/site-packages/pip')
----------MXNet Info-----------
An error occured trying to import mxnet.
This is very likely due to missing missing or incompatible library files.
Traceback (most recent call last):
File "diagnose.py", line 103, in check_mxnet
import mxnet
File "/home/yizhao/Code/mxnet-dev/python/mxnet/__init__.py", line 24, in <module>
from .context import Context, current_context, cpu, gpu, cpu_pinned
File "/home/yizhao/Code/mxnet-dev/python/mxnet/context.py", line 24, in <module>
from .base import classproperty, with_metaclass, _MXClassPropertyMetaClass
File "/home/yizhao/Code/mxnet-dev/python/mxnet/base.py", line 213, in <module>
_LIB = _load_lib()
File "/home/yizhao/Code/mxnet-dev/python/mxnet/base.py", line 203, in _load_lib
lib_path = libinfo.find_lib_path()
File "/home/yizhao/Code/mxnet-dev/python/mxnet/libinfo.py", line 74, in find_lib_path
'List of candidates:\n' + str('\n'.join(dll_path)))
RuntimeError: Cannot find the MXNet library.
List of candidates:
libmxnet.so
/home/yizhao/Code/mxnet/3rdparty/mkldnn/external/mklml_lnx_2019.0.5.20190502/lib/libmxnet.so
/home/yizhao/Code/mxnet_pop/3rdparty/mkldnn/build/install/lib/libmxnet.so
/usr/lib/cuda/lib64/libmxnet.so
/home/yizhao/Code/mxnet-dev/python/mxnet/libmxnet.so
/home/yizhao/Code/mxnet-dev/python/mxnet/../../lib/libmxnet.so
/home/yizhao/Code/mxnet-dev/python/mxnet/../../build/libmxnet.so
../../../libmxnet.so
----------System Info----------
('Platform :', 'Linux-4.18.0-21-generic-x86_64-with-debian-buster-sid')
('system :', 'Linux')
('node :', 'pop-os')
('release :', '4.18.0-21-generic')
('version :', '#22-Ubuntu SMP Wed May 15 13:13:21 UTC 2019')
----------Hardware Info----------
('machine :', 'x86_64')
('processor :', 'x86_64')
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 12
On-line CPU(s) list: 0-11
Thread(s) per core: 2
Core(s) per socket: 6
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 158
Model name: Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
Stepping: 10
CPU MHz: 3700.339
CPU max MHz: 4100.0000
CPU min MHz: 800.0000
BogoMIPS: 4416.00
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 9216K
NUMA node0 CPU(s): 0-11
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d
----------Network Test----------
Setting timeout: 10
Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0048 sec, LOAD: 1.6375 sec.
Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0068 sec, LOAD: 5.5277 sec.
Timing for FashionMNIST: https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.1981 sec, LOAD: 2.0450 sec.
Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.1868 sec, LOAD: 1.2388 sec.
Timing for Gluon Tutorial(en): http://gluon.mxnet.io, DNS: 0.6114 sec, LOAD: 1.8466 sec.
Timing for Gluon Tutorial(cn): https://zh.gluon.ai, DNS: 1.0332 sec, LOAD: 1.8352 sec.
My config.mk:
# whether compile with options for MXNet developer
DEV = 0
# whether compile with debug
DEBUG = 1
# whether to turn on segfault signal handler to log the stack trace
USE_SIGNAL_HANDLER = 1
USE_PROFILER = 1
# the additional link flags you want to add
ADD_LDFLAGS =
# the additional compile flags you want to add
ADD_CFLAGS =
#---------------------------------------------
# matrix computation libraries for CPU/GPU
#---------------------------------------------
# whether use CUDA during compile
USE_CUDA = 1
# add the path to CUDA library to link and compile flag
# if you have already add them to environment variable, leave it as NONE
# USE_CUDA_PATH = /usr/local/cuda
USE_CUDA_PATH = /usr/lib/cuda
# whether to enable CUDA runtime compilation
ENABLE_CUDA_RTC = 1
# whether use CuDNN R3 library
USE_CUDNN = 1
The output of make:
Makefile:219: "USE_LAPACK disabled because libraries were not found"
Makefile:345: WARNING: Significant performance increases can be achieved by installing and enabling gperftools or jemalloc development packages
INFO: nvcc was not found on your path
INFO: Using /usr/lib/cuda/bin/nvcc as nvcc path
Running CUDA_ARCH: -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=[sm_70,compute_70] --fatbin-options -compress-all
cd /home/yizhao/Code/mxnet-dev/3rdparty/dmlc-core; make libdmlc.a USE_SSE=1 config=/home/yizhao/Code/mxnet-dev/config.mk; cd /home/yizhao/Code/mxnet-dev
make[1]: Entering directory '/home/yizhao/Code/mxnet-dev/3rdparty/dmlc-core'
make[1]: 'libdmlc.a' is up to date.
make[1]: Leaving directory '/home/yizhao/Code/mxnet-dev/3rdparty/dmlc-core'
g++ -DMSHADOW_FORCE_STREAM -Wall -Wsign-compare -g -O0 -D_GLIBCXX_ASSERTIONS -I/home/yizhao/Code/mxnet-dev/3rdparty/mshadow/ -I/home/yizhao/Code/mxnet-dev/3rdparty/dmlc-core/include -fPIC -I/home/yizhao/Code/mxnet-dev/3rdparty/tvm/nnvm/include -I/home/yizhao/Code/mxnet-dev/3rdparty/dlpack/include -I/home/yizhao/Code/mxnet-dev/3rdparty/tvm/include -Iinclude -funroll-loops -Wno-unused-parameter -Wno-unknown-pragmas -Wno-unused-local-typedefs -msse3 -mf16c -I/usr/lib/cuda/include -DMSHADOW_USE_CBLAS=1 -DMSHADOW_USE_MKL=0 -DMSHADOW_RABIT_PS=0 -DMSHADOW_DIST_PS=0 -DMSHADOW_USE_PASCAL=0 -DMXNET_USE_SIGNAL_HANDLER=1 -DMXNET_USE_OPENCV=0 -fopenmp -DMXNET_USE_OPERATOR_TUNING=1 -DMSHADOW_INT64_TENSOR_SIZE=0 -DMSHADOW_USE_CUDNN=1 -I/home/yizhao/Code/mxnet-dev/3rdparty/nvidia_cub -DMXNET_ENABLE_CUDA_RTC=1 -DMXNET_USE_NCCL=0 -DMXNET_USE_LIBJPEG_TURBO=0 -shared -o lib/libmxnet.so build/src/operator/nn/mkldnn/mkldnn_pooling.o build/src/operator/nn/mkldnn/mkldnn_convolution.o build/src/operator/nn/mkldnn/mkldnn_concat.o build/src/operator/nn/mkldnn/mkldnn_base.o build/src/operator/nn/mkldnn/mkldnn_slice.o build/src/operator/nn/mkldnn/mkldnn_reshape.o build/src/operator/nn/mkldnn/mkldnn_act.o build/src/operator/nn/mkldnn/mkldnn_softmax.o build/src/operator/nn/mkldnn/mkldnn_deconvolution.o build/src/operator/nn/mkldnn/mkldnn_copy.o build/src/operator/nn/mkldnn/mkldnn_softmax_output.o build/src/operator/nn/mkldnn/mkldnn_fully_connected.o build/src/operator/nn/mkldnn/mkldnn_transpose.o build/src/operator/nn/mkldnn/mkldnn_sum.o build/src/operator/nn/cudnn/cudnn_algoreg.o build/src/operator/nn/cudnn/cudnn_batch_norm.o build/src/operator/quantization/mkldnn/mkldnn_quantized_elemwise_add.o build/src/operator/quantization/mkldnn/mkldnn_quantized_conv.o build/src/operator/quantization/mkldnn/mkldnn_quantized_act.o build/src/operator/quantization/mkldnn/mkldnn_quantized_fully_connected.o build/src/operator/quantization/mkldnn/mkldnn_quantized_pooling.o 
build/src/operator/quantization/mkldnn/mkldnn_quantized_concat.o build/src/operator/subgraph/mkldnn/mkldnn_subgraph_property.o build/src/operator/subgraph/mkldnn/mkldnn_conv.o build/src/operator/subgraph/mkldnn/mkldnn_fc.o build/src/operator/subgraph/tensorrt/tensorrt.o build/src/operator/subgraph/tensorrt/onnx_to_tensorrt.o build/src/operator/subgraph/tensorrt/nnvm_to_onnx.o build/src/operator/nnpack/nnpack_util.o build/src/operator/custom/native_op.o build/src/operator/custom/ndarray_op.o build/src/operator/custom/custom.o build/src/operator/image/crop.o build/src/operator/image/image_random.o build/src/operator/image/resize.o build/src/operator/contrib/multibox_target.o build/src/operator/contrib/dgl_graph.o build/src/operator/contrib/count_sketch.o build/src/operator/contrib/nnz.o build/src/operator/contrib/gradient_multiplier_op.o build/src/operator/contrib/adamw.o build/src/operator/contrib/optimizer_op.o build/src/operator/contrib/bilinear_resize.o build/src/operator/contrib/multibox_detection.o build/src/operator/contrib/roi_align.o build/src/operator/contrib/deformable_psroi_pooling.o build/src/operator/contrib/fft.o build/src/operator/contrib/multibox_prior.o build/src/operator/contrib/hawkes_ll.o build/src/operator/contrib/quadratic_op.o build/src/operator/contrib/transformer.o build/src/operator/contrib/all_finite.o build/src/operator/contrib/index_array.o build/src/operator/contrib/multi_proposal.o build/src/operator/contrib/index_copy.o build/src/operator/contrib/krprod.o build/src/operator/contrib/bounding_box.o build/src/operator/contrib/rpn_inv_normalize_op.o build/src/operator/contrib/proposal.o build/src/operator/contrib/amp_graph_pass.o build/src/operator/contrib/boolean_mask.o build/src/operator/contrib/psroi_pooling.o build/src/operator/contrib/deformable_convolution.o build/src/operator/contrib/ifft.o build/src/operator/contrib/sync_batch_norm.o build/src/operator/contrib/adaptive_avg_pooling.o 
build/src/operator/random/sample_multinomial_op.o build/src/operator/random/multisample_op.o build/src/operator/random/unique_sample_op.o build/src/operator/random/sample_op.o build/src/operator/random/shuffle_op.o build/src/operator/tensor/elemwise_binary_broadcast_op_extended.o build/src/operator/tensor/square_sum.o build/src/operator/tensor/elemwise_binary_op_basic.o build/src/operator/tensor/dot.o build/src/operator/tensor/init_op.o build/src/operator/tensor/elemwise_sum.o build/src/operator/tensor/la_op.o build/src/operator/tensor/histogram.o build/src/operator/tensor/broadcast_reduce_op_index.o build/src/operator/tensor/elemwise_binary_op.o build/src/operator/tensor/elemwise_binary_scalar_op_basic.o build/src/operator/tensor/elemwise_scatter_op.o build/src/operator/tensor/elemwise_binary_scalar_op_extended.o build/src/operator/tensor/elemwise_binary_broadcast_op_basic.o build/src/operator/tensor/elemwise_unary_op_basic.o build/src/operator/tensor/sparse_retain.o build/src/operator/tensor/amp_cast.o build/src/operator/tensor/ordering_op.o build/src/operator/tensor/indexing_op.o build/src/operator/tensor/elemwise_binary_broadcast_op_logic.o build/src/operator/tensor/broadcast_reduce_op_value.o build/src/operator/tensor/elemwise_binary_op_logic.o build/src/operator/tensor/control_flow_op.o build/src/operator/tensor/elemwise_binary_op_extended.o build/src/operator/tensor/matrix_op.o build/src/operator/tensor/diag_op.o build/src/operator/tensor/ravel.o build/src/operator/tensor/elemwise_binary_scalar_op_logic.o build/src/operator/tensor/cast_storage.o build/src/operator/tensor/elemwise_unary_op_trig.o build/src/operator/nn/moments.o build/src/operator/nn/pooling.o build/src/operator/nn/deconvolution.o build/src/operator/nn/activation.o build/src/operator/nn/upsampling.o build/src/operator/nn/ctc_loss.o build/src/operator/nn/fully_connected.o build/src/operator/nn/convolution.o build/src/operator/nn/softmax.o build/src/operator/nn/lrn.o 
build/src/operator/nn/layer_norm.o build/src/operator/nn/concat.o build/src/operator/nn/softmax_activation.o build/src/operator/nn/batch_norm.o build/src/operator/nn/dropout.o build/src/operator/quantization/quantized_elemwise_add.o build/src/operator/quantization/dequantize.o build/src/operator/quantization/quantized_conv.o build/src/operator/quantization/quantize_graph_pass.o build/src/operator/quantization/quantized_flatten.o build/src/operator/quantization/quantized_fully_connected.o build/src/operator/quantization/quantized_pooling.o build/src/operator/quantization/quantize_v2.o build/src/operator/quantization/quantized_concat.o build/src/operator/quantization/requantize.o build/src/operator/quantization/quantized_activation.o build/src/operator/quantization/quantize.o build/src/operator/subgraph/build_subgraph.o build/src/operator/subgraph/default_subgraph_property_v2.o build/src/operator/subgraph/default_subgraph_property.o build/src/executor/inplace_addto_detect_pass.o build/src/executor/infer_graph_attr_pass.o build/src/executor/graph_executor.o build/src/executor/attach_op_execs_pass.o build/src/executor/attach_op_resource_pass.o build/src/io/image_aug_default.o build/src/io/io.o build/src/io/iter_csv.o build/src/io/iter_image_det_recordio.o build/src/io/image_io.o build/src/io/image_det_aug_default.o build/src/io/iter_image_recordio.o build/src/io/iter_mnist.o build/src/io/iter_image_recordio_2.o build/src/io/iter_libsvm.o build/src/common/utils.o build/src/common/rtc.o build/src/nnvm/gradient.o build/src/nnvm/legacy_op_util.o build/src/nnvm/tvm_bridge.o build/src/nnvm/graph_editor.o build/src/nnvm/legacy_json_util.o build/src/nnvm/plan_memory.o build/src/imperative/cached_op.o build/src/imperative/imperative_utils.o build/src/imperative/imperative.o build/src/ndarray/ndarray_function.o build/src/ndarray/ndarray.o build/src/operator/instance_norm.o build/src/operator/subgraph_op_common.o build/src/operator/grid_generator.o build/src/operator/leaky_relu.o 
build/src/operator/operator_tune.o build/src/operator/rnn.o build/src/operator/crop.o build/src/operator/spatial_transformer.o build/src/operator/convolution_v1.o build/src/operator/regression_output.o build/src/operator/pad.o build/src/operator/bilinear_sampler.o build/src/operator/loss_binary_op.o build/src/operator/svm_output.o build/src/operator/softmax_output.o build/src/operator/roi_pooling.o build/src/operator/batch_norm_v1.o build/src/operator/cross_device_copy.o build/src/operator/swapaxis.o build/src/operator/l2_normalization.o build/src/operator/sequence_reverse.o build/src/operator/c_lapack_api.o build/src/operator/correlation.o build/src/operator/identity_attach_KL_sparse_reg.o build/src/operator/make_loss.o build/src/operator/operator.o build/src/operator/optimizer_op.o build/src/operator/slice_channel.o build/src/operator/sequence_last.o build/src/operator/pooling_v1.o build/src/operator/sequence_mask.o build/src/operator/control_flow.o build/src/operator/operator_util.o build/src/engine/naive_engine.o build/src/engine/openmp.o build/src/engine/threaded_engine_pooled.o build/src/engine/engine.o build/src/engine/threaded_engine.o build/src/engine/threaded_engine_perdevice.o build/src/storage/storage.o build/src/c_api/c_api_symbolic.o build/src/c_api/c_api_profile.o build/src/c_api/c_api_ndarray.o build/src/c_api/c_api_test.o build/src/c_api/c_api_executor.o build/src/c_api/c_predict_api.o build/src/c_api/c_api_function.o build/src/c_api/c_api.o build/src/c_api/c_api_error.o build/src/profiler/profiler.o build/src/profiler/aggregate_stats.o build/src/profiler/nvtx.o build/src/profiler/vtune.o build/src/kvstore/gradient_compression.o build/src/kvstore/kvstore_utils.o build/src/kvstore/kvstore.o build/src/resource.o build/src/libinfo.o build/src/initialize.o /home/yizhao/Code/mxnet-dev/3rdparty/dmlc-core/libdmlc.a build/src/operator/nn/cudnn/cudnn_batch_norm_gpu.o build/src/operator/subgraph/tensorrt/tensorrt_gpu.o 
build/src/operator/custom/native_op_gpu.o build/src/operator/image/resize_gpu.o build/src/operator/image/image_random_gpu.o build/src/operator/contrib/rpn_inv_normalize_op_gpu.o build/src/operator/contrib/bilinear_resize_gpu.o build/src/operator/contrib/optimizer_op_gpu.o build/src/operator/contrib/deformable_psroi_pooling_gpu.o build/src/operator/contrib/boolean_mask_gpu.o build/src/operator/contrib/psroi_pooling_gpu.o build/src/operator/contrib/ifft_gpu.o build/src/operator/contrib/multibox_detection_gpu.o build/src/operator/contrib/adaptive_avg_pooling_gpu.o build/src/operator/contrib/multibox_target_gpu.o build/src/operator/contrib/proposal_gpu.o build/src/operator/contrib/index_array_gpu.o build/src/operator/contrib/count_sketch_gpu.o build/src/operator/contrib/gradient_multiplier_op_gpu.o build/src/operator/contrib/bounding_box_gpu.o build/src/operator/contrib/sync_batch_norm_gpu.o build/src/operator/contrib/dgl_graph_gpu.o build/src/operator/contrib/hawkes_ll_gpu.o build/src/operator/contrib/fft_gpu.o build/src/operator/contrib/multibox_prior_gpu.o build/src/operator/contrib/adamw_gpu.o build/src/operator/contrib/quadratic_op_gpu.o build/src/operator/contrib/transformer_gpu.o build/src/operator/contrib/all_finite_gpu.o build/src/operator/contrib/index_copy_gpu.o build/src/operator/contrib/deformable_convolution_gpu.o build/src/operator/contrib/roi_align_gpu.o build/src/operator/contrib/multi_proposal_gpu.o build/src/operator/random/shuffle_op_gpu.o build/src/operator/random/sample_multinomial_op_gpu.o build/src/operator/random/multisample_op_gpu.o build/src/operator/random/sample_op_gpu.o build/src/operator/tensor/indexing_op_gpu.o build/src/operator/tensor/elemwise_binary_scalar_op_basic_gpu.o build/src/operator/tensor/amp_cast_gpu.o build/src/operator/tensor/elemwise_binary_scalar_op_extended_gpu.o build/src/operator/tensor/ordering_op_gpu.o build/src/operator/tensor/matrix_op_gpu.o build/src/operator/tensor/elemwise_unary_op_trig_gpu.o 
build/src/operator/tensor/control_flow_op_gpu.o build/src/operator/tensor/elemwise_binary_broadcast_op_basic_gpu.o build/src/operator/tensor/elemwise_binary_op_extended_gpu.o build/src/operator/tensor/elemwise_sum_gpu.o build/src/operator/tensor/init_op_gpu.o build/src/operator/tensor/cast_storage_gpu.o build/src/operator/tensor/histogram_gpu.o build/src/operator/tensor/broadcast_reduce_op_index_gpu.o build/src/operator/tensor/dot_gpu.o build/src/operator/tensor/elemwise_binary_scalar_op_logic_gpu.o build/src/operator/tensor/elemwise_unary_op_basic_gpu.o build/src/operator/tensor/ravel_gpu.o build/src/operator/tensor/broadcast_reduce_op_value_gpu.o build/src/operator/tensor/elemwise_binary_op_basic_gpu.o build/src/operator/tensor/elemwise_binary_broadcast_op_extended_gpu.o build/src/operator/tensor/elemwise_scatter_op_gpu.o build/src/operator/tensor/square_sum_gpu.o build/src/operator/tensor/elemwise_binary_broadcast_op_logic_gpu.o build/src/operator/tensor/la_op_gpu.o build/src/operator/tensor/elemwise_binary_op_logic_gpu.o build/src/operator/tensor/diag_op_gpu.o build/src/operator/tensor/sparse_retain_gpu.o build/src/operator/nn/dropout_gpu.o build/src/operator/nn/fully_connected_gpu.o build/src/operator/nn/softmax_activation_gpu.o build/src/operator/nn/lrn_gpu.o build/src/operator/nn/moments_gpu.o build/src/operator/nn/pooling_gpu.o build/src/operator/nn/softmax_gpu.o build/src/operator/nn/deconvolution_gpu.o build/src/operator/nn/activation_gpu.o build/src/operator/nn/ctc_loss_gpu.o build/src/operator/nn/convolution_gpu.o build/src/operator/nn/upsampling_gpu.o build/src/operator/nn/batch_norm_gpu.o build/src/operator/nn/layer_norm_gpu.o build/src/operator/nn/concat_gpu.o build/src/operator/quantization/requantize_gpu.o build/src/operator/quantization/quantize_gpu.o build/src/operator/quantization/dequantize_gpu.o build/src/operator/quantization/quantized_conv_gpu.o build/src/operator/quantization/quantized_flatten_gpu.o 
build/src/operator/quantization/quantized_fully_connected_gpu.o build/src/operator/quantization/quantized_pooling_gpu.o build/src/operator/quantization/quantize_v2_gpu.o build/src/common/utils_gpu.o build/src/common/random_generator_gpu.o build/src/ndarray/ndarray_function_gpu.o build/src/operator/optimizer_op_gpu.o build/src/operator/slice_channel_gpu.o build/src/operator/instance_norm_gpu.o build/src/operator/pad_gpu.o build/src/operator/correlation_gpu.o build/src/operator/make_loss_gpu.o build/src/operator/grid_generator_gpu.o build/src/operator/convolution_v1_gpu.o build/src/operator/softmax_output_gpu.o build/src/operator/rnn_gpu.o build/src/operator/crop_gpu.o build/src/operator/sequence_reverse_gpu.o build/src/operator/identity_attach_KL_sparse_reg_gpu.o build/src/operator/leaky_relu_gpu.o build/src/operator/swapaxis_gpu.o build/src/operator/sequence_mask_gpu.o build/src/operator/bilinear_sampler_gpu.o build/src/operator/spatial_transformer_gpu.o build/src/operator/pooling_v1_gpu.o build/src/operator/loss_binary_op_gpu.o build/src/operator/roi_pooling_gpu.o build/src/operator/batch_norm_v1_gpu.o build/src/operator/svm_output_gpu.o build/src/operator/regression_output_gpu.o build/src/operator/l2_normalization_gpu.o build/src/operator/sequence_last_gpu.o build/src/kvstore/gradient_compression_gpu.o build/src/kvstore/kvstore_utils_gpu.o -pthread -lm -lcudart -lcublas -lcurand -lcusolver -L/usr/lib/cuda/lib64 -L/usr/lib/cuda/lib -lopenblas -fopenmp -lrt -lcudnn -lcufft -lcuda -lnvrtc -L/usr/local/cuda/lib64/stubs \
-Wl,--whole-archive /home/yizhao/Code/mxnet-dev/3rdparty/tvm/nnvm/lib/libnnvm.a -Wl,--no-whole-archive
/usr/bin/ld: /usr/lib/gcc/x86_64-linux-gnu/4.8/../../../x86_64-linux-gnu/crti.o: in function `_init':
(.init+0x7): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against undefined symbol `__gmon_start__'
/usr/bin/ld: /home/yizhao/Code/mxnet-dev/3rdparty/dmlc-core/libdmlc.a(io.o): in function `std::basic_istringstream<char, std::char_traits<char>, std::allocator<char> >::basic_istringstream(std::string const&, std::_Ios_Openmode) [clone .constprop.213]':
io.cc:(.text+0x23): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `vtable for std::basic_ios<char, std::char_traits<char> >@@GLIBCXX_3.4' defined in .data.rel.ro section in /usr/lib/gcc/x86_64-linux-gnu/4.8/libstdc++.so
/usr/bin/ld: io.cc:(.text+0x7c): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `VTT for std::basic_istringstream<char, std::char_traits<char>, std::allocator<char> >@@GLIBCXX_3.4' defined in .data.rel.ro section in /usr/lib/gcc/x86_64-linux-gnu/4.8/libstdc++.so
/usr/bin/ld: io.cc:(.text+0xa7): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `vtable for std::basic_istringstream<char, std::char_traits<char>, std::allocator<char> >@@GLIBCXX_3.4' defined in .data.rel.ro section in /usr/lib/gcc/x86_64-linux-gnu/4.8/libstdc++.so
/usr/bin/ld: io.cc:(.text+0xee): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `vtable for std::basic_streambuf<char, std::char_traits<char> >@@GLIBCXX_3.4' defined in .data.rel.ro section in /usr/lib/gcc/x86_64-linux-gnu/4.8/libstdc++.so
/usr/bin/ld: io.cc:(.text+0x114): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `vtable for std::basic_stringbuf<char, std::char_traits<char>, std::allocator<char> >@@GLIBCXX_3.4' defined in .data.rel.ro section in /usr/lib/gcc/x86_64-linux-gnu/4.8/libstdc++.so
/usr/bin/ld: io.cc:(.text+0x173): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `vtable for std::basic_ios<char, std::char_traits<char> >@@GLIBCXX_3.4' defined in .data.rel.ro section in /usr/lib/gcc/x86_64-linux-gnu/4.8/libstdc++.so
/usr/bin/ld: io.cc:(.text+0x1cb): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `vtable for std::basic_streambuf<char, std::char_traits<char> >@@GLIBCXX_3.4' defined in .data.rel.ro section in /usr/lib/gcc/x86_64-linux-gnu/4.8/libstdc++.so
/usr/bin/ld: /home/yizhao/Code/mxnet-dev/3rdparty/dmlc-core/libdmlc.a(io.o): in function `dmlc::io::FileSystem::GetInstance(dmlc::io::URI const&)':
io.cc:(.text+0x22d): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `guard variable for dmlc::io::LocalFileSystem::GetInstance()::instance' defined in .bss._ZGVZN4dmlc2io15LocalFileSystem11GetInstanceEvE8instance[_ZGVZN4dmlc2io15LocalFileSystem11GetInstanceEvE8instance] section in /home/yizhao/Code/mxnet-dev/3rdparty/dmlc-core/libdmlc.a(io.o)
/usr/bin/ld: io.cc:(.text+0x23d): relocation truncated to fit: R_X86_64_REX_GOTPCRELX against symbol `dmlc::io::LocalFileSystem::GetInstance()::instance' defined in .bss._ZZN4dmlc2io15LocalFileSystem11GetInstanceEvE8instance[_ZZN4dmlc2io15LocalFileSystem11GetInstanceEvE8instance] section in /home/yizhao/Code/mxnet-dev/3rdparty/dmlc-core/libdmlc.a(io.o)
/usr/bin/ld: io.cc:(.text+0x34b): additional relocation overflows omitted from the output
lib/libmxnet.so: PC-relative offset overflow in PLT entry for `_ZNSt10_Iter_baseIPaLb0EE7_S_baseES0_'
collect2: error: ld returned 1 exit status
make: *** [Makefile:572: lib/libmxnet.so] Error 1
Having the same issue.
A fix PR has been merged. Could you please try to verify if it fixes your problem?
@yuxihu Still not fixed.
Try replacing
ar crv $@ $(filter %.o, $?)
with
ar Scrv $@ $(filter %.o, $?)
in the Makefile. It worked for me.
The root cause may be the 4GB limit on static libraries generated by ar. See the "64-bit variant" chapter in the wiki.
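To see whether a static library is anywhere near that limit, its size can be compared against 4 GiB. The following is a minimal sketch in Python, not part of the MXNet build: the helper names and the lib/libmxnet.a path are assumptions for illustration, and the threshold rests on the assumption that the classic ar symbol table stores member offsets as 32-bit values. In GNU ar, the S modifier used above skips writing that symbol table.

```python
import os

# Assumption: the classic (non-64-bit) ar symbol table stores member offsets
# as 32-bit values, so offsets at or beyond 4 GiB cannot be represented.
AR_32BIT_OFFSET_LIMIT = 2 ** 32


def exceeds_ar_offset_limit(size_in_bytes, limit=AR_32BIT_OFFSET_LIMIT):
    """Return True if an archive of this size reaches or passes the limit."""
    return size_in_bytes >= limit


def check_archive(path):
    """Return (size in bytes, whether the size reaches the 4 GiB limit)."""
    size = os.path.getsize(path)
    return size, exceeds_ar_offset_limit(size)


if __name__ == "__main__":
    # Hypothetical location of the static library; adjust to your build tree.
    archive = os.path.join("lib", "libmxnet.a")
    if os.path.exists(archive):
        size, too_big = check_archive(archive)
        note = " (at or over 4 GiB)" if too_big else ""
        print("%s: %.2f GiB%s" % (archive, size / 2.0 ** 30, note))
    else:
        print("no archive found at %s" % archive)
```

A size well under the limit would suggest the overflow comes from somewhere else, such as the size of the final shared library.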
@hzfan let me try that. Thanks for the suggestion!
@access2rohit Regarding ar and building the more recent version on CI: https://github.com/apache/incubator-mxnet/blob/cab1dfad37f044d691e7c4ea81d73463cfcf0c8d/ci/docker/install/ubuntu_ar.sh#L35 should get an extra --enable-64-bit-archive.
See https://sourceware.org/bugzilla/show_bug.cgi?id=14625. Based on the associated patch, it seems there's no runtime switch: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blobdiff;f=bfd/archive.c;h=6fc5f1d80f9ef8be456c45b35bcea42ff3436086;hp=53e295eb26c0b66741803400e92df496b096b527;hb=e6cc316af931911da20249e19f9342e5cf8aeeff;hpb=b95a0a3177bcf797c8f5ad6a7d276fb6275352b7
However, in my local tests that didn't fix the issue when using cmake to build. (Likely cmake doesn't pick up the updated ar.)
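One way to check which ar a cmake build actually uses is to look at the CMAKE_AR entry in the build directory's CMakeCache.txt. The following is a rough sketch in Python: the helper name and the build/ directory are assumptions, and it only reads the cache, it does not change which ar is used.

```python
import os


def read_cmake_cache_entry(cache_text, key):
    """Return the value of `key` from CMakeCache.txt contents, or None."""
    for line in cache_text.splitlines():
        line = line.strip()
        # Skip blank lines and the two comment styles used in the cache file.
        if not line or line.startswith(("#", "//")):
            continue
        name_and_type, sep, value = line.partition("=")
        if not sep:
            continue
        # Entries look like NAME:TYPE=VALUE; drop the :TYPE part.
        name = name_and_type.split(":", 1)[0]
        if name == key:
            return value
    return None


if __name__ == "__main__":
    # Assumed build directory; adjust to wherever cmake was run.
    cache_path = os.path.join("build", "CMakeCache.txt")
    if os.path.exists(cache_path):
        with open(cache_path) as cache:
            print(read_cmake_cache_entry(cache.read(), "CMAKE_AR"))
    else:
        print("no CMakeCache.txt found at %s" % cache_path)
```

If the printed path is the system ar rather than the rebuilt one, that would be consistent with the guess above.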
The same problem for me:
[...] overflows omitted from the output
libmxnet.so: PC-relative offset overflow in PLT entry for `_ZN5mxnet2op8mxnet_op6KernelINS0_9pick_gradILi3ELb0EEEN7mshadow3gpuEE6LaunchIJPdS9_PfiiNS5_5ShapeILi3EEESC_EEEvPNS5_6StreamIS6EEiDpT'
collect2: error: ld returned 1 exit status
ninja: build stopped: subcommand failed.
You should be able to avoid this issue by building for a single CUDA architecture. Look into specifying -DMXNET_CUDA_ARCH=7.0 for the cmake build, etc.
@leezu Following your advice, the size of the generated .so is reduced by 3/4, thanks.
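For the Make build, the make output earlier in this thread shows the default CUDA_ARCH list covering seven architectures, so device code is generated once per listed architecture. The sketch below, in Python, only illustrates how such a -gencode string is composed in the format seen in that log. The helper name is hypothetical, and how to pass a narrowed list to your build (for example through a CUDA_ARCH setting in config.mk) is an assumption to verify against your Makefile.

```python
def gencode_flags(archs):
    """Build nvcc -gencode flags for compute capabilities such as ["70"].

    Follows the format in the make log: machine code (sm_XX) for every
    listed architecture, plus PTX (compute_XX) for the last one.
    """
    if not archs:
        raise ValueError("at least one architecture is required")
    flags = []
    for arch in archs[:-1]:
        flags.append("-gencode arch=compute_%s,code=sm_%s" % (arch, arch))
    last = archs[-1]
    flags.append(
        "-gencode arch=compute_%s,code=[sm_%s,compute_%s]" % (last, last, last)
    )
    return " ".join(flags)


if __name__ == "__main__":
    # The seven architectures listed in the make log above.
    print(gencode_flags(["30", "35", "50", "52", "60", "61", "70"]))
    # A single architecture, matching -DMXNET_CUDA_ARCH=7.0 in the cmake build.
    print(gencode_flags(["70"]))
```

Pick the architecture that matches the GPU you actually run on; a library built only for 7.0 is not expected to run on older GPUs.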
Description
The MXNet master build for CUDA fails with DEBUG=1.
Environment info (Required)
AWS Base DLAMI (ubuntu 16.04) on a p2.8xlarge
Package used (Python/R/Scala/Julia): Python
Build info (Required if built from source)
Compiler (gcc/clang/mingw/visual studio): gcc 5.4 g++ 5.4
MXNet commit hash (output of git rev-parse HEAD): 0af40f7afe44464a7fbd4ac092c4377b69c56918
Build config (content of config.mk, or the build command): USE_CUDA=1 USE_CUDNN=1 USE_LAPACK=1 USE_BLAS = openblas USE_OPENCV=1 USE_CUDA_PATH = /usr/local/cuda DEBUG=1
/usr/local/cuda -> /usr/local/cuda-9.0