apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, JavaScript and more
https://mxnet.apache.org
Apache License 2.0

Building errors after cmake #21085

Open · wilhem opened this issue 2 years ago

wilhem commented 2 years ago

Description

Trying to build MXNet for Ubuntu as explained here: https://mxnet.apache.org/versions/1.9.1/get_started/ubuntu_setup.html, I get the errors below. (Fehler is the German word for error; the build log comes from a German-locale system.)
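For context, the steps I ran from that guide are roughly the ones below. This is a sketch reconstructed from the guide and from the paths in the log; the exact clone URL and config template name are assumptions on my part:

```
# Sketch of the build flow from the Ubuntu setup guide (not verbatim).
cd ~/Downloads
git clone --recursive https://github.com/apache/incubator-mxnet.git mxnet
cd mxnet
cp config/linux.cmake config.cmake   # template name is an assumption; I edited the copy (file shown below)
mkdir -p build && cd build
cmake ..                             # the configure step finishes without errors
cmake --build . --parallel 8         # compilation then fails with the errors below
```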

Error Message

```
/home/gigi/Downloads/mxnet/src/operator/contrib/bilinear_resize-inl.cuh(228): warning: variable "i_numel" was declared but never referenced
          detected during:
            instantiation of "void mxnet::op::SpatialUpSamplingBilinearUpdateOutput<xpu,DType,AccReal>(mshadow::Stream *, const std::vector<mxnet::TBlob, std::allocator> &, const std::vector<mxnet::TBlob, std::allocator> &, __nv_bool) [with xpu=mxnet::gpu, DType=float, AccReal=float]"
/home/davide/Downloads/mxnet/src/operator/contrib/bilinear_resize-inl.h(217): here
            instantiation of "void mxnet::op::BilinearSampleOpForward(const nnvm::NodeAttrs &, const mxnet::OpContext &, const std::vector<mxnet::TBlob, std::allocator> &, const std::vector<mxnet::OpReqType, std::allocator> &, const std::vector<mxnet::TBlob, std::allocator> &) [with xpu=mxnet::gpu]"
/home/gigi/Downloads/mxnet/src/operator/contrib/bilinear_resize.cu(260): here

/home/gigi/Downloads/mxnet/src/operator/contrib/../../ndarray/../operator/mshadow_op.h(880): error: expected a "("

/home/gigi/Downloads/mxnet/src/operator/contrib/../../ndarray/../operator/mshadow_op.h(880): error: expected a "("

/home/gigi/Downloads/mxnet/src/operator/./mshadow_op.h(880): error: expected a "("

1 error detected in the compilation of "/tmp/tmpxft_00009e98_00000000-4_all_finite.cpp4.ii".
/home/gigi/Downloads/mxnet/src/operator/contrib/./../mshadow_op.h(880): error: expected a "("

make[2]: *** [CMakeFiles/mxnet.dir/build.make:6641: CMakeFiles/mxnet.dir/src/operator/all_finite.cu.o] Fehler 2
make[2]: Auf noch nicht beendete Prozesse wird gewartet …
1 error detected in the compilation of "/tmp/tmpxft_00009ebf_00000000-4_bilinear_resize.cpp4.ii".
make[2]: *** [CMakeFiles/mxnet.dir/build.make:6719: CMakeFiles/mxnet.dir/src/operator/contrib/bilinear_resize.cu.o] Fehler 2
1 error detected in the compilation of "/tmp/tmpxft_00009eb4_00000000-4_adaptive_avg_pooling.cpp4.ii".
make[2]: *** [CMakeFiles/mxnet.dir/build.make:6693: CMakeFiles/mxnet.dir/src/operator/contrib/adaptive_avg_pooling.cu.o] Fehler 2
/home/gigi/Downloads/mxnet/src/operator/contrib/./../mshadow_op.h(880): error: expected a "("

/home/gigi/Downloads/mxnet/src/operator/contrib/./../mshadow_op.h(880): error: expected a "("

1 error detected in the compilation of "/tmp/tmpxft_00009eb6_00000000-4_allclose_op.cpp4.ii".
make[2]: *** [CMakeFiles/mxnet.dir/build.make:6706: CMakeFiles/mxnet.dir/src/operator/contrib/allclose_op.cu.o] Fehler 2
1 error detected in the compilation of "/tmp/tmpxft_00009e96_00000000-4_adabelief.cpp4.ii".
make[2]: *** [CMakeFiles/mxnet.dir/build.make:6667: CMakeFiles/mxnet.dir/src/operator/contrib/adabelief.cu.o] Fehler 2
1 error detected in the compilation of "/tmp/tmpxft_00009e9c_00000000-4_adamw.cpp4.ii".
make[2]: *** [CMakeFiles/mxnet.dir/build.make:6680: CMakeFiles/mxnet.dir/src/operator/contrib/adamw.cu.o] Fehler 2
/home/gigi/Downloads/mxnet/src/common/../operator/tensor/./../mshadow_op.h(880): error: expected a "("

1 error detected in the compilation of "/tmp/tmpxft_00009e85_00000000-4_utils.cpp4.ii".
make[2]: *** [CMakeFiles/mxnet.dir/build.make:6589: CMakeFiles/mxnet.dir/src/common/utils.cu.o] Fehler 2
/home/gigi/Downloads/mxnet/src/ndarray/../operator/tensor/./../mshadow_op.h(880): error: expected a "("

1 error detected in the compilation of "/tmp/tmpxft_00009e88_00000000-4_ndarray_function.cpp4.ii".
make[2]: *** [CMakeFiles/mxnet.dir/build.make:6628: CMakeFiles/mxnet.dir/src/ndarray/ndarray_function.cu.o] Fehler 2
make[1]: *** [CMakeFiles/Makefile2:465: CMakeFiles/mxnet.dir/all] Fehler 2
make: *** [Makefile:141: all] Fehler 2
```
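Because the parallel build interleaves output from several compiler jobs, the messages above appear out of order. Re-running with a single job makes the first real error easier to spot; a minimal sketch, using only standard CMake options:

```
# Re-run the failing build with one job so errors are not interleaved.
cmake --build . --parallel 1
```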

Steps to reproduce


  1. Just run `cmake --build . --parallel 8`, where 8 is the number of CPUs on my machine (see the variant sketched right after this step).
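A variant that derives the job count from the machine instead of hard-coding it; a sketch, assuming the build directory implied by the paths above (`nproc` ships with GNU coreutils on Ubuntu):

```
# Same reproduction step, with the job count taken from the number of online CPUs.
cd ~/Downloads/mxnet/build              # assumed build directory
cmake --build . --parallel "$(nproc)"   # nproc prints the number of available CPUs
```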

What have you tried to solve it?

  1. I tried to re-run the build, which fails with the same errors.
  2. By the way, here is my config.cmake file (a note on how it is applied follows the file):
#---------------------------------------------
# GPU support
#---------------------------------------------
set(USE_CUDA OFF CACHE BOOL "Build with CUDA support")
set(USE_CUDNN OFF CACHE BOOL "Build with cudnn support, if found")
set(USE_CUTENSOR OFF CACHE BOOL "Build with cutensor support, if found")

# Target NVIDIA GPU architecture.
# Valid options are "Auto" for autodetection, "All" for all available
# architectures or a list of architectures by compute capability number, such as
# "7.0" or "7.0;7.5" as well as name, such as "Volta" or "Volta;Turing".
# The value specified here is passed to cmake's CUDA_SELECT_NVCC_ARCH_FLAGS to
# obtain the compilation flags for nvcc.
#
# When compiling on a machine without GPU, autodetection will fail and you
# should instead specify the target architecture manually to avoid excessive
# compilation times.
set(MXNET_CUDA_ARCH "Auto" CACHE STRING "Target NVIDIA GPU architecture")

#---------------------------------------------
# Common libraries
#---------------------------------------------
set(USE_OPENCV ON CACHE BOOL "Build with OpenCV support")
set(OPENCV_ROOT "" CACHE PATH "OpenCV install path. Supports autodetection.")

set(USE_OPENMP ON CACHE BOOL "Build with Openmp support")

set(USE_ONEDNN ON CACHE BOOL "Build with oneDNN support")

set(USE_LAPACK ON CACHE BOOL "Build with lapack support")

set(USE_TVM_OP OFF CACHE BOOL "Enable use of TVM operator build system.")

#---------------------
# Compilers
#--------------------
# Compilers are usually autodetected. Uncomment and modify the next 3 lines to
# choose manually:

# set(CMAKE_C_COMPILER "" CACHE BOOL "C compiler")
# set(CMAKE_CXX_COMPILER "" CACHE BOOL "C++ compiler")
# set(CMAKE_CUDA_COMPILER "" CACHE BOOL "Cuda compiler (nvcc)")

#---------------------------------------------
# CPU instruction sets: The support is autodetected if turned ON
#---------------------------------------------
set(USE_SSE ON CACHE BOOL "Build with x86 SSE instruction support")
set(USE_F16C ON CACHE BOOL "Build with x86 F16C instruction support")

#----------------------------
# distributed computing
#----------------------------
set(USE_DIST_KVSTORE OFF CACHE BOOL "Build with DIST_KVSTORE support")

#----------------------------
# performance settings
#----------------------------
set(USE_OPERATOR_TUNING ON CACHE BOOL  "Enable auto-tuning of operators")
set(USE_GPERFTOOLS OFF CACHE BOOL "Build with GPerfTools support")
set(USE_JEMALLOC OFF CACHE BOOL "Build with Jemalloc support")

#----------------------------
# additional operators
#----------------------------
# Path to folders containing project-specific operators that you don't want to
# put in src/operators
set(EXTRA_OPERATORS "" CACHE PATH "EXTRA OPERATORS PATH")

#----------------------------
# other features
#----------------------------
# Create C++ interface package
set(USE_CPP_PACKAGE OFF CACHE BOOL "Build C++ Package")

# Use int64_t type to represent index and the number of elements in a tensor
# This will cause performance degradation reported in issue #14496
# Set to ON for large tensors with size greater than INT32_MAX, i.e. 2147483647
set(USE_INT64_TENSOR_SIZE ON CACHE BOOL "Use int64_t to represent the number of elements in a tensor")

# Other GPU features
set(USE_NCCL OFF CACHE BOOL "Use NVIDIA NCCL with CUDA")
set(NCCL_ROOT "" CACHE PATH "NCCL install path. Supports autodetection.")
set(USE_NVTX ON CACHE BOOL "Build with NVTX support")
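The file sits in the source root and is read when CMake configures the project, so edits only take effect after reconfiguring. My understanding of that flow, as a sketch (the clean rebuild and the `-D` overrides are standard CMake usage, not anything MXNet-specific; paths are from my machine):

```
# Reconfigure after editing config.cmake.
cd ~/Downloads/mxnet
rm -rf build && mkdir build && cd build   # start from a clean build tree
cmake ..                                  # picks up the edited config.cmake
# Single options can also be overridden without editing the file, e.g.:
#   cmake -DUSE_ONEDNN=OFF ..
# or, per the comment in the file, pinning the GPU architecture manually:
#   cmake -DMXNET_CUDA_ARCH="7.5" ..
cmake --build . --parallel 8
```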

Environment

Environment Information

```
/master/tools/diagnose.py | python3
----------Python Info----------
Version      : 3.8.10
Compiler     : GCC 9.4.0
Build        : ('default', 'Mar 15 2022 12:22:08')
Arch         : ('64bit', 'ELF')
------------Pip Info-----------
No corresponding pip install for current python.
----------MXNet Info-----------
No MXNet installed.
----------System Info----------
Platform     : Linux-5.4.0-121-generic-x86_64-with-glibc2.29
system       : Linux
node         : grh1
release      : 5.4.0-121-generic
version      : #137-Ubuntu SMP Wed Jun 15 13:33:07 UTC 2022
----------Hardware Info----------
machine      : x86_64
processor    : x86_64
Architektur:                     x86_64
CPU Operationsmodus:             32-bit, 64-bit
Byte-Reihenfolge:                Little Endian
Adressgrößen:                    43 bits physical, 48 bits virtual
CPU(s):                          16
Liste der Online-CPU(s):         0-15
Thread(s) pro Kern:              2
Kern(e) pro Socket:              8
Sockel:                          1
NUMA-Knoten:                     1
Anbieterkennung:                 AuthenticAMD
Prozessorfamilie:                23
Modell:                          8
Modellname:                      AMD Ryzen 7 2700 Eight-Core Processor
Stepping:                        2
Frequenzanhebung:                aktiviert
CPU MHz:                         1550.850
Maximale Taktfrequenz der CPU:   3200,0000
Minimale Taktfrequenz der CPU:   1550,0000
BogoMIPS:                        6386.98
Virtualisierung:                 AMD-V
L1d Cache:                       256 KiB
L1i Cache:                       512 KiB
L2 Cache:                        4 MiB
L3 Cache:                        16 MiB
NUMA-Knoten0 CPU(s):             0-15
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Mmio stale data:   Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Retpolines, IBPB conditional, STIBP disabled, RSB filling
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
Markierungen:                    fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb hw_pstate sme ssbd sev ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 xsaves clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca
----------Network Test----------
Setting timeout: 10
Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0170 sec, LOAD: 0.4900 sec.
Error open Gluon Tutorial(en): http://gluon.mxnet.io, HTTP Error 404: Not Found, DNS finished in 0.0477757453918457 sec.
Error open Gluon Tutorial(cn): https://zh.gluon.ai, , DNS finished in 0.01242971420288086 sec.
Timing for FashionMNIST: https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.0225 sec, LOAD: 0.6333 sec.
Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0057 sec, LOAD: 0.3582 sec.
Error open Conda: https://repo.continuum.io/pkgs/free/, HTTP Error 403: Forbidden, DNS finished in 0.09440469741821289 sec.
----------Environment----------
```
github-actions[bot] commented 2 years ago

Welcome to Apache MXNet (incubating)! We are on a mission to democratize AI, and we are glad that you are contributing to it by opening this issue. Please make sure to include all the relevant context, and one of the @apache/mxnet-committers will be here shortly. If you are interested in contributing to our project, let us know! Also, be sure to check out our guide on contributing to MXNet and our development guides wiki.

josephevans commented 2 years ago

Hi, thanks for the report. What compiler are you using to build MXNet?
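For reference, the toolchain versions can be captured with standard commands, nothing MXNet-specific; a sketch:

```
# Print the toolchain versions relevant to this build.
gcc --version
g++ --version
cmake --version
nvcc --version 2>/dev/null || echo "nvcc not on PATH"
```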