ROCm / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration
http://pytorch.org
Other

hip/hip_version not found and thrust_ptr not found, bad config from fresh install #626

Open fuag15 opened 4 years ago

fuag15 commented 4 years ago

🐛 Bug

I'm getting errors when building on a Gentoo host, mostly about hip_version and thrust_ptr not being found. It seems like a problem with the hipify scripts, but I'm way out of my league in guessing.

To Reproduce

  1. install required libs
  2. clone repo on master
  3. run amd_build.py helper script
  4. python setup.py install
  5. observe errors around hip_version.h and thrust_ptr.h

` (base) moose@~/code/pytorch (master) ∫ USE_ROCM=1 USE_LMDB=1 USE_OPENCV=1 MAX_JOBS=4 python setup.py install Building wheel torch-1.6.0a0+93432cf -- Building version 1.6.0a0+93432cf cmake --build . --target install --config Release -- -j 4 [0/1] Re-running CMake... -- std::exception_ptr is supported. -- Turning off deprecation warning due to glog. -- Current compiler supports avx2 extension. Will build perfkernels. -- Current compiler supports avx512f extension. Will build fbgemm. -- Building using own protobuf under third_party per request. -- Use custom protobuf build.

-- 3.11.4.0 -- Caffe2 protobuf include directory: $<BUILD_INTERFACE:/home/moose/code/pytorch/third_party/protobuf/src>$ -- Trying to find preferred BLAS backend of choice: MKL -- MKL_THREADING = OMP -- MKL_THREADING = OMP CMake Warning at cmake/Dependencies.cmake:145 (message): MKL could not be found. Defaulting to Eigen Call Stack (most recent call first): CMakeLists.txt:418 (include)

CMake Warning at cmake/Dependencies.cmake:163 (message): Preferred BLAS (MKL) cannot be found, now searching for a general BLAS library Call Stack (most recent call first): CMakeLists.txt:418 (include)

-- MKL_THREADING = OMP -- Checking for [mkl_intel_lp64 - mkl_gnu_thread - mkl_core - gomp - pthread - m - dl] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel_lp64 - mkl_intel_thread - mkl_core - gomp - pthread - m - dl] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_gnu_thread - mkl_core - gomp - pthread - m - dl] -- Library mkl_intel: not found -- Checking for [mkl_intel - mkl_intel_thread - mkl_core - gomp - pthread - m - dl] -- Library mkl_intel: not found -- Checking for [mkl_gf_lp64 - mkl_gnu_thread - mkl_core - gomp - pthread - m - dl] -- Library mkl_gf_lp64: not found -- Checking for [mkl_gf_lp64 - mkl_intel_thread - mkl_core - gomp - pthread - m - dl] -- Library mkl_gf_lp64: not found -- Checking for [mkl_gf - mkl_gnu_thread - mkl_core - gomp - pthread - m - dl] -- Library mkl_gf: not found -- Checking for [mkl_gf - mkl_intel_thread - mkl_core - gomp - pthread - m - dl] -- Library mkl_gf: not found -- Checking for [mkl_intel_lp64 - mkl_gnu_thread - mkl_core - iomp5 - pthread - m - dl] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel_lp64 - mkl_intel_thread - mkl_core - iomp5 - pthread - m - dl] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_gnu_thread - mkl_core - iomp5 - pthread - m - dl] -- Library mkl_intel: not found -- Checking for [mkl_intel - mkl_intel_thread - mkl_core - iomp5 - pthread - m - dl] -- Library mkl_intel: not found -- Checking for [mkl_gf_lp64 - mkl_gnu_thread - mkl_core - iomp5 - pthread - m - dl] -- Library mkl_gf_lp64: not found -- Checking for [mkl_gf_lp64 - mkl_intel_thread - mkl_core - iomp5 - pthread - m - dl] -- Library mkl_gf_lp64: not found -- Checking for [mkl_gf - mkl_gnu_thread - mkl_core - iomp5 - pthread - m - dl] -- Library mkl_gf: not found -- Checking for [mkl_gf - mkl_intel_thread - mkl_core - iomp5 - pthread - m - dl] -- Library mkl_gf: not found -- Checking for [mkl_intel_lp64 - mkl_gnu_thread - mkl_core - pthread - m - dl] -- Library 
mkl_intel_lp64: not found -- Checking for [mkl_intel_lp64 - mkl_intel_thread - mkl_core - pthread - m - dl] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_gnu_thread - mkl_core - pthread - m - dl] -- Library mkl_intel: not found -- Checking for [mkl_intel - mkl_intel_thread - mkl_core - pthread - m - dl] -- Library mkl_intel: not found -- Checking for [mkl_gf_lp64 - mkl_gnu_thread - mkl_core - pthread - m - dl] -- Library mkl_gf_lp64: not found -- Checking for [mkl_gf_lp64 - mkl_intel_thread - mkl_core - pthread - m - dl] -- Library mkl_gf_lp64: not found -- Checking for [mkl_gf - mkl_gnu_thread - mkl_core - pthread - m - dl] -- Library mkl_gf: not found -- Checking for [mkl_gf - mkl_intel_thread - mkl_core - pthread - m - dl] -- Library mkl_gf: not found -- Checking for [mkl_intel_lp64 - mkl_sequential - mkl_core - m - dl] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_sequential - mkl_core - m - dl] -- Library mkl_intel: not found -- Checking for [mkl_gf_lp64 - mkl_sequential - mkl_core - m - dl] -- Library mkl_gf_lp64: not found -- Checking for [mkl_gf - mkl_sequential - mkl_core - m - dl] -- Library mkl_gf: not found -- Checking for [mkl_intel_lp64 - mkl_core - gomp - pthread - m - dl] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_core - gomp - pthread - m - dl] -- Library mkl_intel: not found -- Checking for [mkl_gf_lp64 - mkl_core - gomp - pthread - m - dl] -- Library mkl_gf_lp64: not found -- Checking for [mkl_gf - mkl_core - gomp - pthread - m - dl] -- Library mkl_gf: not found -- Checking for [mkl_intel_lp64 - mkl_core - iomp5 - pthread - m - dl] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_core - iomp5 - pthread - m - dl] -- Library mkl_intel: not found -- Checking for [mkl_gf_lp64 - mkl_core - iomp5 - pthread - m - dl] -- Library mkl_gf_lp64: not found -- Checking for [mkl_gf - mkl_core - iomp5 - pthread - m - dl] -- Library mkl_gf: not found -- Checking for 
[mkl_intel_lp64 - mkl_core - pthread - m - dl] -- Library mkl_intel_lp64: not found -- Checking for [mkl_intel - mkl_core - pthread - m - dl] -- Library mkl_intel: not found -- Checking for [mkl_gf_lp64 - mkl_core - pthread - m - dl] -- Library mkl_gf_lp64: not found -- Checking for [mkl_gf - mkl_core - pthread - m - dl] -- Library mkl_gf: not found -- Checking for [mkl - guide - pthread - m] -- Library mkl: not found -- MKL library not found -- Checking for [blis] -- Library blis: BLAS_blis_LIBRARY-NOTFOUND -- Checking for [Accelerate] -- Library Accelerate: BLAS_Accelerate_LIBRARY-NOTFOUND -- Checking for [vecLib] -- Library vecLib: BLAS_vecLib_LIBRARY-NOTFOUND -- Checking for [openblas] -- Library openblas: BLAS_openblas_LIBRARY-NOTFOUND -- Checking for [openblas - pthread] -- Library openblas: BLAS_openblas_LIBRARY-NOTFOUND -- Checking for [goto2 - gfortran] -- Library goto2: BLAS_goto2_LIBRARY-NOTFOUND -- Checking for [goto2 - gfortran - pthread] -- Library goto2: BLAS_goto2_LIBRARY-NOTFOUND -- Checking for [acml - gfortran] -- Library acml: BLAS_acml_LIBRARY-NOTFOUND -- Checking for [blis] -- Library blis: BLAS_blis_LIBRARY-NOTFOUND -- Checking for [ptf77blas - atlas - gfortran] -- Library ptf77blas: BLAS_ptf77blas_LIBRARY-NOTFOUND -- Checking for [blas] -- Library blas: /usr/lib64/libblas.so -- Found a library with BLAS API (generic). -- Brace yourself, we are building NNPACK -- Found PythonInterp: /usr/lib/python-exec/python3.6/python (found version "3.6.10") -- NNPACK backend is x86-64 -- LLVM FileCheck Found: /usr/lib/llvm/9/bin/FileCheck -- git Version: v1.4.0-505be96a -- Version: 1.4.0 -- Performing Test HAVE_STD_REGEX -- success -- Performing Test HAVE_GNU_POSIX_REGEX -- failed to compile -- Performing Test HAVE_POSIX_REGEX -- success -- Performing Test HAVE_STEADY_CLOCK -- success CMake Warning at third_party/fbgemm/CMakeLists.txt:81 (message): OpenMP found! OpenMP_C_INCLUDE_DIRS =

CMake Warning at third_party/fbgemm/CMakeLists.txt:193 (message):

CMake Warning at third_party/fbgemm/CMakeLists.txt:194 (message): CMAKE_BUILD_TYPE = Release

CMake Warning at third_party/fbgemm/CMakeLists.txt:195 (message): CMAKE_CXX_FLAGS_DEBUG is -g

CMake Warning at third_party/fbgemm/CMakeLists.txt:196 (message): CMAKE_CXX_FLAGS_RELEASE is -O3 -DNDEBUG

CMake Warning at third_party/fbgemm/CMakeLists.txt:197 (message):

AsmJit Summary ASMJIT_DIR=/home/moose/code/pytorch/third_party/fbgemm/third_party/asmjit ASMJIT_TEST=FALSE ASMJIT_TARGET_TYPE=STATIC ASMJIT_DEPS=pthread;rt ASMJIT_LIBS=asmjit;pthread;rt ASMJIT_CFLAGS=-DASMJIT_STATIC ASMJIT_PRIVATE_CFLAGS=-Wall;-Wextra;-fno-math-errno;-fno-threadsafe-statics;-DASMJIT_STATIC ASMJIT_PRIVATE_CFLAGS_DBG= ASMJIT_PRIVATE_CFLAGS_REL=-O2;-fmerge-all-constants -- Found Numa (include: /usr/include, library: /usr/lib64/libnuma.so) -- Using third party subdirectory Eigen. Python 3.6.10 -- Found PythonInterp: /usr/lib/python-exec/python3.6/python (found suitable version "3.6.10", minimum required is "2.7") -- Could NOT find pybind11 (missing: pybind11_DIR) -- Could NOT find pybind11 (missing: pybind11_INCLUDE_DIR) -- Using third_party/pybind11. -- pybind11 include dirs: /home/moose/code/pytorch/cmake/../third_party/pybind11/include -- Adding OpenMP CXX_FLAGS: -fopenmp -- No OpenMP library needs to be linked against HIP VERSION: 3.0.20144-

Library versions from dpkg

Library versions from cmake find_package

rocrand VERSION: 2.10.0.0 hiprand VERSION: 2.10.0.0 rocblas VERSION: 2.12.1.0 miopen VERSION: 2.2.0.0 rocfft VERSION: 0.9.9.0 hipsparse VERSION: 1.3.2.0 rccl VERSION: 2.7.0.0 rocprim VERSION: 2.9.0.0 hipcub VERSION: 2.9.0.0 rocthrust VERSION: 2.9.0.0 INFOCompiling with HIP for AMD. -- RCCL Found! Successfully preprocessed all matching files. Generated: /home/moose/code/pytorch/build/third_party/onnx/onnx/onnx_onnx_torch-ml.proto Generated: /home/moose/code/pytorch/build/third_party/onnx/onnx/onnx-operators_onnx_torch-ml.proto

-- **** Summary **** -- CMake version : 3.16.5 -- CMake command : /usr/bin/cmake -- System : Linux -- C++ compiler : /usr/bin/c++ -- C++ compiler version : 9.2.0 -- CXX flags : -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -Wnon-virtual-dtor -- Build type : Release -- Compile definitions : NDEBUG;ONNX_ML=1 -- CMAKE_PREFIX_PATH : /usr/lib/hcc/3.0/lib/cmake -- CMAKE_INSTALL_PREFIX : /home/moose/code/pytorch/torch -- CMAKE_MODULE_PATH : /usr/lib/hip/cmake;/home/moose/code/pytorch/cmake/Modules

-- ONNX version : 1.6.0 -- ONNX NAMESPACE : onnx_torch -- ONNX_BUILD_TESTS : OFF -- ONNX_BUILD_BENCHMARKS : OFF -- ONNX_USE_LITE_PROTO : OFF -- ONNXIFI_DUMMY_BACKEND : OFF -- ONNXIFI_ENABLE_EXT : OFF

-- Protobuf compiler : -- Protobuf includes : -- Protobuf libraries : -- BUILD_ONNX_PYTHON : OFF

-- **** Summary **** -- CMake version : 3.16.5 -- CMake command : /usr/bin/cmake -- System : Linux -- C++ compiler : /usr/bin/c++ -- C++ compiler version : 9.2.0 -- CXX flags : -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -Wnon-virtual-dtor -- Build type : Release -- Compile definitions : NDEBUG;ONNX_ML=1 -- CMAKE_PREFIX_PATH : /usr/lib/hcc/3.0/lib/cmake -- CMAKE_INSTALL_PREFIX : /home/moose/code/pytorch/torch -- CMAKE_MODULE_PATH : /usr/lib/hip/cmake;/home/moose/code/pytorch/cmake/Modules

-- ONNX version : 1.4.1 -- ONNX NAMESPACE : onnx_torch -- ONNX_BUILD_TESTS : OFF -- ONNX_BUILD_BENCHMARKS : OFF -- ONNX_USE_LITE_PROTO : OFF -- ONNXIFI_DUMMY_BACKEND : OFF

-- Protobuf compiler : -- Protobuf includes : -- Protobuf libraries : -- BUILD_ONNX_PYTHON : OFF -- Could not find CUDA with FP16 support, compiling without torch.CudaHalfTensor -- Adding -DNDEBUG to compile flags -- MAGMA not found. Compiling without MAGMA support -- Could not find hardware support for NEON on this machine. -- No OMAP3 processor on this machine. -- No OMAP4 processor on this machine. -- AVX compiler support found -- AVX2 compiler support found -- Found a library with LAPACK API (generic). disabling CUDA because NOT USE_CUDA is set -- USE_CUDNN is set to 0. Compiling without cuDNN support -- MKLDNN_CPU_RUNTIME = OMP -- GPU support is disabled -- Primitive cache is enabled -- Found MKL-DNN: TRUE -- GCC 9.2.0: Adding gcc and gcc_s libs to link line -- NUMA paths: -- /usr/include -- /usr/lib64/libnuma.so HIP VERSION: 3.0.20144-

Library versions from dpkg

Library versions from cmake find_package

rocrand VERSION: 2.10.0.0 hiprand VERSION: 2.10.0.0 rocblas VERSION: 2.12.1.0 miopen VERSION: 2.2.0.0 rocfft VERSION: 0.9.9.0 hipsparse VERSION: 1.3.2.0 rccl VERSION: 2.7.0.0 rocprim VERSION: 2.9.0.0 hipcub VERSION: 2.9.0.0 rocthrust VERSION: 2.9.0.0 ROCm is enabled. CMake Deprecation Warning at third_party/sleef/CMakeLists.txt:20 (cmake_policy): The OLD behavior for policy CMP0066 will be removed from a future version of CMake.

The cmake-policies(7) manual explains that the OLD behaviors of all policies are deprecated and that a policy should be set to OLD only under specific short-term circumstances. Projects should be ported to the NEW behavior and not rely on setting a policy to OLD.

-- Configuring build for SLEEF-v3.4.0 Target system: Linux-5.4.28-gentoo Target processor: x86_64 Host system: Linux-5.4.28-gentoo Host processor: x86_64 Detected C compiler: GNU @ /usr/bin/cc -- Using option -Wall -Wno-unused -Wno-attributes -Wno-unused-result -Wno-psabi -ffp-contract=off -fno-math-errno -fno-trapping-math to compile libsleef -- Building shared libs : OFF -- MPFR : /usr/lib64/libmpfr.so -- MPFR header file in /usr/include -- GMP : /usr/lib64/libgmp.so -- RT : /usr/lib64/librt.so -- FFTW3 : LIBFFTW3-NOTFOUND -- OPENSSL : 1.1.1f -- SDE : SDE_COMMAND-NOTFOUND -- RUNNING_ON_TRAVIS : 0 -- COMPILER_SUPPORTS_OPENMP : 1 AT_INSTALL_INCLUDE_DIR include/ATen/core core header install: /home/moose/code/pytorch/build/aten/src/ATen/core/TensorBody.h core header install: /home/moose/code/pytorch/build/aten/src/ATen/core/TensorMethods.h -- Include AMD RCCL operators -- Including IDEEP operators -- Excluding image processing operators due to no opencv -- Excluding video processing operators due to no opencv -- MPI operators skipped due to no MPI support -- Include Observer library -- /usr/bin/c++ /home/moose/code/pytorch/caffe2/../torch/abi-check.cpp -o /home/moose/code/pytorch/build/abi-check -- Determined _GLIBCXX_USE_CXX11_ABI=1 -- pytorch is compiling with OpenMP. OpenMP CXX_FLAGS: -fopenmp. OpenMP libraries: /usr/lib/gcc/x86_64-pc-linux-gnu/9.2.0/libgomp.so;/usr/lib64/libpthread.so. -- Caffe2 is compiling with OpenMP. OpenMP CXX_FLAGS: -fopenmp. OpenMP libraries: /usr/lib/gcc/x86_64-pc-linux-gnu/9.2.0/libgomp.so;/usr/lib64/libpthread.so. -- Using ATen parallel backend: OMP -- Using lib64/python3.6/site-packages as python relative installation path CMake Warning at CMakeLists.txt:626 (message): Generated cmake files are only fully tested if one builds with system glog, gflags, and protobuf. Other settings may generate files that are not well tested.

-- -- **** Summary **** -- General: -- CMake version : 3.16.5 -- CMake command : /usr/bin/cmake -- System : Linux -- C++ compiler : /usr/bin/c++ -- C++ compiler id : GNU -- C++ compiler version : 9.2.0 -- BLAS : MKL -- CXX flags : -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_INTERNAL_THREADPOOL_IMPL -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow -- Build type : Release -- Compile definitions : NDEBUG;ONNX_ML=1;ONNX_NAMESPACE=onnx_torch;HAVE_MMAP=1;_FILE_OFFSET_BITS=64;HAVE_SHM_OPEN=1;HAVE_SHM_UNLINK=1;HAVE_MALLOC_USABLE_SIZE=1;MINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -- CMAKE_PREFIX_PATH : /usr/lib/hcc/3.0/lib/cmake -- CMAKE_INSTALL_PREFIX : /home/moose/code/pytorch/torch

-- TORCH_VERSION : 1.6.0 -- CAFFE2_VERSION : 1.6.0 -- BUILD_CAFFE2_MOBILE : OFF -- USE_STATIC_DISPATCH : OFF -- BUILD_BINARY : OFF -- BUILD_CUSTOM_PROTOBUF : ON -- Link local protobuf : ON -- BUILD_DOCS : OFF -- BUILD_PYTHON : True -- Python version : 3.6.10 -- Python executable : /usr/lib/python-exec/python3.6/python -- Pythonlibs version : 3.6.10 -- Python library : /usr/lib64/libpython3.6m.so.1.0 -- Python includes : /usr/include/python3.6m -- Python site-packages: lib64/python3.6/site-packages -- BUILD_CAFFE2_OPS : ON -- BUILD_SHARED_LIBS : ON -- BUILD_TEST : True -- BUILD_JNI : OFF -- INTERN_BUILD_MOBILE : -- USE_ASAN : OFF -- USE_CUDA : OFF -- USE_ROCM : ON -- USE_EIGEN_FOR_BLAS : ON -- USE_FBGEMM : ON -- USE_FFMPEG : OFF -- USE_GFLAGS : OFF -- USE_GLOG : OFF -- USE_LEVELDB : OFF -- USE_LITE_PROTO : OFF -- USE_LMDB : OFF -- USE_METAL : OFF -- USE_MKL : OFF -- USE_MKLDNN : ON -- USE_MKLDNN_CBLAS : OFF -- USE_NCCL : ON -- USE_SYSTEM_NCCL : ON -- USE_NNPACK : ON -- USE_NUMPY : ON -- USE_OBSERVERS : ON -- USE_OPENCL : OFF -- USE_OPENCV : OFF -- USE_OPENMP : ON -- USE_TBB : OFF -- USE_PROF : OFF -- USE_QNNPACK : ON -- USE_PYTORCH_QNNPACK : ON -- USE_REDIS : OFF -- USE_ROCKSDB : OFF -- USE_ZMQ : OFF -- USE_DISTRIBUTED : ON -- USE_MPI : OFF -- USE_GLOO : ON -- Public Dependencies : Threads::Threads;caffe2::mkldnn -- Private Dependencies : cpuinfo;qnnpack;pytorch_qnnpack;nnpack;XNNPACK;fbgemm;/usr/lib64/libnuma.so;fp16;gloo;aten_op_header_gen;foxi_loader;rt;gcc_s;gcc;dl -- Configuring done -- Generating done -- Build files have been written to: /home/moose/code/pytorch/build [12/881] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir.../hip/torch_hip_generated_local_response_normalization_op.hip.o FAILED: caffe2/CMakeFiles/torch_hip.dir/operators/hip/torch_hip_generated_local_response_normalization_op.hip.o cd /home/moose/code/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/operators/hip && /usr/bin/cmake -E make_directory 
/home/moose/code/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/operators/hip/. && /usr/bin/cmake -D verbose:BOOL=OFF -D build_configuration:STRING=RELEASE -D generated_file:STRING=/home/moose/code/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/operators/hip/./torch_hip_generated_local_response_normalization_op.hip.o -P /home/moose/code/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/operators/hip/torch_hip_generated_local_response_normalization_op.hip.o.cmake In file included from /home/moose/code/pytorch/caffe2/operators/hip/local_response_normalization_op.hip:2: In file included from /home/moose/code/pytorch/caffe2/core/hip/context_gpu.h:8: /home/moose/code/pytorch/caffe2/core/hip/common_gpu.h:98:10: error: use of undeclared identifier 'HIP_VERSION'; did you mean '_SC_VERSION'? return HIP_VERSION; ^~~ _SC_VERSION /usr/include/bits/confname.h:131:5: note: '_SC_VERSION' declared here _SC_VERSION, ^ 1 error generated. In file included from /home/moose/code/pytorch/caffe2/operators/hip/local_response_normalization_op.hip:2: In file included from /home/moose/code/pytorch/caffe2/core/hip/context_gpu.h:8: /home/moose/code/pytorch/caffe2/core/hip/common_gpu.h:98:10: error: use of undeclared identifier 'HIP_VERSION'; did you mean '_SC_VERSION'? return HIP_VERSION; ^~~ _SC_VERSION /usr/include/bits/confname.h:131:5: note: '_SC_VERSION' declared here _SC_VERSION, ^ 1 error generated. CMake Error at torch_hip_generated_local_response_normalization_op.hip.o.cmake:174 (message): Error generating file /home/moose/code/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/operators/hip/./torch_hip_generated_local_response_normalization_op.hip.o

[13/881] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/operators/hip/torch_hip_generated_roi_align_op.hip.o FAILED: caffe2/CMakeFiles/torch_hip.dir/operators/hip/torch_hip_generated_roi_align_op.hip.o cd /home/moose/code/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/operators/hip && /usr/bin/cmake -E make_directory /home/moose/code/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/operators/hip/. && /usr/bin/cmake -D verbose:BOOL=OFF -D build_configuration:STRING=RELEASE -D generated_file:STRING=/home/moose/code/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/operators/hip/./torch_hip_generated_roi_align_op.hip.o -P /home/moose/code/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/operators/hip/torch_hip_generated_roi_align_op.hip.o.cmake In file included from /home/moose/code/pytorch/caffe2/operators/hip/roi_align_op.hip:6: In file included from /home/moose/code/pytorch/caffe2/core/hip/context_gpu.h:8: /home/moose/code/pytorch/caffe2/core/hip/common_gpu.h:98:10: error: use of undeclared identifier 'HIP_VERSION'; did you mean '_SC_VERSION'? return HIP_VERSION; ^~~ _SC_VERSION /usr/include/bits/confname.h:131:5: note: '_SC_VERSION' declared here _SC_VERSION, ^ 1 error generated. In file included from /home/moose/code/pytorch/caffe2/operators/hip/roi_align_op.hip:6: In file included from /home/moose/code/pytorch/caffe2/core/hip/context_gpu.h:8: /home/moose/code/pytorch/caffe2/core/hip/common_gpu.h:98:10: error: use of undeclared identifier 'HIP_VERSION'; did you mean '_SC_VERSION'? return HIP_VERSION; ^~~ _SC_VERSION /usr/include/bits/confname.h:131:5: note: '_SC_VERSION' declared here _SC_VERSION, ^ 1 error generated. CMake Error at torch_hip_generated_roi_align_op.hip.o.cmake:174 (message): Error generating file /home/moose/code/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/operators/hip/./torch_hip_generated_roi_align_op.hip.o

[14/881] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/operators/hip/torch_hip_generated_sin_op.hip.o FAILED: caffe2/CMakeFiles/torch_hip.dir/operators/hip/torch_hip_generated_sin_op.hip.o cd /home/moose/code/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/operators/hip && /usr/bin/cmake -E make_directory /home/moose/code/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/operators/hip/. && /usr/bin/cmake -D verbose:BOOL=OFF -D build_configuration:STRING=RELEASE -D generated_file:STRING=/home/moose/code/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/operators/hip/./torch_hip_generated_sin_op.hip.o -P /home/moose/code/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/operators/hip/torch_hip_generated_sin_op.hip.o.cmake In file included from /home/moose/code/pytorch/caffe2/operators/hip/sin_op.hip:7: In file included from /home/moose/code/pytorch/caffe2/core/hip/context_gpu.h:8: /home/moose/code/pytorch/caffe2/core/hip/common_gpu.h:98:10: error: use of undeclared identifier 'HIP_VERSION'; did you mean '_SC_VERSION'? return HIP_VERSION; ^~~ _SC_VERSION /usr/include/bits/confname.h:131:5: note: '_SC_VERSION' declared here _SC_VERSION, ^ 1 error generated. 
In file included from /home/moose/code/pytorch/caffe2/operators/hip/sin_op.hip:2: In file included from /home/moose/code/pytorch/caffe2/operators/sin_op.h:6: In file included from /home/moose/code/pytorch/caffe2/operators/elementwise_ops.h:15: In file included from /home/moose/code/pytorch/caffe2/utils/eigen_utils.h:6: In file included from /home/moose/code/pytorch/cmake/../third_party/eigen/Eigen/Core:202: /home/moose/code/pytorch/cmake/../third_party/eigen/Eigen/src/Core/arch/GPU/PacketMathHalf.h:149:1: warning: control reaches end of non-void function [-Wreturn-type] } ^ In file included from /home/moose/code/pytorch/caffe2/operators/hip/sin_op.hip:7: In file included from /home/moose/code/pytorch/caffe2/core/hip/context_gpu.h:8: /home/moose/code/pytorch/caffe2/core/hip/common_gpu.h:98:10: error: use of undeclared identifier 'HIP_VERSION'; did you mean '_SC_VERSION'? return HIP_VERSION; ^~~ _SC_VERSION /usr/include/bits/confname.h:131:5: note: '_SC_VERSION' declared here _SC_VERSION, ^ 1 warning and 1 error generated. CMake Error at torch_hip_generated_sin_op.hip.o.cmake:174 (message): Error generating file /home/moose/code/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/operators/hip/./torch_hip_generated_sin_op.hip.o

[15/881] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/operators/hip/torch_hip_generated_generate_proposals_op.hip.o FAILED: caffe2/CMakeFiles/torch_hip.dir/operators/hip/torch_hip_generated_generate_proposals_op.hip.o cd /home/moose/code/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/operators/hip && /usr/bin/cmake -E make_directory /home/moose/code/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/operators/hip/. && /usr/bin/cmake -D verbose:BOOL=OFF -D build_configuration:STRING=RELEASE -D generated_file:STRING=/home/moose/code/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/operators/hip/./torch_hip_generated_generate_proposals_op.hip.o -P /home/moose/code/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/operators/hip/torch_hip_generated_generate_proposals_op.hip.o.cmake In file included from /home/moose/code/pytorch/caffe2/operators/hip/generate_proposals_op.hip:4: In file included from /home/moose/code/pytorch/caffe2/core/hip/context_gpu.h:8: /home/moose/code/pytorch/caffe2/core/hip/common_gpu.h:98:10: error: use of undeclared identifier 'HIP_VERSION'; did you mean '_SC_VERSION'? return HIP_VERSION; ^~~ _SC_VERSION /usr/include/bits/confname.h:131:5: note: '_SC_VERSION' declared here _SC_VERSION, ^ 1 error generated. In file included from /home/moose/code/pytorch/caffe2/operators/hip/generate_proposals_op.hip:4: In file included from /home/moose/code/pytorch/caffe2/core/hip/context_gpu.h:8: /home/moose/code/pytorch/caffe2/core/hip/common_gpu.h:98:10: error: use of undeclared identifier 'HIP_VERSION'; did you mean '_SC_VERSION'? return HIP_VERSION; ^~~ _SC_VERSION /usr/include/bits/confname.h:131:5: note: '_SC_VERSION' declared here _SC_VERSION, ^ 1 error generated. CMake Error at torch_hip_generated_generate_proposals_op.hip.o.cmake:174 (message): Error generating file /home/moose/code/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/operators/hip/./torch_hip_generated_generate_proposals_op.hip.o

ninja: build stopped: subcommand failed. Traceback (most recent call last): File "setup.py", line 734, in build_deps() File "setup.py", line 316, in build_deps cmake=cmake) File "/home/moose/code/pytorch/tools/build_pytorch_libs.py", line 62, in build_caffe2 cmake.build(my_env) File "/home/moose/code/pytorch/tools/setup_helpers/cmake.py", line 340, in build self.run(build_args, my_env) File "/home/moose/code/pytorch/tools/setup_helpers/cmake.py", line 141, in run check_call(command, cwd=self.build_dir, env=env) File "/home/moose/anaconda3/lib/python3.7/subprocess.py", line 363, in check_call raise CalledProcessError(retcode, cmd) subprocess.CalledProcessError: Command '['cmake', '--build', '.', '--target', 'install', '--config', 'Release', '--', '-j', '4']' returned non-zero exit status 1. `

Expected behavior

I would expect PyTorch to build.

Environment

python ./torch/utils/collect_env.py Collecting environment information... PyTorch version: N/A Is debug build: N/A CUDA used to build PyTorch: N/A

OS: Gentoo/Linux GCC version: (Gentoo 9.2.0-r2 p3) 9.2.0 CMake version: version 3.16.5

Python version: 3.6 Is CUDA available: N/A CUDA runtime version: Could not collect GPU models and configuration: Could not collect Nvidia driver version: Could not collect cuDNN version: Could not collect

Versions of relevant libraries: [pip] Could not collect [conda] blas 1.0 mkl [conda] magma-cuda90 2.5.0 1 pytorch [conda] mkl 2020.0 166 [conda] mkl-include 2020.0 166 [conda] mkl-service 2.3.0 py37he904b0f_0 [conda] mkl_fft 1.0.15 py37ha843d7b_0 [conda] mkl_random 1.1.0 py37hd6b4f25_0 [conda] numpy 1.18.1 py37h4f9e942_0 [conda] numpy-base 1.18.1 py37hde5b4d6_1 [conda] numpydoc 0.9.2 py_0

Additional context

I hope this helps. I would be happy to try things out and report back, or to provide more info.

iotamudelta commented 4 years ago

hip_version.h was introduced as part of ROCm 3.1 and will be used for versioning changes in PyTorch on ROCm going forward. Are you by any chance on a ROCm older than 3.1? If so, a possible workaround is to create a hip_version.h file in ${ROCM_DIR}/include/hip/ containing HIP_VERSION=300 (if you are on ROCm 3.0) or HIP_VERSION=210 (if you are on ROCm 2.10).
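For anyone who wants to script that workaround, here is a minimal sketch in Python. It rests on several assumptions that are not confirmed in this thread: that the value should be exposed as a `#define HIP_VERSION ...` preprocessor macro (the comment above only writes `HIP_VERSION=300`), that the include guard name is arbitrary, and that `write_hip_version_stub` is a hypothetical helper name. Only the two version values come from the comment above.

```python
from pathlib import Path

# Values taken from the comment above: ROCm 3.0 -> 300, ROCm 2.10 -> 210.
HIP_VERSION_BY_ROCM = {"3.0": 300, "2.10": 210}


def write_hip_version_stub(rocm_dir, rocm_version):
    """Write a stand-in hip_version.h under <rocm_dir>/include/hip/.

    Hypothetical helper: returns the header path, and leaves any
    existing header untouched.
    """
    value = HIP_VERSION_BY_ROCM[rocm_version]  # KeyError for unlisted versions
    header = Path(rocm_dir) / "include" / "hip" / "hip_version.h"
    if header.exists():
        # Do not overwrite a header shipped by a real ROCm >= 3.1 install.
        return header
    header.parent.mkdir(parents=True, exist_ok=True)
    # Assumption: the value is consumed as a preprocessor macro.
    header.write_text(
        "#ifndef HIP_VERSION_H\n"
        "#define HIP_VERSION_H\n"
        f"#define HIP_VERSION {value}\n"
        "#endif\n"
    )
    return header
```

A call might look like `write_hip_version_stub(rocm_dir, "3.0")`, where `rocm_dir` is whatever `${ROCM_DIR}` resolves to on your system. Writing there will likely need elevated permissions, and the stub should be removed again once you upgrade to ROCm 3.1 or later.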

Concerning thrust_ptr: can you confirm you have rocThrust installed?

fuag15 commented 4 years ago

Yes, I am. I'm using ROCm 3.0, but I have access up to ROCm 3.3. I was told that ROCm 3.0 was the most reliable for getting a build up. Would you suggest upgrading to 3.1, or all the way to 3.3? Thanks for the quick response and for helping with this project; I'm really excited to get some open source ML going on AMD hardware.

fuag15 commented 4 years ago

I do have rocThrust installed. However, the thrust_ptr error only comes up with parallel builds enabled, and only sometimes, so I'm unsure whether it's a red herring. I will try to resolve the HIP_VERSION error today (either by upgrading to 3.1/3.3 or by adding that .h file), see how things change, and report back. I'm curious which version you recommend targeting for PyTorch.
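Before kicking off another long build, it can help to check whether the headers in question are visible at all. The sketch below is only an illustration: `find_header` is a hypothetical helper, and both the include directories and the relative header paths you pass in are assumptions that depend on how ROCm and rocThrust were packaged on your system.

```python
from pathlib import Path


def find_header(relative_path, include_dirs):
    """Return the first existing <include_dir>/<relative_path>, or None."""
    for include_dir in include_dirs:
        candidate = Path(include_dir) / relative_path
        if candidate.is_file():
            return candidate
    return None
```

For example, `find_header("hip/hip_version.h", dirs)` returning `None` for every directory the compiler searches would be consistent with the `HIP_VERSION` error in the log above. An intermittent failure that shows up only with parallel builds is less likely to be explained by a simple missing file.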

iotamudelta commented 4 years ago

In general we recommend using either the latest ROCm (this would be 3.3) or what is currently running on the upstream PyTorch CI (this is 3.1.1).

fuag15 commented 4 years ago

I'll post the error elsewhere, but now I'm getting issues that seem to involve rocBLAS-3.3.0 when compiling against MIOpen or rocALUTION, related to undefined references to llvm::yaml::IO::getContext.

fuag15 commented 4 years ago

I've spent the weekend digging into the issues with ROCm 3.3.0, to no avail. I've documented them as best I can in this rocBLAS issue: https://github.com/ROCmSoftwarePlatform/rocBLAS/issues/1124

I'm going to try building against 3.1 in the meantime.