Error Compiling from Source with GPU on MacOS

zhanghang1989 commented 3 years ago

🐛 Bug

Got the following error when building from source with GPU on macOS:

torchvision/csrc/ops/cuda/deform_conv2d_kernel.cu(1210): error: expression must have a constant value

To Reproduce

Steps to reproduce the behavior:

1. Use a Mac with an NVIDIA GPU
2. Build the latest PyTorch from source
3. Follow the instructions to build torchvision from source

MACOSX_DEPLOYMENT_TARGET=10.9 CC=clang CXX=clang++ python setup.py install
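
Before running the build command above, it can help to confirm which compilers the shell actually resolves, since the command relies on `clang`, `clang++`, and `nvcc`. The following is a minimal sketch using only the Python standard library; the helper name `find_tools` is hypothetical and not part of PyTorch or torchvision. Note that the failing log invokes `/usr/local/cuda/bin/nvcc` by absolute path, so `nvcc` may legitimately be missing from `PATH`.

```python
import shutil


def find_tools(names=("nvcc", "clang", "clang++")):
    """Map each tool name to its resolved path on PATH, or None if absent.

    The helper name and the default tool list are assumptions made for
    illustration; adjust them to match the local toolchain.
    """
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

Including this output in a bug report makes it easier to see which host compiler nvcc was paired with.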

Expected behavior

FAILED: /Users/hangzhang/git/vision/build/temp.macosx-10.7-x86_64-3.7/Users/hangzhang/git/vision/torchvision/csrc/ops/cuda/deform_conv2d_kernel.o 
/usr/local/cuda/bin/nvcc  -DWITH_CUDA -I/Users/hangzhang/git/vision/torchvision/csrc -I/Users/hangzhang/anaconda3/lib/python3.7/site-packages/torch/include -I/Users/hangzhang/anaconda3/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/Users/hangzhang/anaconda3/lib/python3.7/site-packages/torch/include/TH -I/Users/hangzhang/anaconda3/lib/python3.7/site-packages/torch/include/THC -I/usr/local/cuda/include -I/Users/hangzhang/anaconda3/include/python3.7m -c -c /Users/hangzhang/git/vision/torchvision/csrc/ops/cuda/deform_conv2d_kernel.cu -o /Users/hangzhang/git/vision/build/temp.macosx-10.7-x86_64-3.7/Users/hangzhang/git/vision/torchvision/csrc/ops/cuda/deform_conv2d_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_clang"' '-DPYBIND11_STDLIB="_libcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1002"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61 -ccbin clang -std=c++14
/Users/hangzhang/anaconda3/lib/python3.7/site-packages/torch/include/c10/util/BFloat16.h(57): warning: calling a __host__ function from a __host__ __device__ function is not allowed

/Users/hangzhang/git/vision/torchvision/csrc/ops/cuda/deform_conv2d_kernel.cu(1210): error: expression must have a constant value
/Users/hangzhang/anaconda3/lib/python3.7/site-packages/torch/include/ATen/core/op_registration/op_whitelist.h(62): note: cannot call non-constexpr function "__builtin_expect" (declared implicitly)

/Users/hangzhang/git/vision/torchvision/csrc/ops/cuda/deform_conv2d_kernel.cu(1209): error: more than one instance of overloaded function "torch::Library::impl" matches the argument list:
            function template "torch::Library &torch::Library::impl(torch::detail::SelectiveStr<false>, Func &&) &"
            function template "torch::Library &torch::Library::impl(torch::detail::SelectiveStr<true>, Func &&) &"
            argument types are: (torch::detail::SelectiveStr<<error-constant>>, c10::CompileTimeFunctionPointer<std::__1::remove_pointer_t<std::__1::remove_reference_t<at::Tensor (const at::Tensor &, const at::Tensor &, const at::Tensor &, const at::Tensor &, const at::Tensor &, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t, bool)>>, vision::ops::<unnamed>::deform_conv2d_forward_kernel>)
            object type is: torch::Library

/Users/hangzhang/git/vision/torchvision/csrc/ops/cuda/deform_conv2d_kernel.cu(1213): error: expression must have a constant value
/Users/hangzhang/anaconda3/lib/python3.7/site-packages/torch/include/ATen/core/op_registration/op_whitelist.h(62): note: cannot call non-constexpr function "__builtin_expect" (declared implicitly)

/Users/hangzhang/git/vision/torchvision/csrc/ops/cuda/deform_conv2d_kernel.cu(1212): error: more than one instance of overloaded function "torch::Library::impl" matches the argument list:
            function template "torch::Library &torch::Library::impl(torch::detail::SelectiveStr<false>, Func &&) &"
            function template "torch::Library &torch::Library::impl(torch::detail::SelectiveStr<true>, Func &&) &"
            argument types are: (torch::detail::SelectiveStr<<error-constant>>, c10::CompileTimeFunctionPointer<std::__1::remove_pointer_t<std::__1::remove_reference_t<std::__1::tuple<at::Tensor, at::Tensor, at::Tensor, at::Tensor, at::Tensor> (const at::Tensor &, const at::Tensor &, const at::Tensor &, const at::Tensor &, const at::Tensor &, const at::Tensor &, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t, int64_t, bool)>>, vision::ops::<unnamed>::deform_conv2d_backward_kernel>)
            object type is: torch::Library
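
The diagnostics above all follow the pattern `<path>(<line>): <kind>: <message>`. As a rough sketch for triaging long build logs like this one, the snippet below groups such lines by file name and kind. The regular expression and the `summarize` helper are assumptions written for this log format, not part of nvcc or PyTorch, and the sample text is copied from the log above.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   /path/to/file.cu(1210): error: expression must have a constant value
DIAGNOSTIC = re.compile(
    r"^(?P<path>.+?)\((?P<line>\d+)\): (?P<kind>error|warning|note): (?P<msg>.*)$"
)


def summarize(log_text):
    """Group diagnostics by (file name, kind) as lists of (line, message)."""
    summary = defaultdict(list)
    for raw in log_text.splitlines():
        match = DIAGNOSTIC.match(raw.strip())
        if match is None:
            continue  # skip compiler command lines and continuation lines
        name = match.group("path").rsplit("/", 1)[-1]
        summary[(name, match.group("kind"))].append(
            (int(match.group("line")), match.group("msg"))
        )
    return dict(summary)


SAMPLE = """\
/Users/hangzhang/git/vision/torchvision/csrc/ops/cuda/deform_conv2d_kernel.cu(1210): error: expression must have a constant value
/Users/hangzhang/anaconda3/lib/python3.7/site-packages/torch/include/ATen/core/op_registration/op_whitelist.h(62): note: cannot call non-constexpr function "__builtin_expect" (declared implicitly)
/Users/hangzhang/git/vision/torchvision/csrc/ops/cuda/deform_conv2d_kernel.cu(1213): error: expression must have a constant value
"""

if __name__ == "__main__":
    for (name, kind), items in summarize(SAMPLE).items():
        for line_no, msg in items:
            print(f"{name}:{line_no}: {kind}: {msg}")
```

Grouped this way, the log shows that every "expression must have a constant value" error is paired with the same note about `__builtin_expect` in `op_whitelist.h`.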

Environment

Please copy and paste the output from our environment collection script (or fill out the checklist below manually).

You can get the script and run it with:

wget https://raw.githubusercontent.com/pytorch/pytorch/master/torch/utils/collect_env.py
# For security purposes, please check the contents of collect_env.py before running it.
python collect_env.py

Collecting environment information...
PyTorch version: 1.8.0a0+eaf5ca0
Is debug build: False
CUDA used to build PyTorch: 10.1
ROCM used to build PyTorch: N/A

OS: macOS 10.13.6 (x86_64)
GCC version: Could not collect
Clang version: Could not collect
CMake version: version 3.14.0

Python version: 3.7 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: 10.1.105
GPU models and configuration: TITAN Xp
Nvidia driver version: 1.1.0
cuDNN version: Probably one of the following:
/usr/local/cuda/lib/libcudnn.7.dylib
/usr/local/cuda/lib/libcudnn.dylib
/usr/local/cuda/lib/libcudnn_static.a
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] autotorch==0.0.2
[pip3] numpy==1.16.2
[pip3] numpydoc==1.1.0
[pip3] torch==1.8.0a0+unknown
[pip3] torch-encoding==1.2.2b20210127
[conda] autotorch 0.0.2 dev_0
[conda] blas 1.0 mkl
[conda] mkl 2019.4 233
[conda] mkl-include 2020.2 260
[conda] mkl-service 2.3.0 py37hfbe908c_0
[conda] mkl_fft 1.1.0 py37hc64f4ea_0
[conda] mkl_random 1.1.1 py37h959d312_0
[conda] numpy 1.15.2 pypi_0 pypi
[conda] numpy-base 1.16.2 py37h6575580_0
[conda] numpydoc 1.1.0 py_0
[conda] torch 1.4.0a0+7404463 pypi_0 pypi
[conda] torch-encoding 1.2.2b20210127 dev_0

zhanghang1989 commented 3 years ago

Error still exists in v0.9.0

hubutui commented 3 years ago

Any solution yet? I also hit similar errors when building for Arch Linux; see the complete build log here.
