NVlabs / nvdiffrast

Nvdiffrast - Modular Primitives for High-Performance Differentiable Rendering

New CUDA rasterizer doesn't compile with `__CUDA_ARCH__ < 700` #88

Closed nicolas-guichard closed 1 year ago

nicolas-guichard commented 1 year ago

Trying to run nvdiffrast on a GTX 1050 Mobile (compute capability 6.1), I now get a CUDA build error:

/usr/local/cuda/bin/nvcc -ccbin gcc-11 -DTORCH_EXTENSION_NAME=nvdiffrast_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -isystem $VENV/lib64/python3.10/site-packages/torch/include -isystem $VENV/lib64/python3.10/site-packages/torch/include/torch/csrc/api/include -isystem $VENV/lib64/python3.10/site-packages/torch/include/TH -isystem $VENV/lib64/python3.10/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -DNVDR_TORCH -lineinfo -std=c++14 -c $VENV/lib64/python3.10/site-packages/nvdiffrast/common/cudaraster/impl/RasterImpl.cu -o RasterImpl.cuda.o
$VENV/lib64/python3.10/site-packages/nvdiffrast/common/cudaraster/impl/BinRaster.inl(219): error: identifier "__match_any_sync" is undefined

1 error detected in the compilation of "$VENV/lib64/python3.10/site-packages/nvdiffrast/common/cudaraster/impl/RasterImpl.cu".

Indeed, the use of `__match_any_sync` in common.h is guarded by an explicit `#if __CUDA_ARCH__ >= 700`, with the comment "Warp match instruction __match_any_sync() is only available on compute capability 7.x and higher", but there is no such check around the call in BinRaster.inl.

https://github.com/NVlabs/nvdiffrast/blob/a1ec436b449aa2731b560dc096bc824d5ba958ab/nvdiffrast/common/cudaraster/impl/BinRaster.inl#L219
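For illustration, here is a minimal standalone sketch of what such a guard could look like. It is not the code from common.h or BinRaster.inl: `matchAnyCompat`, `matchKernel` and `countMismatches` are hypothetical names made up for this example. On compute capability 7.0 and higher it calls `__match_any_sync` directly. On older targets it emulates the same result with `__shfl_sync` and `__ballot_sync`, assuming every lane named in `mask` calls the helper together.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not part of nvdiffrast): returns the mask of lanes in
// `mask` whose `value` equals the calling lane's `value`.
// Assumption: every lane named in `mask` calls this function together.
__device__ __forceinline__ unsigned int matchAnyCompat(unsigned int mask, unsigned int value)
{
#if __CUDA_ARCH__ >= 700
    // The warp match instruction exists on compute capability 7.0 and higher.
    return __match_any_sync(mask, value);
#else
    // Fallback for older targets: pick the lowest unprocessed lane as leader,
    // broadcast its value, and use a ballot to collect all lanes that agree.
    unsigned int result = 0;
    unsigned int remaining = mask;   // identical on all participating lanes
    while (remaining)
    {
        int leader = __ffs(remaining) - 1;
        unsigned int leaderValue = __shfl_sync(mask, value, leader);
        unsigned int same = __ballot_sync(mask, value == leaderValue);
        if (value == leaderValue)
            result = same;           // this lane belongs to the leader's group
        remaining &= ~same;          // the whole group is done
    }
    return result;
#endif
}

// One warp: each lane stores the match mask computed for its input value.
__global__ void matchKernel(const unsigned int* in, unsigned int* out)
{
    unsigned int lane = threadIdx.x;
    out[lane] = matchAnyCompat(0xffffffffu, in[lane]);
}

// Runs the kernel on one warp and returns the number of lanes whose result
// differs from a host-side reference, or -1 on a CUDA error.
int countMismatches()
{
    unsigned int hIn[32], hOut[32];
    for (int i = 0; i < 32; i++)
        hIn[i] = i / 8;              // four groups of eight lanes each

    unsigned int *dIn = nullptr, *dOut = nullptr;
    if (cudaMalloc(&dIn, sizeof(hIn)) != cudaSuccess) return -1;
    if (cudaMalloc(&dOut, sizeof(hOut)) != cudaSuccess) { cudaFree(dIn); return -1; }

    cudaMemcpy(dIn, hIn, sizeof(hIn), cudaMemcpyHostToDevice);
    matchKernel<<<1, 32>>>(dIn, dOut);
    cudaError_t err = cudaMemcpy(hOut, dOut, sizeof(hOut), cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dOut);
    if (err != cudaSuccess) return -1;

    int mismatches = 0;
    for (int i = 0; i < 32; i++)
    {
        // Host reference: bit j is set when lane j holds the same value as lane i.
        unsigned int expected = 0;
        for (int j = 0; j < 32; j++)
            if (hIn[j] == hIn[i]) expected |= 1u << j;
        if (hOut[i] != expected) mismatches++;
    }
    return mismatches;
}

int main()
{
    int mismatches = countMismatches();
    printf("mismatching lanes: %d\n", mismatches);
    return mismatches == 0 ? 0 : 1;
}
```

Building this with the same `-gencode=arch=compute_61,code=sm_61` options as in the log above should select the fallback branch, while a 7.x target should select the intrinsic. The fallback loops once per distinct value in the warp, so it is slower than the hardware instruction when lanes hold many different values.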

s-laine commented 1 year ago

Right - I didn't test on any pre-7.0 hardware so didn't notice that. As a quick fix, you can comment out or remove the entire if block, i.e., lines 216-222 in BinRaster.inl. That's just an optimization for cases where every triangle in a warp overlaps exactly one bin. I'll fix this one way or another in the next release.

Thanks for the report!

s-laine commented 1 year ago

Fixed now in latest release - closing.