Open dengchenlong opened 3 weeks ago
As long as STDGPU_BACKEND=STDGPU_BACKEND_OPENMP, everything works normally.
I can reproduce these compilation errors on Ubuntu 22.04 + CUDA 12.5 + the latest commit from the master branch. Furthermore, only the CUDA backend seems to be affected. More precisely, I suspect that the problem is located somewhere in thrust, since several CUDA-only expressions coming from there are incorrectly used during the compilation of a .cpp file (in your case iterator.cpp).
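To illustrate the suspected failure mode: a host-only compiler such as g++ reaches code that only a CUDA compiler can compile. Below is a minimal sketch of the usual guard pattern, assuming `__CUDACC__` as the detection macro (it is defined when a CUDA compiler such as nvcc processes the translation unit). The names `compiled_as_cuda` and `shared_address_sketch` are hypothetical and exist only for this example; this is not the actual code or fix in thrust, CUB, or stdgpu.

```cpp
#include <cstdint>

// Reports whether this translation unit is being processed by a CUDA compiler.
// In a plain .cpp file built with g++, __CUDACC__ is not defined.
constexpr bool compiled_as_cuda()
{
#if defined(__CUDACC__)
    return true;
#else
    return false;
#endif
}

#if defined(__CUDACC__)
// Device-only helper (hypothetical name). The intrinsic below is declared only
// by the CUDA toolchain, so the whole function is hidden from host compilers.
__device__ inline std::uint32_t shared_address_sketch(const void* ptr)
{
    return static_cast<std::uint32_t>(__cvta_generic_to_shared(ptr));
}
#endif
```

Without such a guard, a host compiler reports the intrinsic as undeclared, which is the shape of the errors in the log. With it, the same header can be included from both .cu and .cpp files.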
A very similar error in Open3D has also been reported but within a different part of it: https://github.com/isl-org/Open3D/issues/6813
I saw the same issue when building Open3D with -DBUILD_CUDA_MODULE=ON on Ubuntu 22.04 + CUDA 12.5, using the Open3D/stdgpu CMake setup: GIT_REPOSITORY https://github.com/stotko/stdgpu.git GIT_TAG master
In file included from /usr/local/cuda/include/cuda/std/detail/libcxx/include/__cuda/ptx/instructions/barrier_cluster.h:30,
from /usr/local/cuda/include/cuda/std/detail/libcxx/include/__cuda/ptx.h:74,
from /usr/local/cuda/include/cuda/ptx:19,
from /usr/local/cuda/include/cuda/discard_memory:25,
from /usr/local/cuda/include/cub/util_device.cuh:57,
from /usr/local/cuda/include/thrust/system/cuda/detail/util.h:48,
from /usr/local/cuda/include/thrust/system/cuda/detail/malloc_and_free.h:34,
from /usr/local/cuda/include/thrust/system/detail/adl/malloc_and_free.h:50,
from /usr/local/cuda/include/thrust/system/detail/generic/memory.inl:30,
from /usr/local/cuda/include/thrust/system/detail/generic/memory.h:77,
from /usr/local/cuda/include/thrust/detail/reference.h:36,
from /home/xzhao/workdir/Open3D/build_debug/stdgpu/src/ext_stdgpu/src/stdgpu/../stdgpu/iterator.h:29,
from /home/xzhao/workdir/Open3D/build_debug/stdgpu/src/ext_stdgpu/src/stdgpu/impl/iterator.cpp:16:
/usr/local/cuda/include/cuda/std/detail/libcxx/include/__cuda/ptx/instructions/../ptx_helper_functions.h: In function ‘uint32_t cuda::ptx::__4::__as_ptr_smem(const void*)’:
/usr/local/cuda/include/cuda/std/detail/libcxx/include/__cuda/ptx/instructions/../ptx_helper_functions.h:40:44: error: ‘__cvta_generic_to_shared’ was not declared in this scope
40 | return static_cast<_CUDA_VSTD::uint32_t>(__cvta_generic_to_shared(__ptr));
| ^~~~~~~~
/usr/local/cuda/include/cuda/std/detail/libcxx/include/__cuda/ptx/instructions/../ptx_helper_functions.h: In function ‘uint64_t cuda::ptx::__4::__as_ptr_gmem(const void*)’:
/usr/local/cuda/include/cuda/std/detail/libcxx/include/__cuda/ptx/instructions/../ptx_helper_functions.h:60:44: error: ‘__cvta_generic_to_global’ was not declared in this scope
60 | return static_cast<_CUDA_VSTD::uint64_t>(__cvta_generic_to_global(__ptr));
| ^~~~~~~~
/usr/local/cuda/include/cuda/std/detail/libcxx/include/__cuda/ptx/instructions/../ptx_helper_functions.h: In function ‘_Tp* cuda::ptx::__4::__from_ptr_smem(size_t)’:
/usr/local/cuda/include/cuda/std/detail/libcxx/include/__cuda/ptx/instructions/../ptx_helper_functions.h:73:33: error: there are no arguments to ‘__cvta_shared_to_generic’ that depend on a template parameter, so a declaration of ‘__cvta_shared_to_generic’ must be available [-fpermissive]
73 | return reinterpret_cast<_Tp*>(__cvta_shared_to_generic(__ptr));
| ^~~~~~~~
/usr/local/cuda/include/cuda/std/detail/libcxx/include/__cuda/ptx/instructions/../ptx_helper_functions.h:73:33: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated)
/usr/local/cuda/include/cuda/std/detail/libcxx/include/__cuda/ptx/instructions/../ptx_helper_functions.h: In function ‘_Tp* cuda::ptx::__4::__from_ptr_gmem(size_t)’:
/usr/local/cuda/include/cuda/std/detail/libcxx/include/__cuda/ptx/instructions/../ptx_helper_functions.h:94:33: error: there are no arguments to ‘__cvta_global_to_generic’ that depend on a template parameter, so a declaration of ‘__cvta_global_to_generic’ must be available [-fpermissive]
94 | return reinterpret_cast<_Tp*>(__cvta_global_to_generic(__ptr));
| ^~~~~~~~
In file included from /usr/local/cuda/include/thrust/system/cuda/detail/util.h:48,
from /usr/local/cuda/include/thrust/system/cuda/detail/malloc_and_free.h:34,
from /usr/local/cuda/include/thrust/system/detail/adl/malloc_and_free.h:50,
from /usr/local/cuda/include/thrust/system/detail/generic/memory.inl:30,
from /usr/local/cuda/include/thrust/system/detail/generic/memory.h:77,
from /usr/local/cuda/include/thrust/detail/reference.h:36,
from /home/xzhao/workdir/Open3D/build_debug/stdgpu/src/ext_stdgpu/src/stdgpu/../stdgpu/iterator.h:29,
from /home/xzhao/workdir/Open3D/build_debug/stdgpu/src/ext_stdgpu/src/stdgpu/impl/iterator.cpp:16:
/usr/local/cuda/include/cub/util_device.cuh: In static member function ‘static typename AgentT::TempStorage& cub::CUB_200400___CUDA_ARCH_LIST___NS::detail::vsmem_helper_impl~~~
/usr/local/cuda/include/cub/util_device.cuh: In static member function ‘static bool cub::CUB_200400___CUDA_ARCH_LIST___NS::detail::vsmem_helper_impl~~~~
/usr/local/cuda/include/cub/util_device.cuh:202:50: error: ‘blockDim’ was not declared in this scope
202 | const std::size_t block_stride = line_size ~~~
Describe the bug
Building the Visual Studio project fails when the CUDA backend is used with CUDA 12.5.
Steps to reproduce
Set STDGPU_BACKEND to STDGPU_BACKEND_CUDA.
Expected behavior
The build succeeds.
Actual behavior
The build fails.
CMake configuration output:
VS building output:
System: