alpaka-group / alpaka

Abstraction Library for Parallel Kernel Acceleration :llama:
https://alpaka.readthedocs.io
Mozilla Public License 2.0

fix usage of Idx to alpaka::Idx #2265

Closed ichinii closed 1 month ago

ichinii commented 1 month ago

On Windows, Idx resolves differently than alpaka::Idx, so alpaka::Idx should be used directly.

Compile log:

[build]   Compiling CUDA source file ..\..\..\..\example\parallelLoopPatterns\src\parallelLoopPatterns.cpp...
[build]   
[build]   C:\Users\ich\Desktop\hzb\forkalpaka\build\gpu-cuda-nvcc\example\parallelLoopPatterns>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\bin\nvcc.exe"  --use-local-env -ccbin "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.39.33519\bin\HostX64\x64" -x cu   -IC:\Users\ich\Desktop\hzb\forkalpaka\include -IC:\local\boost_1_84_0 -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\include"  -G   --keep-dir parallel.1B88E8CE\x64\Debug  -maxrregcount=0   --machine 64 --compile -cudart static -std=c++17 --generate-code=arch=compute_52,code=[compute_52,sm_52] --extended-lambda --expt-relaxed-constexpr --display-error-number -Xcompiler="/EHsc -Zi -Ob0" -g  -D_WINDOWS -D_USE_MATH_DEFINES -DALPAKA_ACC_CPU_B_SEQ_T_SEQ_ENABLED -DALPAKA_ACC_GPU_CUDA_ENABLED -DALPAKA_DEBUG=0 -DALPAKA_BLOCK_SHARED_DYN_MEMBER_ALLOC_KIB=47 -D"CMAKE_INTDIR=\"Debug\"" -D_MBCS -D"CMAKE_INTDIR=\"Debug\"" -Xcompiler "/EHsc /W1 /nologo /Od /FS /Zi /RTC1 /MDd " -Xcompiler "/FdparallelLoopPatterns.dir\Debug\vc143.pdb" -o parallelLoopPatterns.dir\Debug\parallelLoopPatterns.obj "C:\Users\ich\Desktop\hzb\forkalpaka\example\parallelLoopPatterns\src\parallelLoopPatterns.cpp" 
[build] C:\Users\ich\Desktop\hzb\forkalpaka\include\alpaka/mem/buf/BufUniformCudaHipRt.hpp(81): error : Idx is not a template [C:\Users\ich\Desktop\hzb\forkalpaka\build\gpu-cuda-nvcc\example\parallelLoopPatterns\parallelLoopPatterns.vcxproj]
[build]                     std::is_same_v<TIdx, Idx<TExtent>>,
[build]                                          ^
[build]             detected during:
[build]               instantiation of "alpaka::BufUniformCudaHipRt<TApi, TElem, TDim, TIdx>::BufUniformCudaHipRt(const alpaka::DevUniformCudaHipRt<TApi> &, TElem *, Deleter, const TExtent &, size_t) [with TApi=alpaka::ApiCudaRt, TElem=float, TDim=std::integral_constant<size_t, 1ULL>, TIdx=uint32_t, TExtent=uint32_t, Deleter=lambda [](float *)->void]" at line 268
[build]               instantiation of "auto alpaka::trait::BufAlloc<TElem, Dim, TIdx, alpaka::DevUniformCudaHipRt<TApi>, void>::allocBuf(const alpaka::DevUniformCudaHipRt<TApi> &, const TExtent &)->alpaka::BufUniformCudaHipRt<TApi, TElem, Dim, TIdx> [with TApi=alpaka::ApiCudaRt, TElem=float, Dim=std::integral_constant<size_t, 1ULL>, TIdx=uint32_t, TExtent=uint32_t]" at line 66 of C:\Users\ich\Desktop\hzb\forkalpaka\include\alpaka/mem/buf/Traits.hpp
[build]   
[build] C:\Users\ich\Desktop\hzb\forkalpaka\include\alpaka/mem/buf/BufUniformCudaHipRt.hpp(80): error : static assertion failed with "The idx type of TExtent and the TIdx template parameter have to be identical!" [C:\Users\ich\Desktop\hzb\forkalpaka\build\gpu-cuda-nvcc\example\parallelLoopPatterns\parallelLoopPatterns.vcxproj]
[build]                 static_assert(
[build]                 ^
[build]             detected during:
[build]               instantiation of "alpaka::BufUniformCudaHipRt<TApi, TElem, TDim, TIdx>::BufUniformCudaHipRt(const alpaka::DevUniformCudaHipRt<TApi> &, TElem *, Deleter, const TExtent &, size_t) [with TApi=alpaka::ApiCudaRt, TElem=float, TDim=std::integral_constant<size_t, 1ULL>, TIdx=uint32_t, TExtent=uint32_t, Deleter=lambda [](float *)->void]" at line 268
[build]               instantiation of "auto alpaka::trait::BufAlloc<TElem, Dim, TIdx, alpaka::DevUniformCudaHipRt<TApi>, void>::allocBuf(const alpaka::DevUniformCudaHipRt<TApi> &, const TExtent &)->alpaka::BufUniformCudaHipRt<TApi, TElem, Dim, TIdx> [with TApi=alpaka::ApiCudaRt, TElem=float, Dim=std::integral_constant<size_t, 1ULL>, TIdx=uint32_t, TExtent=uint32_t]" at line 66 of C:\Users\ich\Desktop\hzb\forkalpaka\include\alpaka/mem/buf/Traits.hpp
[build]   
[build]   2 errors detected in the compilation of "C:/Users/ich/Desktop/hzb/forkalpaka/example/parallelLoopPatterns/src/parallelLoopPatterns.cpp".
[build]   parallelLoopPatterns.cpp
[build] C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Microsoft\VC\v170\BuildCustomizations\CUDA 12.4.targets(799,9): error MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\bin\nvcc.exe"  --use-local-env -ccbin "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.39.33519\bin\HostX64\x64" -x cu   -IC:\Users\ich\Desktop\hzb\forkalpaka\include -IC:\local\boost_1_84_0 -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\include"  -G   --keep-dir parallel.1B88E8CE\x64\Debug  -maxrregcount=0   --machine 64 --compile -cudart static -std=c++17 --generate-code=arch=compute_52,code=[compute_52,sm_52] --extended-lambda --expt-relaxed-constexpr --display-error-number -Xcompiler="/EHsc -Zi -Ob0" -g  -D_WINDOWS -D_USE_MATH_DEFINES -DALPAKA_ACC_CPU_B_SEQ_T_SEQ_ENABLED -DALPAKA_ACC_GPU_CUDA_ENABLED -DALPAKA_DEBUG=0 -DALPAKA_BLOCK_SHARED_DYN_MEMBER_ALLOC_KIB=47 -D"CMAKE_INTDIR=\"Debug\"" -D_MBCS -D"CMAKE_INTDIR=\"Debug\"" -Xcompiler "/EHsc /W1 /nologo /Od /FS /Zi /RTC1 /MDd " -Xcompiler "/FdparallelLoopPatterns.dir\Debug\vc143.pdb" -o parallelLoopPatterns.dir\Debug\parallelLoopPatterns.obj "C:\Users\ich\Desktop\hzb\forkalpaka\example\parallelLoopPatterns\src\parallelLoopPatterns.cpp"" exited with code 1. [C:\Users\ich\Desktop\hzb\forkalpaka\build\gpu-cuda-nvcc\example\parallelLoopPatterns\parallelLoopPatterns.vcxproj]


Edit: msvc: Microsoft (R) C/C++ Optimizing Compiler Version 19.39.33523 for x64, installed with Visual Studio 2022; VSCode: 1.89.0

fwyzard commented 1 month ago

What does the compiler resolve Idx to?

mehmetyusufoglu commented 1 month ago

TExtent=uint32_t

Strange issue. In idx/Traits.hpp, the Idx alias template is defined before the specialisation. Namely, template<typename T> using Idx = typename trait::IdxType<T>::type; is defined before the trait::IdxType specialisation for the is_arithmetic case. Might moving the alias template after the specialisation help?

psychocoderHPC commented 1 month ago

Might moving the alias template after the specialisation help?

No, this should not help. The type is forward declared, so all is fine. If a non-arithmetic type is used, it will fail with an error about a missing definition or an incomplete type.
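The ordering argument can be illustrated with a minimal sketch. Everything lives in a hypothetical `demo` namespace that is only modelled on the alias quoted above; the specialisation's body (`using type = T;`) is an assumption and not copied from alpaka. The point is that the alias template only needs the declaration of `trait::IdxType`, while the specialisation is selected later, when the alias is actually instantiated.

```cpp
#include <cstdint>
#include <type_traits>

namespace demo
{
    namespace trait
    {
        // Forward declaration only: no definition of the primary template.
        template<typename T, typename TSfinae = void>
        struct IdxType;
    } // namespace trait

    // The alias template is defined before any specialisation exists.
    template<typename T>
    using Idx = typename trait::IdxType<T>::type;

    namespace trait
    {
        // Specialisation for arithmetic types, declared after the alias.
        // Assumption for this sketch: the index type is T itself.
        template<typename T>
        struct IdxType<T, std::enable_if_t<std::is_arithmetic<T>::value>>
        {
            using type = T;
        };
    } // namespace trait
} // namespace demo

// The specialisation is found at instantiation time, so the order is harmless.
static_assert(std::is_same<demo::Idx<std::uint32_t>, std::uint32_t>::value, "Idx of uint32_t");
```

Using `demo::Idx` with a non-arithmetic type such as a plain struct would fail to compile in this sketch, because only the undefined primary template matches.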

psychocoderHPC commented 1 month ago

@ichinii Could you please add the nvcc and Visual Studio versions you used when the issue showed up?

psychocoderHPC commented 1 month ago

Oh wait, @ichinii, could you please check whether adding #include "alpaka/idx/Traits.hpp" to BufUniformCudaHipRt.hpp and ViewSubView.hpp, instead of adding alpaka::, also solves the issue? I think the problem is that under Linux we pull in the Idx trait transitively, and for some reason this is not the case on Windows.

ichinii commented 1 month ago

Thanks for your quick responses. I got back to my Windows machine today. Let me try to answer your questions.

What does the compiler resolve Idx to?

Just declaring a variable of type Idx<TExtent> raises an error. It looks like it resolves to alpaka::internal::ViewAccessOps<TView>::Idx. It is kind of strange that it resolves to ViewAccessOps<TView>::Idx even though we are talking about BufUniformCudaHipRt here.

[build] C:\Users\ich\Desktop\hzb\forkalpaka\include\alpaka/mem/buf/BufUniformCudaHipRt.hpp(77): error : type "alpaka::internal::ViewAccessOps<TView>::Idx [with TView=alpaka::BufUniformCudaHipRt<alpaka::ApiCudaRt, int32_t, std::integral_constant<size_t, 1ULL>, uint64_t>]" (declared at line 45 of C:\Users\ich\Desktop\hzb\forkalpaka\include\alpaka/mem/view/ViewAccessOps.hpp) is inaccessible [C:\Users\ich\Desktop\hzb\forkalpaka\build\gpu-cuda-nvcc\test\unit\mem\copy\bufSlicingTest.vcxproj]
[build]                 Idx<TExtent> test;
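One plausible reading of this diagnostic is that the compiler consults the dependent base class during unqualified lookup and finds a private member type named Idx there, which is not a template. Under standard two-phase lookup, dependent base classes are not searched for unqualified names, so a conforming compiler would find the namespace-level alias instead. The sketch below reproduces that shape with hypothetical names in a `demo` namespace (the private member's type is an assumption, not alpaka's definition) and shows that the qualified spelling does not depend on which lookup behaviour the compiler implements.

```cpp
#include <cstdint>
#include <type_traits>

namespace demo
{
    namespace trait
    {
        // Simplified stand-in for the index trait: the index type is T itself.
        template<typename T>
        struct IdxType
        {
            using type = T;
        };
    } // namespace trait

    // Namespace-level alias template, analogous to alpaka::Idx.
    template<typename T>
    using Idx = typename trait::IdxType<T>::type;

    namespace internal
    {
        template<typename TView>
        struct ViewAccessOps
        {
        private:
            // Hypothetical private member with the same name as the alias
            // template above. It is a plain type, not a template.
            using Idx = std::int64_t;
        };
    } // namespace internal

    // The base class depends on the template parameter (CRTP).
    template<typename TIdx>
    struct Buf : internal::ViewAccessOps<Buf<TIdx>>
    {
        template<typename TExtent>
        static constexpr bool hasMatchingIdx()
        {
            // Qualified name: always refers to demo::Idx, never to the
            // private base-class member of the same name.
            return std::is_same<TIdx, demo::Idx<TExtent>>::value;
        }
    };
} // namespace demo

static_assert(demo::Buf<std::uint32_t>::hasMatchingIdx<std::uint32_t>(), "index types match");
```

Writing the unqualified `Idx<TExtent>` inside `hasMatchingIdx` is where compilers may differ: one that searches the dependent base would see the private, non-template member first.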

@ichinii Could you please add the nvcc and Visual Studio versions you used when the issue showed up?

msvc: Microsoft (R) C/C++ Optimizing Compiler Version 19.39.33523 for x64, installed with Visual Studio 2022; VSCode: 1.89.0

Oh wait, @ichinii, could you please check whether adding #include "alpaka/idx/Traits.hpp" to BufUniformCudaHipRt.hpp and ViewSubView.hpp, instead of adding alpaka::, also solves the issue? I think the problem is that under Linux we pull in the Idx trait transitively, and for some reason this is not the case on Windows.

I added the include to BufUniformCudaHipRt.hpp. It is already present in ViewSubView.hpp. As far as I can tell, it results in the same behaviour.

psychocoderHPC commented 1 month ago

Thanks for fixing the issue!