LLNL / camp

Compiler agnostic metaprogramming library providing concepts, type operations and tuples for C++ and cuda

Excessive Recursion Error with RAJA #91

Open rchen20 opened 2 years ago

rchen20 commented 2 years ago

When a RAJA test is instantiated with too many combinations of template parameters, compilation fails with an excessive recursion error under gcc/8.3.1 + cuda/10.1.243. It occurs when there are 3 * 3 * 3 * 2 combinations of template parameters. See https://github.com/LLNL/RAJA/pull/1165

Make verbose line - cd /usr/WS1/chen59/allraja/rajaminblocks/raja_git_minblocks/build_lc_blueos-nvcc10.1.243-sm_70-gcc8.3.1/test/unit/workgroup && /usr/tce/packages/cuda/cuda-10.1.243/bin/nvcc -ccbin=/usr/tce/packages/gcc/gcc-8.3.1/bin/g++ -DGTEST_HAS_DEATH_TEST=1 -I/usr/WS1/chen59/allraja/rajaminblocks/raja_git_minblocks/test/include -I/usr/WS1/chen59/allraja/rajaminblocks/raja_git_minblocks/test/unit/workgroup/tests -I/usr/WS1/chen59/allraja/rajaminblocks/raja_git_minblocks/blt/thirdparty_builtin/googletest-master-2020-01-07/googletest -I/usr/WS1/chen59/allraja/rajaminblocks/raja_git_minblocks/include -I/usr/WS1/chen59/allraja/rajaminblocks/raja_git_minblocks/build_lc_blueos-nvcc10.1.243-sm_70-gcc8.3.1/include -I/usr/tce/packages/cuda/cuda-10.1.243/include -I/usr/WS1/chen59/allraja/rajaminblocks/raja_git_minblocks/tpl/camp/include -isystem=/usr/WS1/chen59/allraja/rajaminblocks/raja_git_minblocks/blt/thirdparty_builtin/googletest-master-2020-01-07/googletest/include -isystem=/usr/WS1/chen59/allraja/rajaminblocks/raja_git_minblocks/tpl/cub -Xcompiler -mno-float128 -restrict -arch sm_70 --expt-extended-lambda --expt-relaxed-constexpr -Xcudafe "--display_error_number" -O3 -Xcompiler -O3 -Xcompiler -finline-functions -Xcompiler -fopenmp -Xcompiler=-fPIE -Xcompiler=-fopenmp -std=c++14 -x cu -c /usr/WS1/chen59/allraja/rajaminblocks/raja_git_minblocks/build_lc_blueos-nvcc10.1.243-sm_70-gcc8.3.1/test/unit/workgroup/test-workgroup-Enqueue-Multiple-Cuda.cpp -o CMakeFiles/test-workgroup-Enqueue-Multiple-Cuda.exe.dir/test-workgroup-Enqueue-Multiple-Cuda.cpp.o

Error output - /usr/WS1/chen59/allraja/rajaminblocks/raja_git_minblocks/tpl/camp/include/camp/camp.hpp(104): error #456: excessive recursion at instantiation of class "camp::join<camp::list<camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::ordered, RAJA::policy::workgroup::array_of_pointers, int, RAJA::xargs<>, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::ordered, RAJA::policy::workgroup::array_of_pointers, int, RAJA::xargs<int >, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::ordered, RAJA::policy::workgroup::array_of_pointers, int, RAJA::xargs<int, int >, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::ordered, RAJA::policy::workgroup::array_of_pointers, long, RAJA::xargs<>, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::ordered, RAJA::policy::workgroup::array_of_pointers, long, RAJA::xargs<int >, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::ordered, RAJA::policy::workgroup::array_of_pointers, long, RAJA::xargs<int, int >, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::ordered, RAJA::policy::workgroup::array_of_pointers, long, RAJA::xargs<>, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::ordered, RAJA::policy::workgroup::array_of_pointers, long, RAJA::xargs<int >, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::ordered, RAJA::policy::workgroup::array_of_pointers, long, RAJA::xargs<int, int >, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::ordered, 
RAJA::policy::workgroup::ragged_array_of_objects, int, RAJA::xargs<>, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::ordered, RAJA::policy::workgroup::ragged_array_of_objects, int, RAJA::xargs<int >, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::ordered, RAJA::policy::workgroup::ragged_array_of_objects, int, RAJA::xargs<int, int >, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::ordered, RAJA::policy::workgroup::ragged_array_of_objects, long, RAJA::xargs<>, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::ordered, RAJA::policy::workgroup::ragged_array_of_objects, long, RAJA::xargs<int >, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::ordered, RAJA::policy::workgroup::ragged_array_of_objects, long, RAJA::xargs<int, int >, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::ordered, RAJA::policy::workgroup::ragged_array_of_objects, long, RAJA::xargs<>, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::ordered, RAJA::policy::workgroup::ragged_array_of_objects, long, RAJA::xargs<int >, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::ordered, RAJA::policy::workgroup::ragged_array_of_objects, long, RAJA::xargs<int, int >, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::ordered, RAJA::policy::workgroup::constant_stride_array_of_objects, int, RAJA::xargs<>, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::ordered, 
RAJA::policy::workgroup::constant_stride_array_of_objects, int, RAJA::xargs<int >, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::ordered, RAJA::policy::workgroup::constant_stride_array_of_objects, int, RAJA::xargs<int, int >, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::ordered, RAJA::policy::workgroup::constant_stride_array_of_objects, long, RAJA::xargs<>, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::ordered, RAJA::policy::workgroup::constant_stride_array_of_objects, long, RAJA::xargs<int >, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::ordered, RAJA::policy::workgroup::constant_stride_array_of_objects, long, RAJA::xargs<int, int >, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::ordered, RAJA::policy::workgroup::constant_stride_array_of_objects, long, RAJA::xargs<>, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::ordered, RAJA::policy::workgroup::constant_stride_array_of_objects, long, RAJA::xargs<int >, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::ordered, RAJA::policy::workgroup::constant_stride_array_of_objects, long, RAJA::xargs<int, int >, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::reverse_ordered, RAJA::policy::workgroup::array_of_pointers, int, RAJA::xargs<>, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::reverse_ordered, RAJA::policy::workgroup::array_of_pointers, int, RAJA::xargs<int >, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, 
RAJA::policy::workgroup::reverse_ordered, RAJA::policy::workgroup::array_of_pointers, int, RAJA::xargs<int, int >, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::reverse_ordered, RAJA::policy::workgroup::array_of_pointers, long, RAJA::xargs<>, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::reverse_ordered, RAJA::policy::workgroup::array_of_pointers, long, RAJA::xargs<int >, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::reverse_ordered, RAJA::policy::workgroup::array_of_pointers, long, RAJA::xargs<int, int >, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::reverse_ordered, RAJA::policy::workgroup::array_of_pointers, long, RAJA::xargs<>, detail::ResourceAllocator::std_allocator>, camp::list<RAJA::cuda_work<256UL, false>, RAJA::policy::workgroup::reverse_ordered, RAJA::policy::workgroup::

rhornung67 commented 2 years ago

"Sometimes"? Do we have any idea why this would not happen every time? I have not seen this.

rchen20 commented 2 years ago

> "Sometimes"? Do we have any idea why this would not happen every time? I have not seen this.

Sorry, that's a typo, it happens all the time.

trws commented 2 years ago

This probably depends on the compiler, at least. gcc usually has a much deeper template recursion limit than clang, but it looks like this one is landing on the straightforward recursive implementation of camp::join. It should be relatively easy to fix.
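
To illustrate the point about recursion depth, here is a minimal, self-contained sketch. It is not camp's actual code: `list`, `join_linear`, and `join_chunked` are hypothetical stand-ins written for this example. `join_linear` merges two lists per step, so joining N lists nests roughly N instantiations deep. `join_chunked` merges four lists per step, which cuts the depth to roughly a third. That is one possible way to stay under a front end's instantiation limit, and not necessarily the fix camp adopted.

```cpp
#include <type_traits>

// Hypothetical stand-in for a type list; not camp's definition.
template <typename... Ts>
struct list {};

// Straightforward recursive join: folds the first two lists together and
// recurses on the rest, so instantiation depth grows linearly with the
// number of lists.
template <typename... Lists>
struct join_linear;

template <>
struct join_linear<> { using type = list<>; };

template <typename... As>
struct join_linear<list<As...>> { using type = list<As...>; };

template <typename... As, typename... Bs, typename... Rest>
struct join_linear<list<As...>, list<Bs...>, Rest...> {
  using type = typename join_linear<list<As..., Bs...>, Rest...>::type;
};

// Chunked join (an assumption for illustration): merges four lists per
// step, so each recursive step removes three lists instead of one.
template <typename... Lists>
struct join_chunked;

template <>
struct join_chunked<> { using type = list<>; };

template <typename... As>
struct join_chunked<list<As...>> { using type = list<As...>; };

template <typename... As, typename... Bs>
struct join_chunked<list<As...>, list<Bs...>> {
  using type = list<As..., Bs...>;
};

template <typename... As, typename... Bs, typename... Cs>
struct join_chunked<list<As...>, list<Bs...>, list<Cs...>> {
  using type = list<As..., Bs..., Cs...>;
};

template <typename... As, typename... Bs, typename... Cs, typename... Ds,
          typename... Rest>
struct join_chunked<list<As...>, list<Bs...>, list<Cs...>, list<Ds...>,
                    Rest...> {
  using type =
      typename join_chunked<list<As..., Bs..., Cs..., Ds...>, Rest...>::type;
};

// Compile-time check: both variants produce the same flattened list.
using expected = list<int, long, char, short, float, double>;

static_assert(
    std::is_same<join_linear<list<int>, list<long>, list<>,
                             list<char, short>, list<float>,
                             list<double>>::type,
                 expected>::value,
    "linear join flattens in order");

static_assert(
    std::is_same<join_chunked<list<int>, list<long>, list<>,
                              list<char, short>, list<float>,
                              list<double>>::type,
                 expected>::value,
    "chunked join flattens in order");
```

The sketch compiles as C++14, matching the `-std=c++14` flag in the failing command. Widening the chunk lowers the depth further at the cost of more partial specializations.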