adayton1 opened 1 year ago
Hi Alan, I looked at ArrayBase.hpp and I believe switching from #if defined(AXOM_GPUCC) to #if defined(AXOM_USE_RAJA) && (defined(AXOM_USE_CUDA) || defined(AXOM_USE_HIP)) should fix your first issue, where CUDA is enabled for the application but disabled for Axom.
With regard to the second case (CUDA enabled for Axom, but disabled for a dependent library), what errors are you seeing? In this case, I think we need to be careful not to introduce ABI incompatibilities caused by differing defines between Axom's build environment and the dependent library's build environment. Thoughts @white238 ?
I attempted using #if defined(AXOM_USE_CUDA) to fix the case where CUDA is disabled in Axom but enabled for the application, but now I'm running into a different error:
include/axom/core/ArrayBase.hpp(1286): error: incomplete type is not allowed
detected during:
instantiation of "void axom::detail::ArrayOps<T, axom::MemorySpace::Dynamic>::destroy(T *, axom::IndexType, axom::IndexType, int) [with T=axom::IndexType]"
include/axom/core/Array.hpp(1040): here
instantiation of "void axom::Array<T, DIM, SPACE>::clear() [with T=axom::IndexType, DIM=1, SPACE=axom::MemorySpace::Dynamic]"
include/axom/core/Array.hpp(1007): here
instantiation of "axom::Array<T, DIM, SPACE>::~Array() [with T=axom::IndexType, DIM=1, SPACE=axom::MemorySpace::Dynamic]"
include/axom/core/Array.hpp(810): here
instantiation of "axom::Array<T, DIM, SPACE>::Array() [with T=axom::IndexType, DIM=1, SPACE=axom::MemorySpace::Dynamic]"
include/axom/sidre/core/Array.hpp(377): here
instantiation of "axom::sidre::Array<T, DIM>::Array(axom::sidre::View *) [with T=axom::IndexType, DIM=1, SFINAE=1, <unnamed>=(void *)nullptr]"
/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc/ppc64le-redhat-linux/8/../../../../include/c++/8/bits/unique_ptr.h(831): here
instantiation of "std::_MakeUniq<_Tp>::__single_object std::make_unique<_Tp,_Args...>(_Args &&...) [with _Tp=axom::sidre::Array<axom::IndexType, 1>, _Args=<axom::sidre::View *&>]"
include/axom/mint/mesh/internal/ConnectivityArrayHelpers.hpp(117): here
This is the error with CUDA enabled in Axom, but CUDA is off in a library that depends on Axom:
In file included from tribol/src/tribol/search/InterfacePairFinder.cpp:23:
In file included from include/axom/spin.hpp:12:
In file included from include/axom/spin/BVH.hpp:23:
In file included from include/axom/spin/policy/LinearBVH.hpp:21:
In file included from include/axom/spin/internal/linear_bvh/build_radix_tree.hpp:36:
In file included from /usr/tce/packages/cuda/cuda-11.7.0/include/cub/device/device_radix_sort.cuh:40:
In file included from /usr/tce/packages/cuda/cuda-11.7.0/include/cub/device/dispatch/dispatch_radix_sort.cuh:40:
In file included from /usr/tce/packages/cuda/cuda-11.7.0/include/cub/device/dispatch/../../agent/agent_radix_sort_histogram.cuh:37:
In file included from /usr/tce/packages/cuda/cuda-11.7.0/include/cub/device/dispatch/../../agent/../block/block_load.cuh:39:
In file included from /usr/tce/packages/cuda/cuda-11.7.0/include/cub/device/dispatch/../../agent/../block/../block/block_exchange.cuh:37:
/usr/tce/packages/cuda/cuda-11.7.0/include/cub/device/dispatch/../../agent/../block/../util_ptx.cuh:229:5: error: use of undeclared identifier '__syncthreads'
__syncthreads();
^
Eventually the goal is to turn on CUDA in that library, but right now there are other issues with turning on CUDA in that library.
@adayton1 Is your intent here to enable CUDA in RAJA, Umpire, etc. but disable CUDA in Axom and anything that depends on Axom? If so, then I think we want to make CUDA, HIP, etc. CMake-dependent options in Axom. Then you set ENABLE_CUDA=ON for the things in which you want CUDA enabled, and AXOM_ENABLE_CUDA=OFF for Axom and the things that depend on Axom. That is not currently supported in Axom, but it seems useful. We do this in RAJA, Umpire, etc. to support similar build scenarios.
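A hedged sketch of what the dependent option could look like in Axom's CMake (the option names here are assumptions based on the description above, not Axom's actual cache variables):

```cmake
# Hypothetical sketch: AXOM_ENABLE_CUDA defaults to ON only when the
# global ENABLE_CUDA is ON, but can be independently forced OFF for
# Axom and its dependents, mirroring RAJA's and Umpire's approach.
include(CMakeDependentOption)
cmake_dependent_option(AXOM_ENABLE_CUDA
  "Build Axom with CUDA support" ON
  "ENABLE_CUDA" OFF)
```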
@white238 @kennyweiss @davidbeckingsale what are your thoughts?
That's a reasonable option. The other would be to carefully guard code in Axom so that a CUDA build of Axom can be included and used in a non-CUDA project; we do this in RAJA with a RAJA_CUDA_ACTIVE macro guard.
It's not clear to me which path is easier to maintain. As macros and their use get more intricate, confusion and incorrect usage seem likely to increase.
The CMake dependent option is definitely easier to maintain.
Just wanted to provide an update. Changing AXOM_GPUCC to AXOM_USE_CUDA in ArrayBase.hpp allowed me to turn CUDA off in Axom and still build my application with CUDA on.
I have ENABLE_CUDA=ON for CAMP, Umpire, and RAJA. I set ENABLE_CUDA=OFF in Axom, as well as in several other libraries that use Axom (mainly I need to turn CUDA off in Spheral, since there are build issues there). However, in our application we do set ENABLE_CUDA=ON, and I get a compiler error.
I believe there were also build errors coming from Axom headers when I enabled CUDA in Axom, but disabled CUDA in a library that depends on Axom.
It looks like there is a lot of inconsistency in which macros are used to determine when CUDA should be used (AXOM_USE_UMPIRE, AXOM_USE_RAJA, AXOM_USE_CUDA, AXOM_GPUCC). Perhaps the strategy for choosing among them needs to be revisited.