LLNL / axom

CS infrastructure components for HPC applications
BSD 3-Clause "New" or "Revised" License

Compiler error when Axom is built with ENABLE_CUDA=OFF, but a dependent library is built with CUDA on #986

Open adayton1 opened 1 year ago

adayton1 commented 1 year ago

I have ENABLE_CUDA=ON for CAMP, Umpire, and RAJA. I set ENABLE_CUDA=OFF in Axom, as well as in several other libraries that use Axom (mainly, I need to turn CUDA off in Spheral since there are build issues there). However, in our application we set ENABLE_CUDA=ON, and I get a compiler error:

include/axom/core/ArrayBase.hpp(886): error: namespace "axom" has no member "CUDA_EXEC"

I believe there were also build errors coming from Axom headers when I enabled CUDA in Axom, but disabled CUDA in a library that depends on Axom.

It looks like there is a lot of inconsistency in which macros are used to determine when CUDA should be used (AXOM_USE_UMPIRE, AXOM_USE_RAJA, AXOM_USE_CUDA, AXOM_GPUCC). Perhaps the strategy for which macros to use where needs to be revisited.

publixsubfan commented 1 year ago

Hi Alan, I looked at ArrayBase.hpp and I believe switching from #if defined(AXOM_GPUCC) to #if defined(AXOM_USE_RAJA) && (defined(AXOM_USE_CUDA) || defined(AXOM_USE_HIP)) should fix your first issue, where CUDA is enabled for the application but disabled for Axom.

With regard to the second case of CUDA enabled for Axom, but disabled for a dependent library, what are the errors you're seeing? I think with this case, we need to be careful not to introduce ABI incompatibilities as a result of differing defines between the Axom build environment and the dependent library's build environment. Thoughts @white238 ?

adayton1 commented 1 year ago

I attempted using #if defined(AXOM_USE_CUDA) to fix the case where CUDA is disabled in Axom but enabled for the application, but now I'm running into a different error:

include/axom/core/ArrayBase.hpp(1286): error: incomplete type is not allowed
          detected during:
            instantiation of "void axom::detail::ArrayOps<T, axom::MemorySpace::Dynamic>::destroy(T *, axom::IndexType, axom::IndexType, int) [with T=axom::IndexType]" 
include/axom/core/Array.hpp(1040): here
            instantiation of "void axom::Array<T, DIM, SPACE>::clear() [with T=axom::IndexType, DIM=1, SPACE=axom::MemorySpace::Dynamic]" 
include/axom/core/Array.hpp(1007): here
            instantiation of "axom::Array<T, DIM, SPACE>::~Array() [with T=axom::IndexType, DIM=1, SPACE=axom::MemorySpace::Dynamic]" 
include/axom/core/Array.hpp(810): here
            instantiation of "axom::Array<T, DIM, SPACE>::Array() [with T=axom::IndexType, DIM=1, SPACE=axom::MemorySpace::Dynamic]" 
include/axom/sidre/core/Array.hpp(377): here
            instantiation of "axom::sidre::Array<T, DIM>::Array(axom::sidre::View *) [with T=axom::IndexType, DIM=1, SFINAE=1, <unnamed>=(void *)nullptr]" 
/usr/tce/packages/gcc/gcc-8.3.1/rh/usr/lib/gcc/ppc64le-redhat-linux/8/../../../../include/c++/8/bits/unique_ptr.h(831): here
            instantiation of "std::_MakeUniq<_Tp>::__single_object std::make_unique<_Tp,_Args...>(_Args &&...) [with _Tp=axom::sidre::Array<axom::IndexType, 1>, _Args=<axom::sidre::View *&>]" 
include/axom/mint/mesh/internal/ConnectivityArrayHelpers.hpp(117): here
adayton1 commented 1 year ago

This is the error with CUDA enabled in Axom, but CUDA is off in a library that depends on Axom:

In file included from tribol/src/tribol/search/InterfacePairFinder.cpp:23:
In file included from include/axom/spin.hpp:12:
In file included from include/axom/spin/BVH.hpp:23:
In file included from include/axom/spin/policy/LinearBVH.hpp:21:
In file included from include/axom/spin/internal/linear_bvh/build_radix_tree.hpp:36:
In file included from /usr/tce/packages/cuda/cuda-11.7.0/include/cub/device/device_radix_sort.cuh:40:
In file included from /usr/tce/packages/cuda/cuda-11.7.0/include/cub/device/dispatch/dispatch_radix_sort.cuh:40:
In file included from /usr/tce/packages/cuda/cuda-11.7.0/include/cub/device/dispatch/../../agent/agent_radix_sort_histogram.cuh:37:
In file included from /usr/tce/packages/cuda/cuda-11.7.0/include/cub/device/dispatch/../../agent/../block/block_load.cuh:39:
In file included from /usr/tce/packages/cuda/cuda-11.7.0/include/cub/device/dispatch/../../agent/../block/../block/block_exchange.cuh:37:
/usr/tce/packages/cuda/cuda-11.7.0/include/cub/device/dispatch/../../agent/../block/../util_ptx.cuh:229:5: error: use of undeclared identifier '__syncthreads'
    __syncthreads();
    ^

Eventually the goal is to turn on CUDA in that library, but right now there are other issues blocking that.

rhornung67 commented 1 year ago

@adayton1 Is your intent here to enable CUDA in RAJA, Umpire, etc. but disable CUDA in Axom and anything that depends on Axom? If so, then I think we want to make CUDA, HIP, etc. CMake dependent options in Axom. Then you set ENABLE_CUDA=On for the things you want CUDA enabled in, and AXOM_ENABLE_CUDA=Off for Axom and the things that depend on Axom. That is not currently supported in Axom, but it seems useful. We do the same in RAJA, Umpire, etc. to support similar build scenarios.
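A dependent option of the kind suggested above could look like the following sketch (option names are assumed for illustration, not taken from Axom's actual build system). The standard CMakeDependentOption module makes AXOM_ENABLE_CUDA default to the global setting while still allowing it to be turned off independently:

```cmake
include(CMakeDependentOption)

# Global switch, as set by the host project or superbuild.
option(ENABLE_CUDA "Enable CUDA support globally" OFF)

# Hypothetical project-scoped option: defaults to ON when ENABLE_CUDA is ON,
# and is forced OFF (and hidden) when ENABLE_CUDA is OFF. A user can still
# pass -DAXOM_ENABLE_CUDA=Off to disable CUDA for Axom alone.
cmake_dependent_option(AXOM_ENABLE_CUDA
  "Enable CUDA support in Axom" ON
  "ENABLE_CUDA" OFF)
```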

@white238 @kennyweiss @davidbeckingsale what are your thoughts?

davidbeckingsale commented 1 year ago

That's a reasonable option. The other would be to carefully guard things in Axom so that a CUDA build of Axom can be included and used in a non-CUDA project - we do this in RAJA with a RAJA_CUDA_ACTIVE macro guard.

rhornung67 commented 1 year ago

It's not clear to me which path is easier to maintain. As macros and their use get more intricate, confusion and incorrect usage seem likely to increase.

davidbeckingsale commented 1 year ago

The CMake dependent option is definitely easier to maintain.

adayton1 commented 1 year ago

Just wanted to provide an update. Changing AXOM_GPUCC to AXOM_USE_CUDA in ArrayBase.hpp allowed me to turn CUDA off in Axom and still build my application with CUDA on.