BradWhitlock commented 1 year ago

I happened to have a non-RAJA serial Axom build that I was using for debugging, and I tried to run the quest_shaping_driver_ex utility in it to do some shaping with klee. The program builds but cannot run: the IntersectionShaper, for example, requires RAJA to do anything useful. I was going to change the CMake logic so that quest_shaping_driver_ex is not built unless RAJA is present. Then I recalled Kenny saying something to the effect that it should be possible to build Axom without RAJA. Most of the code reflects this (e.g., use of axom::for_all rather than RAJA). I took a look at IntersectionShaper, and it has cases that use SEQ_EXEC as the execution policy, but they are currently guarded with RAJA ifdefs because some RAJA reductions and atomics are used.

I made some exploratory edits that change calls like RAJA::ReduceSum to axom::ReduceSum and provide some non-RAJA definitions. The result was a working quest_shaping_driver_ex program using the IntersectionShaper (albeit a really slow one). Is there any value in hardening this work and migrating some additional RAJA:: constructs to axom:: constructs? I expect these definitions could live down in axom/src/axom/core/execution/internal.

- Change RAJA:: calls in IntersectionShaper to axom::
- Make suitable serial (and RAJA omp, cuda, hip) wrappers for things like ReduceSum, atomicAdd, ...
- Make sure quest_shaping_driver_ex can operate in a non-RAJA build

```cpp
#include <memory>

#if !defined(AXOM_USE_RAJA)

// Serial stand-in for RAJA::ReduceSum. Copies share one accumulator.
template <typename ReducePolicy, typename T>
class ReduceSum
{
public:
  ReduceSum(T value) : sum(std::make_shared<T>(value)) { }

  // const so that a copy captured by value in a lambda can still accumulate.
  void operator+=(T value) const
  {
    T *ptr = sum.get();
    *ptr += value;
  }

  T get() const { return *sum; }

  operator int() const { return static_cast<int>(*sum); }
  operator double() const { return static_cast<double>(*sum); }

private:
  std::shared_ptr<T> sum;
};

// Serial stand-in for RAJA::atomicAdd; returns the value held before the add.
template <typename AtomicPolicy, typename Precision>
Precision atomicAdd(Precision *acc, Precision value)
{
  Precision ret = *acc;
  *acc += value;
  return ret;
}

#else

template <typename ReducePolicy, typename T>
using ReduceSum = RAJA::ReduceSum<ReducePolicy, T>;

template <typename AtomicPolicy, typename Precision>
Precision atomicAdd(Precision *acc, Precision value)
{
  return RAJA::atomicAdd<AtomicPolicy>(acc, value);
}

#endif
```
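For illustration, here is a minimal, self-contained sketch of how a serial reducer like the one above could be used from a kernel. Everything in the `sketch` namespace is hypothetical: `seq_reduce` is a placeholder policy tag, and `for_all` is only a stand-in for a serial loop such as `axom::for_all` with `SEQ_EXEC`, not Axom's actual implementation.

```cpp
#include <cassert>
#include <memory>

namespace sketch {

// Hypothetical policy tag; a real build would pass an Axom or RAJA policy.
struct seq_reduce {};

// Serial reducer. Copies share one accumulator through the shared_ptr, so
// a copy captured by value in a lambda updates the same sum as the original.
template <typename ReducePolicy, typename T>
class ReduceSum
{
public:
  explicit ReduceSum(T value) : m_sum(std::make_shared<T>(value)) {}
  void operator+=(T value) const { *m_sum += value; }
  T get() const { return *m_sum; }

private:
  std::shared_ptr<T> m_sum;
};

// Minimal serial loop over [0, n); an assumed stand-in for axom::for_all.
template <typename Kernel>
void for_all(int n, Kernel&& kernel)
{
  assert(n >= 0);
  for(int i = 0; i < n; ++i)
  {
    kernel(i);
  }
}

// Sums 0 + 1 + ... + (n - 1) through the reducer.
inline double sum_first_n(int n)
{
  ReduceSum<seq_reduce, double> total(0.0);
  // The lambda captures a copy of `total`; the copy shares the accumulator.
  for_all(n, [=](int i) { total += static_cast<double>(i); });
  return total.get();
}

}  // namespace sketch
```

The shared accumulator is what lets the by-value capture work, and it is also a plausible reason this path is slow: every `+=` goes through a pointer indirection.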

rhornung67 commented 1 year ago

@BradWhitlock I think having everything in Axom work for serial execution without RAJA enabled would be useful. However, I also think that we should require RAJA for any parallel execution. Wrapping RAJA constructs in the Axom namespace would make code maintenance easier, I think, because it would eliminate most of the need for macro guards throughout the code; i.e., the guards could be localized.
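A rough sketch of what localized guards could look like: the `AXOM_USE_RAJA` check appears once, where the wrapper is defined, and call sites stay guard-free. The `sketch` namespace, the `atomic_policy` alias, and the `histogram` function are hypothetical names used only for illustration. The RAJA branch assumes `RAJA::atomicAdd<Policy>` and `RAJA::seq_atomic`, and was not compiled for this example.

```cpp
#include <cassert>
#include <vector>

#if defined(AXOM_USE_RAJA)
  #include "RAJA/RAJA.hpp"
#endif

namespace sketch {

// The only macro guard: choose the policy and the implementation once.
#if defined(AXOM_USE_RAJA)
using atomic_policy = RAJA::seq_atomic;

template <typename AtomicPolicy, typename T>
T atomicAdd(T* acc, T value)
{
  return RAJA::atomicAdd<AtomicPolicy>(acc, value);
}
#else
struct seq_atomic {};  // placeholder tag for the non-RAJA build
using atomic_policy = seq_atomic;

// Serial fallback: plain read-modify-write, returning the previous value.
template <typename AtomicPolicy, typename T>
T atomicAdd(T* acc, T value)
{
  assert(acc != nullptr);
  const T old = *acc;
  *acc += value;
  return old;
}
#endif

// Call site with no guards: counts how many entries fall into each bin.
inline std::vector<int> histogram(const std::vector<int>& bins, int numBins)
{
  std::vector<int> counts(numBins, 0);
  for(int b : bins)
  {
    assert(b >= 0 && b < numBins);
    atomicAdd<atomic_policy>(&counts[b], 1);
  }
  return counts;
}

}  // namespace sketch
```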

Of course, others on the team should weigh in on these decisions. This would be a good discussion topic for the next Axom meeting. Agree?

kennyweiss commented 1 year ago

Thanks @BradWhitlock -- I agree with @rhornung67

Adding a non-RAJA path for serial execution might also make it easier to debug the code in some cases.

rhornung67 commented 1 year ago

Another point to consider: the internal implementations of things in RAJA, like reductions, are very complex, for GPU back-ends in particular. They have been highly tuned based on compiler, back-end programming model, memory usage, and other issues. It would not be wise to try to develop and maintain Axom versions of such things, IMO.

Also, RAJA has a new reduction interface (available in v2022.10.x releases) that is more flexible and offers performance benefits that the original reduction model, based on lambda capture of a reducer, cannot reproduce. We should consider switching over to it in Axom in the future. Here's some basic documentation on it, which will be expanded in the future with more examples: https://raja.readthedocs.io/en/develop/sphinx/user_guide/feature/reduction.html#experimental-reduction-interface
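To make the contrast with capture-based reducers concrete, here is a plain C++ imitation of the general idea: the reduction variable is handed to the kernel as an argument rather than captured by the lambda. This is not RAJA code, and `for_all_reduce` and `sum_values` are hypothetical names. See the linked documentation for the actual RAJA spelling.

```cpp
#include <cassert>
#include <vector>

namespace sketch {

// Hypothetical serial loop that threads a reduction variable through the
// kernel as a second argument instead of relying on lambda capture.
template <typename T, typename Kernel>
void for_all_reduce(int n, T* result, Kernel&& kernel)
{
  assert(result != nullptr);
  T local = *result;  // start from the caller's initial value
  for(int i = 0; i < n; ++i)
  {
    kernel(i, local);
  }
  *result = local;  // write the final value back once, after the loop
}

// Sums the entries of v; the kernel receives the running sum by reference.
inline double sum_values(const std::vector<double>& v)
{
  double total = 0.0;
  const double* data = v.data();
  for_all_reduce(static_cast<int>(v.size()),
                 &total,
                 [=](int i, double& sum) { sum += data[i]; });
  return total;
}

}  // namespace sketch
```

In this style the kernel only sees a plain reference, so a serial fallback needs no reducer class at all.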

BradWhitlock commented 1 year ago

I might have misstated it above, but what I was proposing was a set of non-RAJA serial constructs, with RAJA used for everything if it is enabled.

cyrush commented 1 year ago

This is a good idea.

I just did a similar exercise in Ascent, mocking a small subset of RAJA for serial use when RAJA isn't available.

In case it helps, here are the RAJA-based and mocked functions:

https://github.com/Alpine-DAV/ascent/blob/develop/src/libs/ascent/runtimes/expressions/ascent_execution_policies.hpp

It has some extra reductions that might come up.

kennyweiss commented 1 year ago

Another idea would be to disable the shaping application in configurations without RAJA.

LLNL / axom

Add serial non-RAJA support for shapers in klee. #973
