Dependency on Kokkos - Githubissues

jdolence commented 2 years ago

This is an old debate, but a couple different things have made me wonder again recently if we should be a bit more rigorous in our abstraction of Kokkos. Over time, we've allowed raw Kokkos calls to show up in quite a few places, which means Kokkos is now a hard dependency. Here's my current list of raw Kokkos things that show up in the code base:

raw Kokkos::parallel_for and associated types, especially in bvals_cc_in_one.cpp
basically all of parthenon_arrays.hpp
Kokkos::View
Kokkos::subview
Kokkos::Profiling
Kokkos::create_mirror_view and Kokkos::create_mirror
Kokkos::deep_copy
Kokkos::fence
Kokkos::Timer
Kokkos::Sum, Kokkos::Max, and perhaps other reduction types
Kokkos::abort
stuff in tests

The list might get longer if we included what shows up in downstream codes because there isn't a parthenon::* alternative. As a project, I'd like to know what people think of this. Is this OK? Might this become problematic down the road? At this point, I don't think it's too late to provide parthenon:: versions of these things which would allow us, for example, to use an OpenMP backend without Kokkos, or to try out the parallel execution stuff in the standard. I don't want to get into the space of recreating Kokkos within parthenon, but I also worry about us getting backed into a corner.

I'm imagining something where we have a kokkos_abstractions.hpp that would have things like (for example)

template <typename Scalar, typename Space>
using Sum = Kokkos::Sum<Scalar, Space>;

but perhaps also a openmp_abstractions.hpp where one could define a Kokkos-free version of these things. Then a runtime_abstractions.hpp where one or the other is included at compile time.

So, anybody have any thoughts? I'll tag a few folks that I think might have an opinion. @pgrete @Yurlungur @brryan @maxpkatz @nmsriram @forrestglines . Others are obviously welcome to comment as well.

brryan commented 2 years ago

I'm in favor of the compatibility layer described above that makes Kokkos optional for three reasons. Kokkos does a great job with performance portability, of course, but (1) we aren't actually in charge of that project so if our interests diverge we could be locked out of new hardware. (2) One can imagine improving performance by tailoring a backend to our specific needs, like targeting vectorization through an openmp backend. (3) It seems like Kokkos is committed to moving to C++17 which may lead to an unusable code for the LANL people on some existing systems, when we need to move parthenon to C++17 to even support new versions of Kokkos that we need for other reasons.

This isn't a trivial amount of work but I think it is at least worth moving towards this goal unless others would have their workflows seriously damaged or broken by this change. Hopefully it would be a mostly unnoticeable change for those who just want to stay with Kokkos exclusively.

maxpkatz commented 2 years ago

Perhaps it's worth separating out things Kokkos provides that are fundamental to how Parthenon works and things which are incidental. It's good to have the parallel for and view abstractions be strict about this because that will make it a lot easier to understand in the short term whether use of the parallel algorithms in the C++ standard is an interesting long term choice. But for convenience functionality like Kokkos:max() and the profiling libraries, those are straightforward to replicate if you need to, but just add unnecessary overhead if it's likely that you'll never actually drop the Kokkos requirement.

pgrete commented 2 years ago

I'm in favor of keeping things as they are for now for the following reasons.

To my mind Kokkos is at the moment a rather weak dependency from the point of view that we don't use any advanced features (say graphs or atomics or LayoutTiled). Thus, the "simple" things should, in principle, be "simple" to replace, if we there's ever a need.
I'm generally not in favor of reinventing the wheel if there's no foreseeable need. Maybe I'm missing sth (or lack of some DOE internal information), but from my point of view the Kokkos mission/goal seems to be well aligned with what we generally need. If that changes at some point in the future (either from the Kokkos side or our side), we could still stick with the version at that date while working on a separate abstraction layer to make the transition.
If there's potential lack of support for some hardware of interest by Kokkos (as mentioned above), why add support for this to a Parthenon abstraction layer and not create a PR (or a patch) to Kokkos to support it.
Probably based on everyone's experience, if the abstractions are "simple" (see first item), the implementation will likely be way more involved and less simple than hoped for (i.e., requiring more resources in hours). On top of that, if we ever start using more complex features, which would also require replication in clean abstraction layer, this will likely be significantly more complex to replicate (as it would involve both the abstraction as well as the "backend" support).
If there's "spare time" to be spent of Parthenon development, we have a long list of things to further improve (with respect to performance, scaling, and functionality) that I consider higher priority than removing the dependency on Kokkos.
(Clean) downstream development may be slowed down as features, that are currently available in Kokkos (but not needed/used in Parthenon) would also require to add the appropriate abstractions to Parthenon.
If C++17 is really such as issue for LANL (which as far as I can tell is still "to be determined"), we could either stick with the final C++14 version of Kokkos until there's really some feature we need and only reintroduced in later version. Then (which, I guess, will first come up in years rather than months from now), we can reevaluate this discussion. I could imagine, that at that point C++17 may not be an issue any more.

Alright, this has been lots of text. My bottom line/key question that I'd like to be discussed/answered is: Is is there an actual need/pressing concern right now that would motivate this kind of change?

Note, that I'll be absent till Thanksgiving so I'll only get back to this discussion afterwards.

Yurlungur commented 2 years ago

My thinking on this is we're unlikely to drop Kokkos, but I think there's a desire to experiment with non-Kokkos implementations of, e.g., OpenMP or parallel algorithms in the C++ stdlib for parallel dispatch.

So I guess my question would be similar to @maxpkatz 's question. What bits can we play with safely in that context? Perhaps the only thing we need to change is parallel dispatch, which we've already abstracted?

parthenon-hpc-lab / parthenon

Dependency on Kokkos #615