Open cwpearson opened 3 months ago
Very good starting point.
I have a comment concerning the API. Here we have 3 "blocking" send
and only one "immediate".
It might make sense to have only one of each and a template parameter to choose between the implementation.
send(snd_buf, comm); // Alias for send<Default>(snd_buff, comm);
send<Ready>(snd_buf, comm);
send<Synchronous>(snd_buf, comm);
send<Buffered>(snd_buf, comm);
This way we can also implement the isend
variants. I think we should try to push users to I
variants as they are a lot easier to use to make a semantically sound code, and also be ready for more asynchronous computations.
That's definitely an option, but since the Immediate
version returns KokkosComm::Req
and the rest do not, you actually have to do something like:
enum class Mode {
Blocking,
Buffered,
Immediate,
Ready,
Synchronous
};
// most modes just return an error code
template <Mode mode>
struct SendReturn {
using type = int;
}
// Immediate return value includes an MPI_Request
struct IsendResult {
int err;
MPI_Request req;
};
template<> struct SendReturn<Mode::Immediate> {
using type = IsendResult;
}
template <Mode mode>
typename SendReturn<mode>::type send(...);
template<> typename SendReturn<Blocking>::type send<Blocking>(...) {...}
...
I think (though I might be missing something), this basically just replaces symbol lookup with template resolution in the compiler, and I'm not sure it really buys us anything
I see that I misunderstood a bit - you were just suggesting doing this for the synchronous versions. I think my point still stands (though we can avoid the funky return type template).
I think send
and isend
should be separate functions as they are semantically very different.
But between variants they are quite interchangeable.
The idea of the template parameter with a default value is to be able to change this value for ease of checking communication pattern correctness, like force synchronous send instead of send to prevent deadlocks that can be hidden by a high eager limit on the test platform.
So basically, in pseudo-code
enum class Mode {
Bare,
Blocking,
Ready,
Synchronous
};
#ifdef DEBUG_COMM
template<Mode mode=Synchronous>
#else
template<Mode mode=Bare>
#endif
int send(View, Comm);
#ifdef DEBUG_COMM
template<Mode mode=Synchronous>
#else
template<Mode mode=Bare>
#endif
Req isend(View, Comm);
Or we can go crazier and have a CommPolicy
object to set the default.
Maybe the thing to do right now while we consider this is put some relatively lean wrappers around MPI functions in the Impl
namespace, since we'll almost certainly need those anyway.
The motivation here is to prove we have a reasonable implementation by being able to run Trilinos on top of KokkosComm