Open daltonbohning opened 4 years ago
The main reason mfu_rank
is defined as a global is that it is used in the MFU_LOG
macro, and making it a global simplifies the call to that function so that the user doesn't have to explicitly add the rank as a parameter to every MFU_LOG
call.
Having said that, we have tried to keep to a discipline to call MPI_Comm_rank
everywhere to minimize use of global variables as much as possible while also not requiring the user to pass the rank as an input parameter in a bunch of mfu functions.
One of the long-term project goals is to enable applications to link and call mfu functions directly if they want to. To get there, we'd likely need to get rid of all globals and place that kind of state in a per-application context that would be allocated in mfu_init
or something. We'd also likely need to MPI_Comm_dup
whatever communicator the application provides, so that we wouldn't be using MPI_COMM_WORLD
. We'd probably need to then add a context structure as an input parameter to each user-facing call, at which point, we could bury a rank value as a field in that structure. All of that is clearly going to take some work, and we haven't had the driver for it yet.
Anyway, I agree that this is not consistent, and we could look to clean things up.
This answers my question perfectly!
we have tried to keep to a discipline to call MPI_Comm_rank everywhere to minimize use of global variables
I'll keep this mind! I didn't necessarily plan on making these sort of changes; I mostly wanted to make sure I was using rank
the way it is expected to be used.
Thank you!
We might also be able to call MPI_Comm_rank
within the MFU_LOG
macro directly and then perhaps drop the mfu_rank
global.
Do you know what the performance impact of the two different approaches are?
Of course, we wouldn't want to call MPI_Comm_rank
within a loop (and we don't), but I'm curious if there is a noticeable difference between a global vs instead using MPI_Comm_rank
in each function that needs rank
. But then, I think the idea of a context
would eventually be a better solution either way.
This discussion might be relevant: https://stackoverflow.com/questions/49872662/does-storing-mpi-rank-enhance-the-performance/49873583
For the implementation given, it doesn't seem that MPI_Comm_rank
has significant overhead.
I happened to glance over the implementation for
mfu_init
: https://github.com/hpc/mpifileutils/blob/66c5273035396c34998c32e9d7918bce9aeca58f/src/common/mfu_util.c#L202-L213I notice it sets
mfu_rank
, which is a global declared inmfu_util.h
. But the use ofmfu_rank
doesn't seem to be very widespread. In fact it seems that the code base utilizesMPI_Comm_rank
directly more often than it utilizesmfu_rank
, which seemingly would be available as long asmfu_init
was called.I haven't specifically checked on every occurrence of
MPI_Comm_rank
to see if the corresponding code could be replaced withmfu_rank
, but what is the intention here? I.e. is it intended that the code base should be usingmfu_rank
as much as possible? There are also a handful (at least) of functions that passrank
as a parameter, somfu_rank
could also simplify that.If anything, I would personally like to use
mfu_rank
in my own contributions to avoid propagating the inconsistencies, if that's what it is.@adammoody @gonsie Any comments, preferences here?