Closed raffenet closed 3 years ago
CC: @janjust
I think setting up and putting values into the environment is ultimately the responsibility of the process/resource manager. If different MPI implementations start adding their own environment variables, then for one, the names may collide and cause conflicts; and for two, to use those variables, libraries and applications would have to write MPI-implementation-dependent code, which runs counter to the efforts of the MPI Forum.
However, I think the cleanest and shortest solution is to have UCX or libfabric supply an API for setting the local size and rank, and potentially other key-value attributes. Other libraries, including MPICH, would then call this API to set the information directly. That is a much cleaner solution for UCX and libfabric, and since MPICH uses both, it is our responsibility to make sure we call these APIs appropriately -- likely at initialization time.
In principle, I agree. I think our chances of success in changing those libraries' behavior would be greatly improved if we submitted issues/PRs supporting the proposed new APIs.
@wscullin also expressed interest in this feature, though I forget the context. Can you remind us? Not sure if it fits with this downstream library API viewpoint...
We also use this feature when debugging process placement in hwloc tools. Instead of displaying the binding of each PID, displaying the MPI rank in COMM_WORLD is more convenient. Not very important, but good to have.
It appears that the environment variable is set by `mpirun` in the Open MPI case. `mpiexec.hydra` sets `PMI_SIZE` and `PMI_RANK`, which could be used in MPICH's case.
Summary:

- Open MPI sets `OMPI_COMM_WORLD_LOCAL_SIZE` and `OMPI_COMM_WORLD_LOCAL_RANK`
- Hydra sets `PMI_SIZE` and `PMI_RANK`

I think that's sufficient.
Is `PMI_SIZE` equivalent to `OMPI_COMM_WORLD_SIZE` or to `OMPI_COMM_WORLD_LOCAL_SIZE`? (same for `_RANK`)
`PMI_SIZE` should correspond to `OMPI_COMM_WORLD_SIZE`.
There have been some requests to be able to access info about `MPI_COMM_WORLD` (number of ranks on the node, process rank) through environment variables, in order to allow other services/libraries to access it if needed. One example is the PSM provider in OFI getting some info from Open MPI-specific values: https://github.com/ofiwg/libfabric/blob/2a0f948b5f485b056e565ae85b98a3948a10ee3e/prov/psm/src/psmx_util.c#L314

We should add something similar in MPICH, probably during the `MPI_COMM_WORLD` creation stage.