@jhkennedy, I've been in discussion with Min about a workaround, but it's worth discussing between the two of us, too.
I'm following these instructions at NERSC: https://docs.nersc.gov/programming/high-level-environments/python/mpi4py/#mpi4py-in-your-custom-conda-environment
I also tested on compy with:
module load gcc/4.8.5
module load mvapich2/2.3.1
The resulting environment with `mpi4py` appears to work for calls to `mpirun` but not to `srun`. @rljacob, do you know if/under what conditions `srun` works on compy?
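For reference, the build step in those NERSC instructions amounts to compiling `mpi4py` from source against the system MPI rather than taking the conda-forge binary. Roughly (the environment name is a placeholder, and the `MPICC` wrapper will differ by machine; `mpicc` is what the mvapich2 module above should provide):

```bash
# Load the system compiler and MPI (compy modules from above)
module load gcc/4.8.5
module load mvapich2/2.3.1

# Activate the conda environment that should get the native-MPI mpi4py
source activate test_env  # placeholder environment name

# Build mpi4py from source so it links against the system MPI instead
# of pulling in the conda-forge mpich binary
MPICC="mpicc" python -m pip install --no-cache-dir --no-binary=mpi4py mpi4py
```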
I haven't tested yet, but I think the system MPI and the conda installation of MPICH aren't going to play nice with one another. The `esmf` package depends on `mpich` and is a dependency of `nco`, so we're going to be in a bit of trouble. Things might work at NERSC via `srun` (which uses system MPI) for `mpi4py` calls and `mpirun` (which will use the conda `mpich`) for `esmf` calls. But, given that `srun` didn't work for me on compy, we might be in trouble there.
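One way to see the clash directly is to check which MPI the environment actually ends up with: if the solver pulled in conda-forge's `mpich`, then `esmf` and `nco` are linked against that, while an `mpi4py` built as above is linked against the system MPI. A quick sanity check, as a sketch:

```bash
# Which MPI-related packages (and build variants) did the solver install?
conda list | grep -E 'mpich|openmpi|mpi4py|esmf|nco'

# Which MPI library is mpi4py actually linked against?
python -c "from mpi4py import MPI; print(MPI.Get_library_version())"
```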
I believe I have a solution, but testing will be needed. I have built a serial version of `esmf` that I will upload to the `e3sm` anaconda channel once I've tested it.
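Assuming the serial build ends up following the usual `nompi` build-string convention, installing it would look something like the sketch below (the channel order and exact build string are assumptions until the package is actually up):

```bash
# Prefer the serial (nompi) esmf build so nothing in the environment
# drags in conda-forge's mpich
conda install -c e3sm -c conda-forge "esmf=*=nompi*"
```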
I have updated the build script to include building `mpi4py` with native MPI on cori, compy, anvil, cooley, and grizzly. I won't do this on rhea or acme1 unless it is requested.
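The idea in the build script is just to branch on the machine and use its native compiler/MPI modules when building `mpi4py`. A minimal sketch of that pattern (machine detection, module names, and versions here are illustrative, not the real script):

```bash
#!/bin/bash
# Sketch: select the native MPI toolchain per machine, then build
# mpi4py against it inside the active conda environment.
machine=$1

case $machine in
  cori)
    module load python
    export MPICC="cc -shared"      # Cray compiler wrapper
    ;;
  compy)
    module load gcc/4.8.5
    module load mvapich2/2.3.1
    export MPICC="mpicc"
    ;;
  anvil|cooley|grizzly)
    module load openmpi            # placeholder module name
    export MPICC="mpicc"
    ;;
  *)
    echo "No native MPI build configured for $machine" >&2
    exit 1
    ;;
esac

# Rebuild mpi4py from source against the loaded system MPI
python -m pip install --no-cache-dir --no-binary=mpi4py --force-reinstall mpi4py
```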
So the solution I came up with doesn't seem to work for `esmpy` (and therefore maybe not for many of our packages, though `e3sm_diags` is the only one where I'm sure). I'll explore more for the next release...
`mpi4py` (used by `ilamb`) doesn't work properly on cori and is not likely to be performant on other HPC machines.
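For what it's worth, a quick way to check whether an environment's `mpi4py` is actually functional under the system launcher is a two-rank hello; each rank should print its own rank number (launcher flags are machine-dependent):

```bash
srun -n 2 python -c "from mpi4py import MPI; print(MPI.COMM_WORLD.Get_rank())"
```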