chengzhuzhang closed 3 years ago
Hmm, @chengzhuzhang, this seems way outside my wheelhouse. Is there someone from the E3SM team we could bring in for some help? I'm happy to change the build as needed but I don't know where to begin.
Just to be sure, this was run on a compute node, not a login node?
I should clarify that this ilamb run is from e3sm-unified on compute nodes.
There is no ilamb-run executable in e3sm-unified on a login node. I'm wondering whether @minxu74 could help take a look.
To reproduce, get a compute node:
srun -N 1 -t 01:00:00 --pty bash
source /lcrc/soft/climate/e3sm-unified/load_latest_e3sm_unified_chrysalis.sh
export ILAMB_ROOT=/lcrc/group/e3sm/ac.zhang40/ilamb_data
ilamb-run --config /home/ac.zhang40/ILAMB/src/ILAMB/data/cmip.cfg --model_root /lcrc/group/e3sm/ac.zhang40/ilamb_test_data/ --regions global
@chengzhuzhang could you try the following command to see if it works?
srun -n 1 ilamb-run --config /home/ac.zhang40/ILAMB/src/ILAMB/data/cmip.cfg --model_root /lcrc/group/e3sm/ac.zhang40/ilamb_test_data/ --regions global bona
If it does not work, we may have to use system mpi4py, instead of the one installed by conda.
I do not have an account on LCRC machines; otherwise, I could look into it and try the above command myself.
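Before switching to a system mpi4py, it may help to confirm which mpi4py the loaded environment actually picks up. The following is only a minimal sketch: the helper name `describe_mpi4py` is hypothetical and is not part of ILAMB or e3sm-unified. It locates the package without importing it, so MPI is never initialized and it should be safe to run on either a login or a compute node.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ILAMB or e3sm-unified): report which
# Python interpreter is first on PATH and where its mpi4py package lives.
# The package is only located, never imported, so MPI_Init is not triggered.
describe_mpi4py() {
    local py
    py="$(command -v python3 || command -v python)" || {
        echo "python: not found"
        return 0
    }
    echo "python: ${py}"
    "${py}" -c '
import importlib.util
spec = importlib.util.find_spec("mpi4py")
print("mpi4py:", spec.origin if spec else "not importable")
'
}

describe_mpi4py
```

If the reported path sits inside the conda environment, the conda-installed mpi4py is the one in use; a path under a system Python would suggest the system build instead.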
Thanks @minxu74. Using system mpi4py might be a possibility.
@xylar and @minxu74 Thank you for taking a look! When I tried it again today, it worked. It must have been a one-time glitch yesterday.
Okay, well, that's a little disconcerting, but hopefully it won't happen again. If it does, let's try to find out more. Using system mpi4py, if there is such a thing, might be an option.
I'm trying to set up an ilamb run on LCRC, but a test run resulted in an MPI_init error. The command was as follows:
ilamb-run --config /home/ac.zhang40/ILAMB/src/ILAMB/data/cmip.cfg --model_root /lcrc/group/e3sm/ac.zhang40/ilamb_test_data/ --regions global bona