Open JFA-Mbule opened 2 years ago
Without knowing the error, it's difficult to know what went wrong. There are a few things, though, that I can guess:
layout = x,y
where x*y*6 = atmos_ncores
and for ice and ocean, layout = x,y
where x*y = ocean_ncores
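The arithmetic above can be sketched as two tiny helper functions; this is an illustration only, and the layout values passed in are example numbers, not the model's defaults:

```python
# Sketch of the core-count rules described above.

def atmos_ncores(x, y):
    # Atmosphere (and land) use the cubed-sphere: 6 faces,
    # each decomposed into an x-by-y grid of MPI ranks.
    return x * y * 6

def ocean_ncores(x, y):
    # Ocean and ice are on a single grid, so no factor of 6.
    return x * y

print(atmos_ncores(12, 24))  # 1728
print(ocean_ncores(12, 24))  # 288
```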
Hi Thomas, thank you for your quick answer!
Initially, the error was about the "Partition Nodes Limit", that is,
*** JOB 10594601 CANCELLED AT 2022-07-31T16:00 DUE TO (PartitionNodeLimit) => Requested 66 Nodes on the Partition cptec, which have limits of ***
when I try to use the default configuration. And when I change this to 30 nodes or fewer, the run simply breaks. So, when I run the "sacct -j 10597390 --format=Jobname,partition,time,start,end,nnodes,state,nodelist,ncpus" command, I get this output:
For 30 nodes:
runTest-E+ cptec 1-00:00:00 2022-08-04T08:01:12 2022-08-04T08:01:13 30 FAILED sdumont[6201,6+ 1440
batch 2022-08-04T08:01:12 2022-08-04T08:01:13 1 FAILED sdumont6201 48
For 10 nodes:
runTest-E+ cptec 1-00:00:00 2022-08-04T00:10:13 2022-08-04T00:10:16 10 FAILED sdumont[6266-6+ 480
batch 2022-08-04T00:10:13 2022-08-04T00:10:16 1 FAILED sdumont6266 48
For only 1 node:
runTest-E+ cptec 1-00:00:00 2022-08-04T00:25:10 2022-08-04T00:25:12 1 FAILED sdumont6240 48
batch 2022-08-04T00:25:10 2022-08-04T00:25:12 1 FAILED sdumont6240 48
However, based on your answer, maybe the problem is that I forgot to change ocean_ncores in another namelist besides the one in ./ESM4_rundir/input.nml. I'll try to find the other namelist and change it to the correct ncores.
Thomas, in the first point of your answer, are y and x the number of nodes (nnode, --nodes=y) and the number of cores per node (ncore_node, --ntasks-per-node=x), or the other way around?
In the input.nml file, there are variables called layout. The layout is an array of two integers. The integers are related to the number of cores, not nodes.
Again, the best/easiest strategy for running the model is to run it with the prescribed number of cores. It's difficult to change the number of cores, especially for the ocean.
I don't really understand the information you are showing me. It looks like you are asking me how many nodes you need, and I have never worked with your specific computer. I think you need help from someone local to figure out how to get the model running and how many nodes you need.
Ok, I'll go back to the default settings. So, what do the x, y, and 6 numbers mean?
In the input.nml file there is &fv_core_nml with layout = 12,24. In this case, is 12 the number of cores for the atmosphere and 24 for the ocean?
The fv_core_nml is referring to the atmosphere only.
layout = 12,24 means that you have 12*24*6 = 1728 cores
for the atmosphere. You multiply by 6 because the atmosphere is using the cubed-sphere, so there are 6 faces of the sphere. The layout will be the same for the land.
For Ice and Ocean, you will have a different layout. For the ocean and ice layout, you multiply the numbers together to get the number of cores. These are not on the cubed-sphere, so you don't multiply by 6. There should be an atmos_pes and ocean_pes that show you the number of cores for the atmosphere and ocean.
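A quick consistency check of those two rules in Python (the atmosphere layout 12,24 is from this thread; the ocean layout here is a hypothetical placeholder, since the thread never quotes one):

```python
# Verify that a namelist layout matches the *_pes core counts.
# atmos layout 12,24 comes from the thread; ocean layout is invented.
atmos_layout = (12, 24)
atmos_pes = 1728

# Cubed-sphere atmosphere: 6 faces times x*y ranks per face.
assert atmos_layout[0] * atmos_layout[1] * 6 == atmos_pes

# Ocean/ice: just x*y ranks, no factor of 6.
ocean_layout = (30, 20)  # hypothetical example, not the default
ocean_pes = ocean_layout[0] * ocean_layout[1]
print(ocean_pes)  # 600 ranks for this example layout
```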
Oh, thank you for your answer, Thomas!
Yes, there are atmos_pes and ocean_pes, which are 1728 and 1437, respectively.
I'm trying some slurm configurations, using the standard ESM4 cores configurations. If it runs, I will tell you.
Hi, it's me again, Jaime. I managed to compile the model on our machine, as I mentioned in the previous issue. Now I'm struggling to run it: I'm having some problems running the model with the default (existing) settings on our machine (doing the first test runs).
The number of nodes/cores needed to run the model is large, and these quantities are not always available on our machine. That is, the run needs more than 3100 cores (actually 3165: atmos_npes = 1728, i.e. 1728 cores for the atmospheric model, and ocean_npes = 1437, i.e. 1437 for the ocean), as configured in the model's namelist (in the folder ./ESM4_rundir/input.nml) and run script (folder ./run/).
In the partition I have access to, each node has 48 cores, and in total those core counts do exist (there are about 90 nodes in the partition I can use), but I'm trying to test with just a few nodes (about 10, just to see how the model behaves). It seems to me that 66 of the 90 nodes would be a sufficient number to run the model with the default settings; however, the logistics of that are quite difficult. For that reason, I'm trying fewer nodes, just as a test, to see if the model will run and how it behaves. However, when I test with a configuration of 10 nodes or even fewer (I've also tested more than 10, in this case 30), the run breaks before it even starts. Could someone help me understand why?
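For what it's worth, the 66-node figure is consistent with the default core counts; a rough check, assuming one MPI rank per core and 48 cores per node:

```python
import math

# Default ESM4 core counts quoted in this thread.
atmos_npes = 1728
ocean_npes = 1437
total_cores = atmos_npes + ocean_npes   # 3165

cores_per_node = 48                     # per the partition described above
nodes_needed = math.ceil(total_cores / cores_per_node)
print(nodes_needed)  # 66
```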
Below is the bash script used to run the model:
Please, looking at these settings, can anyone see any errors that could be causing the test runs to break?
Another thing, when I change the default values (atm_cores=1728 and ocn_cores=1437), should I also change the values in the namelist (in ./ESM4_rundir/input.nml)? I made this change. I don't know if this is what is causing the crash.
Thanks.