ESCOMP / CTSM

Community Terrestrial Systems Model (includes the Community Land Model of CESM)
http://www.cesm.ucar.edu/models/cesm2.0/land/

Out of the box pe layout for Derecho --compset 1850_DATM%GSWP3v1_CLM51%BGC-CROP --res ne30pg3_ne30pg3_mg17 is 50% slower than on cheyenne #2306

Open olyson opened 9 months ago

olyson commented 9 months ago

Maybe we can work on speeding up this configuration, which is being used for standalone dead veg simulations, etc.

pe layout on cheyenne (I was getting around 172 yrs/day):

    Comp  NTASKS  NTHRDS  ROOTPE  PSTRIDE
    CPL :   1800/      1;     36        1
    ATM :     36/      1;      0        1
    LND :   1800/      1;     36        1
    ICE :   1800/      1;     36        1
    OCN :   1800/      1;     36        1
    ROF :   1800/      1;     36        1
    GLC :   1800/      1;     36        1
    WAV :   1800/      1;     36        1
    ESP :      1/      1;      0        1
    ESMF_AWARE_THREADING is False
    ROOTPE is with respect to 36.0 tasks per node

pe layout on derecho (I'm getting around 86 yrs/day):

    Comp  NTASKS  NTHRDS  ROOTPE  PSTRIDE
    CPL :    640/      1;    128        1
    ATM :    128/      1;      0        1
    LND :    640/      1;    128        1
    ICE :    640/      1;    128        1
    OCN :    640/      1;    128        1
    ROF :    640/      1;    128        1
    GLC :    640/      1;    128        1
    WAV :    640/      1;    128        1
    ESP :      1/      1;      0        1
    ESMF_AWARE_THREADING is False
    ROOTPE is with respect to 128.0 tasks per node
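For a rough per-task comparison of the two layouts, the reported throughput can be divided by the total task count. This is only a sketch: it assumes the job size is the ATM tasks plus the tasks of the components starting at ROOTPE 36 (cheyenne) or 128 (derecho), and the function name is hypothetical, not part of CIME.

```python
def yrs_per_day_per_1000_tasks(throughput, atm_tasks, other_tasks):
    """Throughput normalized by total MPI tasks (assumed: ATM + the rest)."""
    total_tasks = atm_tasks + other_tasks
    return 1000.0 * throughput / total_tasks


# Approximate throughputs (yrs/day) and task counts quoted in the layouts above.
cheyenne = yrs_per_day_per_1000_tasks(172.0, atm_tasks=36, other_tasks=1800)
derecho = yrs_per_day_per_1000_tasks(86.0, atm_tasks=128, other_tasks=640)

print(f"cheyenne: {cheyenne:.1f} yrs/day per 1000 tasks")
print(f"derecho:  {derecho:.1f} yrs/day per 1000 tasks")
```

Under these assumptions the derecho default uses well under half as many tasks as the cheyenne layout, so part of the throughput gap reflects the smaller job size rather than per-task speed.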

ekluzek commented 9 months ago

This is part of #2244

@olyson could you try increasing to 14 nodes (so 640 tasks go to 1792 = 14 x 128)? I'd like to make that the starting point to work from. I'll play around with it after you do that and report back.

Do you want this optimized for speed (throughput) or cost? Or a reasonable combination of the two?

Since this is the new CAM SE workhorse grid, I take it this is an important PE layout to optimize...

olyson commented 9 months ago

I would think we want a reasonable combination of the two since it is the workhorse resolution. Note, the priority for this is not quite as high now, as we decided today that we are going to switch to 1deg for our dead veg simulations for ease/speed of postprocessing. But I think in general we'll be doing ne30 simulations more regularly.

5 nodes (default):

    Model Cost:        214.59  pe-hrs/simulated_year
    Model Throughput:   85.89  simulated_years/day

14 nodes:

    Model Cost:        329.17  pe-hrs/simulated_year
    Model Throughput:  139.99  simulated_years/day
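The speed versus cost trade-off in these two runs can be worked out directly from the reported numbers. The sketch below assumes Model Cost is the charged PE count times wallclock hours per simulated year, so that cost x throughput / 24 recovers the PE count; the variable and function names are illustrative only.

```python
HOURS_PER_DAY = 24.0
TASKS_PER_NODE = 128  # derecho, per the "tasks per node" line in the layout

# Timing numbers as reported in this comment.
runs = {
    "5 nodes (default)": {"cost": 214.59, "throughput": 85.89},
    "14 nodes": {"cost": 329.17, "throughput": 139.99},
}


def implied_pes(cost, throughput):
    """PE count implied by cost (pe-hrs/yr) and throughput (yrs/day)."""
    return cost * throughput / HOURS_PER_DAY


base = runs["5 nodes (default)"]
big = runs["14 nodes"]

speedup = big["throughput"] / base["throughput"]
cost_ratio = big["cost"] / base["cost"]

for name, r in runs.items():
    pes = implied_pes(r["cost"], r["throughput"])
    print(f"{name}: ~{pes:.0f} PEs (~{pes / TASKS_PER_NODE:.1f} nodes charged)")

print(f"speedup: {speedup:.2f}x, cost increase: {cost_ratio:.2f}x")
```

If the assumption holds, the implied PE counts include the separate node used by ATM, and the larger layout gives about 1.6 times the throughput for about 1.5 times the cost per simulated year.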

ekluzek commented 9 months ago

@olyson says this is good for now, so we'll postpone further work for later.