geoschem / GCHP

The "superproject" wrapper repository for GCHP, the high-performance instance of the GEOS-Chem chemical-transport model.
https://gchp.readthedocs.io
Other
23 stars 26 forks source link

[QUESTION] about GCHPchem timing interpretation #194

Closed jimmielin closed 2 years ago

jimmielin commented 2 years ago

Ask a question about GCHP:

Hi all,

I'm running some timing tests on GCHP and I'm trying to understand what the timer outputs from GCHP output mean. I have some output from the end of the run (see end of this issue description) which gives times for GCHPchem and a breakdown of individual components.

In the breakdown in the end, there's a line saying

                                                               Inclusive        Exclusive    
                                                            ================ ================
Name                                               #-cycles  T (sec)    %     T (sec)    %   
                                                   -------- --------- ------ --------- ------
--Run
------GCHPchem                                         4032  2242.760  41.41  2242.760  41.41

which I understand is the wall clock time for the Run routine in the GCHPchem GridComp, which is 2242.760. However, in the breakdown for Times for GCHPchem, there's another breakdown:

TOTAL                   :    1772.221    2300.730    2830.601
INITIALIZE              :      60.922      66.542      71.178
RUN                     :    1683.871    2214.322    2743.114
FINALIZE                :       0.000       0.000       0.000

None of these numbers correspond to 2242.760, nor any combination of the other numbers (0.171 for SetService -- GCHPchem, 66.574 for Initialize -- GCHPchem, and 3.238 for Finalize -- GCHPchem).

My question is:

Thank you!

Attached: Excerpt from the end-of-run timer output from GCHP 13.3

 Time for GCHPchem
                                                               Inclusive        Exclusive    
                                                            ================ ================
Name                                               #-cycles  T (sec)    %     T (sec)    %   
                                                   -------- --------- ------ --------- ------
GCHPchem                                               4035  2311.197 100.00     0.010   0.00
--SetService                                              1     0.171   0.01     0.171   0.01
----GenSetService                                         1     0.000   0.00     0.000   0.00
--Initialize                                              1    66.574   2.88    61.576   2.66
----GenInitialize                                         1     4.998   0.22     0.000   0.00
------GenInitialize_self                                  2     4.998   0.22     4.998   0.22
--Record                                               2016     0.012   0.00     0.012   0.00
----Record_self                                        2016     0.001   0.00     0.001   0.00
--Run                                                  2016  2241.194  96.97  2241.194  96.97
--Finalize                                                1     3.236   0.14     0.010   0.00
----Final_self                                            1     3.226   0.14     3.226   0.14

  Times for GCHPchem
TOTAL                   :    1772.221    2300.730    2830.601
INITIALIZE              :      60.922      66.542      71.178
RUN                     :    1683.871    2214.322    2743.114
FINALIZE                :       0.000       0.000       0.000
DO_CHEM                 :    1583.555    2102.432    2621.411
CP_BFRE                 :       0.965       1.151       1.372
CP_AFTR                 :      29.527      35.491      39.928
GC_CONV                 :     137.218     181.835     229.048
GC_EMIS                 :      18.575      19.281      20.846
GC_DRYDEP               :       3.093       8.018      21.844
GC_FLUXES               :       0.000       0.000       0.000
GC_TURB                 :     159.575     197.583     208.508
GC_CHEM                 :    1037.151    1473.979    2002.132
GC_WETDEP               :     107.126     139.087     161.337
GC_DIAGN                :      23.866      30.681      35.868
GC_RAD                  :       0.000       0.000       0.000
GenInitTot              :       4.929       4.943       4.998
--GenInitMine           :       4.929       4.943       4.998
GenRunTot               :    1697.182    2226.469    2755.952
--GenRunMine            :    1697.180    2226.467    2755.948
GenFinalTot             :       2.562       2.882       3.226
--GenFinalMine          :       2.562       2.882       3.226
GenRecordTot            :       0.004       0.006       0.007
--GenRecordMine         :       0.002       0.003       0.004
GenRefreshTot           :       0.000       0.000       0.000
--GenRefreshMine        :       0.000       0.000       0.000

... truncated here ...

                                                               Inclusive        Exclusive    
                                                            ================ ================
Name                                               #-cycles  T (sec)    %     T (sec)    %   
                                                   -------- --------- ------ --------- ------
All                                                       1  5415.476 100.00     0.105   0.00
--SetService                                              1     0.449   0.01     0.042   0.00
----GCHP                                                  1     0.406   0.01     0.155   0.00
------GCHPctmEnv                                          1     0.001   0.00     0.001   0.00
------GCHPchem                                            1     0.171   0.00     0.171   0.00
------DYNAMICS                                            1     0.079   0.00     0.079   0.00
----HIST                                                  1     0.001   0.00     0.001   0.00
----EXTDATA                                               1     0.001   0.00     0.001   0.00
--Initialize                                              1    89.230   1.65     0.872   0.02
----GCHP                                                  1    72.903   1.35     1.680   0.03
------GCHPctmEnv                                          1     0.019   0.00     0.019   0.00
------GCHPchem                                            1    66.574   1.23    66.574   1.23
------DYNAMICS                                            1     4.630   0.09     4.630   0.09
----HIST                                                  1     2.690   0.05     2.690   0.05
----EXTDATA                                               1    12.764   0.24    12.764   0.24
--Run                                                     1  5322.379  98.28  1572.164  29.03
----EXTDATA                                            2016   391.262   7.22   391.262   7.22
----GCHP                                               4032  3193.313  58.97     3.800   0.07
------GCHPctmEnv                                       4032    20.544   0.38    20.544   0.38
------GCHPchem                                         4032  2242.760  41.41  2242.760  41.41
------DYNAMICS                                         4032   926.209  17.10   926.209  17.10
----HIST                                               4032   165.639   3.06   165.639   3.06
--Finalize                                                1     3.313   0.06     0.018   0.00
----GCHP                                                  1     3.288   0.06     0.010   0.00
------GCHPctmEnv                                          1     0.036   0.00     0.036   0.00
------GCHPchem                                            1     3.238   0.06     3.238   0.06
------DYNAMICS                                            1     0.004   0.00     0.004   0.00
----HIST                                                  1     0.004   0.00     0.004   0.00
----EXTDATA                                               1     0.003   0.00     0.003   0.00
lizziel commented 2 years ago

The timing output written by MAPL is indeed confusing. The three columns in the "Times for GCHPchem" from left to right correspond to minimum, mean, and maximum processor time. "GCHPchem" corresponds to the entire gridded component for GEOS-Chem, so includes other components besides chemistry, e.g. convection. To assess timing for a specific part of GEOS-Chem you should use the timers that start with prefix GC_. Toggling those timers on and off are mostly in geos-chem/Interfaces/GCHP/gchp_chunk_mod.F90, but also in geos-chem/Interfaces/GCHP/Chem_GridCompMod.F90, using MAPL subroutine MAPL_TimerOn and MAPL_TimerOff. When in doubt about what a timer is measuring it is best to check the source code to see what calls it is wrapping. For example:

    IF ( DoConv ) THEN
       if(Input_Opt%AmIRoot.and.NCALLS<10) write(*,*) ' --- Do convection now'
       CALL MAPL_TimerOn( STATE, 'GC_CONV' )

       CALL DO_CONVECTION ( Input_Opt, State_Chm, State_Diag, &
                            State_Grid, State_Met, RC )
       _ASSERT(RC==GC_SUCCESS, 'Error calling DO_CONVECTION')

       CALL MAPL_TimerOff( STATE, 'GC_CONV' )
       if(Input_Opt%AmIRoot.and.NCALLS<10) write(*,*) ' --- Convection done!'
    ENDIF

For chemistry you should look at timer GC_CHEM which includes the calls to compute overhead ozone, set H2O, and Do_Chemistry. Whether to use min, mean, or max is a judgement call.

Regarding the rest of the timing info, I am not exactly sure why the numbers do not match up, although I would expect it is because the various sub-timers do not break up the run in exact ways. @LiamBindle, do you remember if Tom explained this? Do you also remember the precise definitions for inclusive versus exclusive?

LiamBindle commented 2 years ago

Inclusive refers to the time spent in that function, including the time spent in called child function. Exclusive refers to the time purely spent in that function, and excludes time spent in called child functions.

The timers are tricky to interpret because the component times don't add up to the total run time (in an obvious way). I looked into this a while back, and I believe this is because of the load imbalance; MAPL handles the process synchronization, so the time a process spends waiting for other processes to finish chemistry is time spent in MAPL (not chemsitry). I believe this sync waiting time is the bulk of the "Run" exclusive time is.

The way I interpret the timers currently is:

If you add these up you get very close to the total run time: 2242.760+926.209+391.262+165.639+1572.164=5298.034. If you include minor parts I left out (Finalize inclusive, Initialize inclusive, SetService inclusive, GCHPctmEnv inclusive) you get even closer (<4 seconds unaccounted for).

jimmielin commented 2 years ago

Thanks @lizziel and @LiamBindle for explaining, this makes it much clearer. The reason I bring this up is because the run results in my particular case ("chemistry" taking 2242s in a total runtime of 5298s) seem to be inconsistent with the results in Eastham et al., 2018 and Bindle et al., 2021, where chemistry is the dominant component.

Looking at your Table 3 in Bindle et al. 2021 (https://gmd.copernicus.org/articles/14/5977/2021/) @LiamBindle may I confirm if you didn't see as much process synchronization time in your run(s) or it was added into the chemistry time? I suppose in the context of MPI it can be argued that synchronization time following chemistry is still chemistry, because some processors are working on chemistry while others are waiting in idle. I was just curious as to how I can interpret this into a breakdown of "chemistry", "dynamics", "data input", and "other" as you did so I can evaluate the speed-up of GCHP when using an optimized version of KPP.

lizziel commented 2 years ago

@jimmielin, could you also report number of cores used, grid resolution, simulation, and duration of run?

jimmielin commented 2 years ago

Hi @lizziel sorry I forgot to report this. Here's my runConfig.sh: it's a standard c24 run for 14 days using 48 cores on huce_cascade.

TOTAL_CORES=48
NUM_NODES=1
NUM_CORES_PER_NODE=48

CS_RES=24
STRETCH_GRID=OFF
STRETCH_FACTOR=2.0
TARGET_LAT=-45.0
TARGET_LON=170.0

Start_Time="20190701 000000"
End_Time="20190715 000000"
Duration="00000014 000000"

# input.geos:
Turn_on_Chemistry=T
Turn_on_Dry_Deposition=T
Turn_on_Wet_Deposition=T
Turn_on_Transport=T
Turn_on_Cloud_Conv=T
Turn_on_PBL_Mixing=T
Turn_on_Non_Local_Mixing=T
#
# HEMCO_Config.rc:
Turn_on_Base_Emissions=true

#---------------------------------------------------------------------
#    Diagnostic frequency, duration, and monthly mean
#---------------------------------------------------------------------
AutoUpdate_Diagnostics=ON

# Monthly diagnostics: '0' for off; '1' for on
timeAvg_monthly="0"

# Frequency and duration (ignored if monthly diagnostics on)
timeAvg_freq="060000"
timeAvg_dur="060000"
LiamBindle commented 2 years ago

@jimmielin Yeah, Table 3 would have lumped Run exlusive into Chemistry. If you aren't specifically interested in the load imbalance, I think it's reasonable to count it under chemsitry since that's the source of the imbalance.

lizziel commented 2 years ago

@jimmielin, I will close this issue now but feel free to reopen if you have more questions.