Closed yantosca closed 3 years ago
For some background, here is a slide showing the relative cost of various floating point operations (normalized to +, which only takes a couple of CPU cycles).
I used the TAU Profiler to assess the performance of the KPP rate law functions with useless computations removed vs. the dev/13.0.0-final branch. Results are below.
SLURM JobID # : 16102823
Job Name : geoschem.run
Submit time : 2021-02-05 06:38:23
Start time : 2021-02-05 06:38:31
End time : 2021-02-05 07:26:57
Partition : huce_casc+
Node : holy7c20409
CPUs : 48
Memory : 0.0364 GB
CPU Time : 1-09:26:31 ( 120391 s)
Wall Time : 00:48:26 ( 2906 s)
CPU Time / Wall Time : 41.4284 ( 86.31% ideal)
Ref Timer name DD-hh:mm:ss.SSS Total Seconds
-------------------------------------------------------------------------------
GEOS-Chem : 00-00:46:28.078 2788.078
Initialization : 00-00:01:01.187 61.188
HEMCO : 00-00:08:04.765 484.766
All chemistry : 00-00:19:17.046 1157.047
=> FlexChem : 00-00:13:45.171 825.172
-> FlexChem loop : 00-00:13:26.093 806.094
-> Init KPP : 00-00:00:03.647 3.648
-> Het chem rates : 00-00:00:49.864 49.865
-> Photolysis rates : 00-00:00:04.626 4.626
-> KPP : 00-00:08:15.338 495.338
RCONST : 00-00:01:07.659 67.660
Integrate 1 : 00-00:06:49.706 409.707
Integrate 2 : >>>>> THE TIMER DID NOT RUN <<<<<
-> Prod/loss diags : 00-00:00:03.707 3.707
-> OH reactivity diag : >>>>> THE TIMER DID NOT RUN <<<<<
=> FAST-JX photolysis : 00-00:01:20.468 80.469
=> Aerosol chem : 00-00:03:56.140 236.141
=> Linearized strat chem : 00-00:00:04.468 4.469
Transport : 00-00:04:25.656 265.656
Convection : 00-00:02:43.203 163.203
Boundary layer mixing : 00-00:04:14.359 254.359
Dry deposition : 00-00:00:06.968 6.969
Wet deposition : 00-00:01:09.765 69.766
All diagnostics : 00-00:05:24.921 324.922
=> HEMCO diagnostics : 00-00:00:00.015 0.016
=> ObsPack diagnostics : >>>>> THE TIMER DID NOT RUN <<<<<
=> History (netCDF diags) : 00-00:05:24.890 324.891
Unit conversions : 00-00:01:37.671 97.672
Input : 00-00:00:29.671 29.672
Output : 00-00:05:24.734 324.734
Finalization : 00-00:00:07.671 7.672
SLURM JobID # : 16119635
Job Name : geoschem.run
Submit time : 2021-02-05 07:55:53
Start time : 2021-02-05 07:56:31
End time : 2021-02-05 08:55:30
Partition : huce_casc+
Node : holy7c20210
CPUs : 48
Memory : 0.0346 GB
CPU Time : 1-14:21:24 ( 138084 s)
Wall Time : 00:58:59 ( 3539 s)
CPU Time / Wall Time : 39.0178 ( 81.29% ideal)
Dev Timer name DD-hh:mm:ss.SSS Total Seconds
-------------------------------------------------------------------------------
GEOS-Chem : 00-00:56:28.078 3388.078
Initialization : 00-00:01:00.906 60.906
HEMCO : 00-00:08:03.578 483.578
All chemistry : 00-00:29:03.343 1743.344
=> FlexChem : 00-00:22:26.578 1346.578
-> FlexChem loop : 00-00:22:06.046 1326.047
-> Init KPP : 00-00:00:06.705 6.706
-> Het chem rates : 00-00:00:45.867 45.868
-> Photolysis rates : 00-00:00:04.769 4.769
-> KPP : 00-**:45:52.*** **************
RCONST : 00-00:00:33.606 33.606
Integrate 1 : 00-00:12:20.858 740.859
Integrate 2 : >>>>> THE TIMER DID NOT RUN <<<<<
-> Prod/loss diags : 00-00:00:03.694 3.695
-> OH reactivity diag : >>>>> THE TIMER DID NOT RUN <<<<<
=> FAST-JX photolysis : 00-00:01:21.781 81.781
=> Aerosol chem : 00-00:04:58.046 298.047
=> Linearized strat chem : 00-00:00:05.515 5.516
Transport : 00-00:04:28.328 268.328
Convection : 00-00:02:44.562 164.562
Boundary layer mixing : 00-00:04:19.375 259.375
Dry deposition : 00-00:00:06.531 6.531
Wet deposition : 00-00:01:12.437 72.438
All diagnostics : 00-00:05:27.953 327.953
=> HEMCO diagnostics : >>>>> THE TIMER DID NOT RUN <<<<<
=> ObsPack diagnostics : >>>>> THE TIMER DID NOT RUN <<<<<
=> History (netCDF diags) : 00-00:05:27.953 327.953
Unit conversions : 00-00:01:42.906 102.906
Input : 00-00:00:29.203 29.203
Output : 00-00:05:27.812 327.812
Finalization : 00-00:00:07.734 7.734
NOTE: Bug in placement of KPP timer call, will be fixed in next version
Metric | Ref | Dev |
---|---|---|
Mass-weighted mean OH concentration | 10.90705577712 x 10^5 molec cm-3 | 10.90701569309 x 10^5 molec cm-3 |
CH3CCl3 lifetime w/r/t tropospheric OH | 5.7756 years | 5.7757 years |
CH4 lifetime w/r/t tropospheric OH | 9.7264 years | 9.7264 years |
7-day mean species concentration, summed over all grid boxes
Variable Ref=refrun_7d_omp3 Dev=devrun_7d_omp3 Dev-Ref
SpeciesConc_A3O2 : 1.5073348e-09 | 1.5073339e-09 | -8.881784197001252e-16
SpeciesConc_ACET : 0.00017001003 | 0.00017001003 | 0.0
SpeciesConc_ACTA : 4.8985858e-06 | 4.898587e-06 | 1.3642420526593924e-12
SpeciesConc_AERI : 6.854665e-08 | 6.854668e-08 | 3.552713678800501e-14
SpeciesConc_ALD2 : 8.112423e-06 | 8.112424e-06 | 1.8189894035458565e-12
SpeciesConc_ALK4 : 1.9099514e-05 | 1.9099509e-05 | -5.4569682106375694e-12
SpeciesConc_ATO2 : 8.129286e-09 | 8.129294e-09 | 7.993605777301127e-15
SpeciesConc_ATOOH : 2.2103636e-06 | 2.2103638e-06 | 2.2737367544323206e-13
SpeciesConc_B3O2 : 5.4175526e-09 | 5.4175473e-09 | -5.329070518200751e-15
SpeciesConc_BCPI : 6.386458e-06 | 6.386458e-06 | 0.0
SpeciesConc_BCPO : 1.6477707e-06 | 1.6477707e-06 | 0.0
SpeciesConc_BENZ : 7.721492e-06 | 7.721494e-06 | 1.8189894035458565e-12
SpeciesConc_BRO2 : 1.6369257e-09 | 1.6369248e-09 | -8.881784197001252e-16
SpeciesConc_Br : 6.5420386e-07 | 6.5420386e-07 | 0.0
SpeciesConc_Br2 : 4.294419e-08 | 4.2943796e-08 | -3.943512183468556e-13
SpeciesConc_BrCl : 2.9278328e-07 | 2.9278195e-07 | -1.3358203432289883e-12
SpeciesConc_BrNO2 : 3.2915843e-09 | 3.291584e-09 | -4.440892098500626e-16
SpeciesConc_BrNO3 : 4.2179417e-07 | 4.217947e-07 | 5.400124791776761e-13
SpeciesConc_BrO : 9.697598e-07 | 9.697593e-07 | -4.547473508864641e-13
SpeciesConc_BrSALA : 1.294285e-08 | 1.2943e-08 | 1.4921397450962104e-13
SpeciesConc_BrSALC : 1.02111954e-07 | 1.02112246e-07 | 2.913225216616411e-13
SpeciesConc_C2H6 : 9.7722106e-05 | 9.772209e-05 | -1.4551915228366852e-11
SpeciesConc_C3H8 : 1.24340195e-05 | 1.2434018e-05 | -1.8189894035458565e-12
SpeciesConc_C4HVP1 : 1.344095e-11 | 1.344092e-11 | -2.949029909160572e-17
SpeciesConc_C4HVP2 : 4.4018376e-11 | 4.401823e-11 | -1.457167719820518e-16
SpeciesConc_CCl4 : 9.928283e-06 | 9.928283e-06 | 0.0
SpeciesConc_CFC11 : 2.8912835e-05 | 2.8912835e-05 | 0.0
SpeciesConc_CFC113 : 9.5496935e-06 | 9.5496935e-06 | 0.0
SpeciesConc_CFC114 : 2.5807567e-06 | 2.5807567e-06 | 0.0
SpeciesConc_CFC115 : 1.8629286e-06 | 1.8629286e-06 | 0.0
SpeciesConc_CFC12 : 7.1108894e-05 | 7.1108894e-05 | 0.0
SpeciesConc_CH2Br2 : 9.480521e-08 | 9.4805195e-08 | -1.4210854715202004e-14
SpeciesConc_CH2Cl2 : 4.069369e-06 | 4.069369e-06 | 0.0
SpeciesConc_CH2I2 : 1.6246651e-10 | 1.6246651e-10 | 0.0
SpeciesConc_CH2IBr : 3.5140446e-10 | 3.514045e-10 | 2.7755575615628914e-17
SpeciesConc_CH2ICl : 2.516635e-09 | 2.516635e-09 | 0.0
SpeciesConc_CH2O : 3.10896e-05 | 3.10896e-05 | 0.0
SpeciesConc_CH2OO : 2.910663e-14 | 2.9106617e-14 | -1.3552527156068805e-20
SpeciesConc_CH3Br : 8.0297417e-07 | 8.0297417e-07 | 0.0
SpeciesConc_CH3CCl3 : 2.6677307e-07 | 2.6677307e-07 | 0.0
SpeciesConc_CH3CHOO : 2.6855886e-13 | 2.6855905e-13 | 1.8973538018496328e-19
SpeciesConc_CH3Cl : 7.0313225e-05 | 7.0313225e-05 | 0.0
SpeciesConc_CH3I : 3.921293e-08 | 3.9212928e-08 | -3.552713678800501e-15
SpeciesConc_CH4 : 0.29982316 | 0.29982322 | 5.960464477539063e-08
SpeciesConc_CHBr3 : 7.739167e-08 | 7.739167e-08 | 0.0
SpeciesConc_CHCl3 : 1.1112453e-06 | 1.1112453e-06 | 0.0
SpeciesConc_CLOCK : 1811767900000.0 | 1811767900000.0 | 0.0
SpeciesConc_CO : 0.010680627 | 0.010680627 | 0.0
SpeciesConc_CO2 : 3.4490836e-06 | 3.4490793e-06 | -4.320099833421409e-12
SpeciesConc_Cl : 2.5359993e-06 | 2.5359993e-06 | 0.0
SpeciesConc_Cl2 : 1.775434e-06 | 1.775439e-06 | 5.002220859751105e-12
SpeciesConc_Cl2O2 : 2.1621177e-06 | 2.1621074e-06 | -1.0231815394945443e-11
SpeciesConc_ClNO2 : 7.3256534e-09 | 7.3256556e-09 | 2.220446049250313e-15
SpeciesConc_ClNO3 : 2.739727e-05 | 2.7397316e-05 | 4.729372449219227e-11
SpeciesConc_ClO : 1.3318807e-05 | 1.3318772e-05 | -3.456079866737127e-11
SpeciesConc_ClOO : 6.488198e-11 | 6.488199e-11 | 6.938893903907228e-18
SpeciesConc_DMS : 6.1525543e-06 | 6.152555e-06 | 9.094947017729282e-13
SpeciesConc_DST1 : 9.1736416e-05 | 9.1736416e-05 | 0.0
SpeciesConc_DST2 : 9.420402e-05 | 9.420402e-05 | 0.0
SpeciesConc_DST3 : 0.0001495286 | 0.0001495286 | 0.0
SpeciesConc_DST4 : 0.00010904716 | 0.00010904716 | 0.0
SpeciesConc_EOH : 3.3119904e-06 | 3.31199e-06 | -4.547473508864641e-13
SpeciesConc_ETHLN : 7.680528e-09 | 7.680507e-09 | -2.1316282072803006e-14
SpeciesConc_ETNO3 : 2.5880027e-07 | 2.5880036e-07 | 8.526512829121202e-14
SpeciesConc_ETO2 : 6.885982e-09 | 6.8859793e-09 | -2.6645352591003757e-15
SpeciesConc_ETP : 1.5796717e-06 | 1.5796718e-06 | 1.1368683772161603e-13
SpeciesConc_GLYC : 1.0556133e-06 | 1.055613e-06 | -3.410605131648481e-13
SpeciesConc_GLYX : 7.744487e-08 | 7.7444824e-08 | -4.263256414560601e-14
SpeciesConc_H : 2.4845294e-08 | 2.4845296e-08 | 1.7763568394002505e-15
SpeciesConc_H1211 : 4.1805941e-07 | 4.1805941e-07 | 0.0
SpeciesConc_H1301 : 4.356534e-07 | 4.356534e-07 | 0.0
SpeciesConc_H2 : 0.09770402 | 0.09770402 | 0.0
SpeciesConc_H2402 : 4.9276956e-08 | 4.9276956e-08 | 0.0
SpeciesConc_H2O : 605.02454 | 605.02454 | 0.0
SpeciesConc_H2O2 : 5.0814597e-05 | 5.0814626e-05 | 2.9103830456733704e-11
SpeciesConc_HAC : 1.9350687e-06 | 1.9350684e-06 | -2.2737367544323206e-13
SpeciesConc_HBr : 1.2455374e-07 | 1.2455334e-07 | -3.979039320256561e-13
SpeciesConc_HC5A : 3.8371372e-08 | 3.837134e-08 | -3.197442310920451e-14
SpeciesConc_HCFC123 : 1.1198907e-26 | 1.1198907e-26 | 0.0
SpeciesConc_HCFC141b : 3.323583e-06 | 3.323583e-06 | 0.0
SpeciesConc_HCFC142b : 3.855489e-06 | 3.855489e-06 | 0.0
SpeciesConc_HCFC22 : 4.079312e-05 | 4.079312e-05 | 0.0
SpeciesConc_HCOOH : 3.943688e-06 | 3.9436895e-06 | 1.3642420526593924e-12
SpeciesConc_HCl : 0.0002608105 | 0.00026081048 | -2.9103830456733704e-11
SpeciesConc_HI : 1.3974245e-09 | 1.3974212e-09 | -3.3306690738754696e-15
SpeciesConc_HMHP : 5.5293265e-07 | 5.5293276e-07 | 1.1368683772161603e-13
SpeciesConc_HMML : 1.2449348e-07 | 1.2449358e-07 | 9.947598300641403e-14
SpeciesConc_HNO2 : 2.085627e-07 | 2.085624e-07 | -2.984279490192421e-13
SpeciesConc_HNO3 : 0.00027082945 | 0.00027082904 | -4.0745362639427185e-10
SpeciesConc_HNO4 : 5.250747e-06 | 5.2507426e-06 | -4.547473508864641e-12
SpeciesConc_HO2 : 3.4715563e-06 | 3.4715567e-06 | 4.547473508864641e-13
SpeciesConc_HOBr : 1.856996e-07 | 1.8569973e-07 | 1.2789769243681803e-13
SpeciesConc_HOCl : 4.992537e-06 | 4.99258e-06 | 4.320099833421409e-11
SpeciesConc_HOI : 3.709452e-08 | 3.7094598e-08 | 7.815970093361102e-14
SpeciesConc_HONIT : 1.5436838e-08 | 1.5436841e-08 | 3.552713678800501e-15
SpeciesConc_HPALD1 : 1.7814134e-08 | 1.7814141e-08 | 7.105427357601002e-15
SpeciesConc_HPALD1OO : 2.9613336e-11 | 2.9613333e-11 | -3.469446951953614e-18
SpeciesConc_HPALD2 : 5.7168e-08 | 5.7167984e-08 | -1.4210854715202004e-14
SpeciesConc_HPALD2OO : 1.0042704e-10 | 1.0042699e-10 | -4.85722573273506e-17
SpeciesConc_HPALD3 : 1.505839e-08 | 1.5058392e-08 | 1.7763568394002505e-15
SpeciesConc_HPALD4 : 4.144201e-08 | 4.1441997e-08 | -1.4210854715202004e-14
SpeciesConc_HPETHNL : 4.150748e-08 | 4.150747e-08 | -1.0658141036401503e-14
SpeciesConc_I : 1.631796e-08 | 1.6317959e-08 | -1.7763568394002505e-15
SpeciesConc_I2 : 1.3201323e-10 | 1.3201329e-10 | 5.551115123125783e-17
SpeciesConc_I2O2 : 8.249814e-11 | 8.2498466e-11 | 3.2612801348363973e-16
SpeciesConc_I2O3 : 4.186887e-10 | 4.1868667e-10 | -2.0261570199409107e-15
SpeciesConc_I2O4 : 2.2240976e-11 | 2.2239868e-11 | -1.1084883011491797e-15
SpeciesConc_IBr : 1.0656108e-09 | 1.0655772e-09 | -3.3639757646142243e-14
SpeciesConc_ICHE : 9.5115965e-08 | 9.5115944e-08 | -2.1316282072803006e-14
SpeciesConc_ICHOO : 6.850254e-12 | 6.8502578e-12 | 3.903127820947816e-18
SpeciesConc_ICN : 5.8332816e-08 | 5.8332756e-08 | -6.039613253960852e-14
SpeciesConc_ICNOO : 2.206408e-10 | 2.206401e-10 | -6.938893903907228e-16
SpeciesConc_ICPDH : 2.1435431e-08 | 2.1435437e-08 | 5.329070518200751e-15
SpeciesConc_ICl : 8.292727e-09 | 8.292725e-09 | -2.6645352591003757e-15
SpeciesConc_IDC : 4.5451184e-08 | 4.5451163e-08 | -2.1316282072803006e-14
SpeciesConc_IDCHP : 1.31170514e-08 | 1.31170514e-08 | 0.0
SpeciesConc_IDHDP : 4.533032e-08 | 4.5330314e-08 | -7.105427357601002e-15
SpeciesConc_IDHNBOO : 4.5150745e-09 | 4.515092e-09 | 1.7319479184152442e-14
SpeciesConc_IDHNDOO1 : 7.747121e-12 | 7.747132e-12 | 1.1275702593849246e-17
SpeciesConc_IDHNDOO2 : 7.789414e-12 | 7.789416e-12 | 1.734723475976807e-18
SpeciesConc_IDHPE : 2.079703e-07 | 2.0797029e-07 | -1.4210854715202004e-14
SpeciesConc_IDN : 1.1843922e-08 | 1.1843837e-08 | -8.526512829121202e-14
SpeciesConc_IDNOO : 6.687698e-12 | 6.687642e-12 | -5.594483210025203e-17
SpeciesConc_IEPOXA : 6.938548e-07 | 6.9385493e-07 | 1.1368683772161603e-13
SpeciesConc_IEPOXAOO : 3.6544327e-11 | 3.6544316e-11 | -1.0408340855860843e-17
SpeciesConc_IEPOXB : 3.9932186e-07 | 3.9932195e-07 | 8.526512829121202e-14
SpeciesConc_IEPOXBOO : 1.4625547e-11 | 1.4625542e-11 | -5.204170427930421e-18
SpeciesConc_IEPOXD : 3.5711096e-08 | 3.5711086e-08 | -1.0658141036401503e-14
SpeciesConc_IHN1 : 1.11093215e-08 | 1.1109301e-08 | -2.042810365310288e-14
SpeciesConc_IHN2 : 1.676428e-08 | 1.6764258e-08 | -2.3092638912203256e-14
SpeciesConc_IHN3 : 1.48638195e-08 | 1.4863826e-08 | 6.217248937900877e-15
SpeciesConc_IHN4 : 2.7609388e-09 | 2.7609341e-09 | -4.6629367034256575e-15
SpeciesConc_IHOO1 : 9.202143e-09 | 9.202142e-09 | -8.881784197001252e-16
SpeciesConc_IHOO4 : 2.5301403e-09 | 2.5301414e-09 | 1.1102230246251565e-15
SpeciesConc_IHPNBOO : 5.6970834e-12 | 5.697081e-12 | -2.6020852139652106e-18
SpeciesConc_IHPNDOO : 2.2034603e-11 | 2.2034577e-11 | -2.6020852139652106e-17
SpeciesConc_IHPOO1 : 7.9493884e-11 | 7.9493856e-11 | -2.7755575615628914e-17
SpeciesConc_IHPOO2 : 1.936254e-11 | 1.9362548e-11 | 8.673617379884035e-18
SpeciesConc_IHPOO3 : 1.1914021e-10 | 1.1914024e-10 | 2.7755575615628914e-17
SpeciesConc_INA : 1.2893193e-17 | 1.28932146e-17 | 2.150669592637872e-23
SpeciesConc_INDIOL : 4.6126215e-06 | 4.6126215e-06 | 0.0
SpeciesConc_INO : 1.0768647e-11 | 1.0768593e-11 | -5.4643789493269423e-17
SpeciesConc_INO2B : 5.9356746e-09 | 5.935704e-09 | 2.930988785010413e-14
SpeciesConc_INO2D : 4.584526e-09 | 4.5845483e-09 | 2.220446049250313e-14
SpeciesConc_INPB : 2.0133172e-08 | 2.0133129e-08 | -4.263256414560601e-14
SpeciesConc_INPD : 2.221306e-08 | 2.2212992e-08 | -6.750155989720952e-14
SpeciesConc_IO : 1.8674838e-08 | 1.8674836e-08 | -1.7763568394002505e-15
SpeciesConc_IONITA : 4.583741e-08 | 4.5837393e-08 | -1.7763568394002505e-14
SpeciesConc_IONO : 8.5785895e-10 | 8.578429e-10 | -1.6042722705833512e-14
SpeciesConc_IONO2 : 1.7657648e-08 | 1.7657669e-08 | 2.1316282072803006e-14
SpeciesConc_IPRNO3 : 2.3726938e-07 | 2.3726943e-07 | 5.684341886080802e-14
SpeciesConc_ISALA : 7.671415e-09 | 7.671395e-09 | -1.9539925233402755e-14
SpeciesConc_ISALC : 4.689541e-09 | 4.6895337e-09 | -7.549516567451064e-15
SpeciesConc_ISOP : 2.8535464e-06 | 2.8535478e-06 | 1.3642420526593924e-12
SpeciesConc_ISOPNOO1 : 2.8340656e-11 | 2.8340667e-11 | 1.0408340855860843e-17
SpeciesConc_ISOPNOO2 : 2.5500598e-11 | 2.550063e-11 | 3.122502256758253e-17
SpeciesConc_ITCN : 1.5890372e-08 | 1.5890363e-08 | -8.881784197001252e-15
SpeciesConc_ITHN : 3.7235566e-08 | 3.723552e-08 | -4.618527782440651e-14
SpeciesConc_KO2 : 1.1812668e-08 | 1.1812665e-08 | -3.552713678800501e-15
SpeciesConc_LBRO2H : 1.6923642e-06 | 1.6923643e-06 | 1.1368683772161603e-13
SpeciesConc_LBRO2N : 1.324323e-06 | 1.3243202e-06 | -2.7284841053187847e-12
SpeciesConc_LCH4 : 1.829087e-06 | 1.8290865e-06 | -4.547473508864641e-13
SpeciesConc_LCO : 3.1677687e-06 | 3.1677641e-06 | -4.547473508864641e-12
SpeciesConc_LIMO : 1.0944953e-08 | 1.09449605e-08 | 7.105427357601002e-15
SpeciesConc_LIMO2 : 2.03324e-10 | 2.0332393e-10 | -6.938893903907228e-17
SpeciesConc_LISOPNO3 : 1.6943817e-08 | 1.6943789e-08 | -2.842170943040401e-14
SpeciesConc_LISOPOH : 1.939704e-07 | 1.9397034e-07 | -5.684341886080802e-14
SpeciesConc_LNRO2H : 3.1392792e-15 | 3.1392767e-15 | -2.541098841762901e-21
SpeciesConc_LNRO2N : 3.9500094e-15 | 3.95001e-15 | 4.235164736271502e-22
SpeciesConc_LOx : 0.0025104273 | 0.0025104273 | 0.0
SpeciesConc_LTRO2H : 9.306493e-07 | 9.3064915e-07 | -1.7053025658242404e-13
SpeciesConc_LTRO2N : 1.2669665e-06 | 1.2669663e-06 | -2.2737367544323206e-13
SpeciesConc_LVOC : 3.6051256e-10 | 3.605121e-10 | -4.718447854656915e-16
SpeciesConc_LVOCOA : 9.26511e-08 | 9.265106e-08 | -4.263256414560601e-14
SpeciesConc_LXRO2H : 3.6937772e-07 | 3.6937774e-07 | 2.842170943040401e-14
SpeciesConc_LXRO2N : 9.545257e-07 | 9.545257e-07 | 0.0
SpeciesConc_MACR : 1.1534095e-06 | 1.1534095e-06 | 0.0
SpeciesConc_MACR1OO : 6.082224e-10 | 6.0822153e-10 | -8.881784197001252e-16
SpeciesConc_MACR1OOH : 4.9530666e-08 | 4.9530676e-08 | 1.0658141036401503e-14
SpeciesConc_MACRNO2 : 3.3421742e-12 | 3.3421783e-12 | 4.119968255444917e-18
SpeciesConc_MAP : 5.2784267e-06 | 5.2784294e-06 | 2.7284841053187847e-12
SpeciesConc_MCO3 : 1.9888374e-08 | 1.988839e-08 | 1.5987211554602254e-14
SpeciesConc_MCRDH : 7.481803e-08 | 7.4818054e-08 | 2.1316282072803006e-14
SpeciesConc_MCRENOL : 1.3894064e-08 | 1.389406e-08 | -3.552713678800501e-15
SpeciesConc_MCRHN : 2.227734e-09 | 2.2277358e-09 | 1.7763568394002505e-15
SpeciesConc_MCRHNB : 7.320646e-09 | 7.320592e-09 | -5.3734794391857577e-14
SpeciesConc_MCRHP : 6.1903194e-08 | 6.190318e-08 | -1.4210854715202004e-14
SpeciesConc_MCROHOO : 4.8687106e-11 | 4.86871e-11 | -6.938893903907228e-18
SpeciesConc_MEK : 8.744737e-06 | 8.74474e-06 | 2.7284841053187847e-12
SpeciesConc_MENO3 : 5.5629187e-07 | 5.56292e-07 | 1.1368683772161603e-13
SpeciesConc_MGLY : 5.2363265e-07 | 5.236324e-07 | -2.2737367544323206e-13
SpeciesConc_MO2 : 5.962356e-07 | 5.9623557e-07 | -5.684341886080802e-14
SpeciesConc_MOH : 4.2313055e-05 | 4.2313048e-05 | -7.275957614183426e-12
SpeciesConc_MONITA : 6.0623293e-09 | 6.0623284e-09 | -8.881784197001252e-16
SpeciesConc_MONITS : 1.3743057e-08 | 1.3743056e-08 | -1.7763568394002505e-15
SpeciesConc_MONITU : 1.8386385e-09 | 1.8386376e-09 | -8.881784197001252e-16
SpeciesConc_MP : 4.537546e-05 | 4.5375455e-05 | -3.637978807091713e-12
SpeciesConc_MPAN : 6.746832e-08 | 6.746838e-08 | 5.684341886080802e-14
SpeciesConc_MPN : 5.2643595e-06 | 5.2643554e-06 | -4.092726157978177e-12
SpeciesConc_MSA : 6.351092e-07 | 6.351092e-07 | 0.0
SpeciesConc_MTPA : 2.7747421e-07 | 2.7747433e-07 | 1.1368683772161603e-13
SpeciesConc_MTPO : 1.0865065e-07 | 1.0865068e-07 | 2.842170943040401e-14
SpeciesConc_MVK : 2.0747066e-06 | 2.0747057e-06 | -9.094947017729282e-13
SpeciesConc_MVKDH : 3.7804625e-07 | 3.7804637e-07 | 1.1368683772161603e-13
SpeciesConc_MVKHC : 1.2195021e-07 | 1.2195021e-07 | 0.0
SpeciesConc_MVKHCB : 1.04707624e-07 | 1.04707624e-07 | 0.0
SpeciesConc_MVKHP : 1.2856889e-07 | 1.2856889e-07 | 0.0
SpeciesConc_MVKN : 1.3970917e-08 | 1.39709195e-08 | 2.6645352591003757e-15
SpeciesConc_MVKOHOO : 5.166217e-09 | 5.1662137e-09 | -3.1086244689504383e-15
SpeciesConc_MVKPC : 3.7494566e-08 | 3.7494548e-08 | -1.7763568394002505e-14
SpeciesConc_N : 2.7872848e-10 | 2.787285e-10 | 2.7755575615628914e-17
SpeciesConc_N2 : 152574.56 | 152574.56 | 0.0
SpeciesConc_N2O : 0.04729683 | 0.04729683 | 0.0
SpeciesConc_N2O5 : 2.9692097e-05 | 2.9692108e-05 | 1.0913936421275139e-11
SpeciesConc_NAP : 5.0156987e-16 | 5.0157183e-16 | 1.9587636905255695e-21
SpeciesConc_NH3 : 1.1689421e-05 | 1.1689383e-05 | -3.728928277269006e-11
SpeciesConc_NH4 : 1.6433529e-05 | 1.643356e-05 | 3.092281986027956e-11
SpeciesConc_NIT : 7.3448487e-06 | 7.345165e-06 | 3.1650415621697903e-10
SpeciesConc_NITs : 1.6229379e-06 | 1.6229383e-06 | 3.410605131648481e-13
SpeciesConc_NO : 0.00038284084 | 0.00038284084 | 0.0
SpeciesConc_NO2 : 0.00034411164 | 0.00034411164 | 0.0
SpeciesConc_NO3 : 2.2070844e-06 | 2.2070851e-06 | 6.821210263296962e-13
SpeciesConc_NPRNO3 : 6.073425e-08 | 6.0734266e-08 | 1.4210854715202004e-14
SpeciesConc_NRO2 : 2.254625e-16 | 2.2546283e-16 | 3.441071348220595e-22
SpeciesConc_O : 0.0005328089 | 0.0005328089 | 0.0
SpeciesConc_O1D : 7.073275e-11 | 7.073275e-11 | 0.0
SpeciesConc_O2 : 40937.98 | 40937.98 | 0.0
SpeciesConc_O3 : 0.40913835 | 0.40913835 | 0.0
SpeciesConc_OCPI : 4.4510627e-05 | 4.4510627e-05 | 0.0
SpeciesConc_OCPO : 8.271655e-06 | 8.271655e-06 | 0.0
SpeciesConc_OCS : 6.511356e-05 | 6.511356e-05 | 0.0
SpeciesConc_OClO : 8.8407427e-07 | 8.840627e-07 | -1.1596057447604835e-11
SpeciesConc_OH : 4.173913e-06 | 4.1739136e-06 | 4.547473508864641e-13
SpeciesConc_OIO : 3.4973133e-09 | 3.4972856e-09 | -2.7755575615628914e-14
SpeciesConc_OLND : 1.9704007e-09 | 1.9704003e-09 | -4.440892098500626e-16
SpeciesConc_OLNN : 2.6162356e-10 | 2.6162353e-10 | -2.7755575615628914e-17
SpeciesConc_OTHRO2 : 9.778939e-09 | 9.778926e-09 | -1.3322676295501878e-14
SpeciesConc_PAN : 8.04386e-06 | 8.043861e-06 | 9.094947017729282e-13
SpeciesConc_PCO : 2.694404e-06 | 2.6944035e-06 | -4.547473508864641e-13
SpeciesConc_PH2O2 : 6.542564e-07 | 6.542566e-07 | 2.2737367544323206e-13
SpeciesConc_PIO2 : 2.4084585e-09 | 2.4084572e-09 | -1.3322676295501878e-15
SpeciesConc_PIP : 7.6556034e-07 | 7.655602e-07 | -1.7053025658242404e-13
SpeciesConc_PO2 : 3.5877883e-09 | 3.5877854e-09 | -2.886579864025407e-15
SpeciesConc_POx : 0.002511227 | 0.002511227 | 0.0
SpeciesConc_PP : 3.8874563e-07 | 3.8874572e-07 | 8.526512829121202e-14
SpeciesConc_PPN : 1.2329156e-06 | 1.2329157e-06 | 1.1368683772161603e-13
SpeciesConc_PRN1 : 1.4278808e-09 | 1.4278839e-09 | 3.1086244689504383e-15
SpeciesConc_PROPNN : 8.439241e-08 | 8.439219e-08 | -2.2026824808563106e-13
SpeciesConc_PRPE : 1.5127832e-06 | 1.5127843e-06 | 1.0231815394945443e-12
SpeciesConc_PRPN : 4.423394e-09 | 4.423396e-09 | 1.7763568394002505e-15
SpeciesConc_PSO4 : 9.988107e-09 | 9.9881055e-09 | -1.7763568394002505e-15
SpeciesConc_PYAC : 2.9168117e-08 | 2.9168087e-08 | -3.019806626980426e-14
SpeciesConc_R4N1 : 1.1661354e-10 | 1.1661365e-10 | 1.1102230246251565e-16
SpeciesConc_R4N2 : 2.1221092e-07 | 2.1221089e-07 | -2.842170943040401e-14
SpeciesConc_R4O2 : 1.1953593e-08 | 1.1953577e-08 | -1.5987211554602254e-14
SpeciesConc_R4P : 6.364866e-07 | 6.36486e-07 | -6.252776074688882e-13
SpeciesConc_RA3P : 2.3240496e-07 | 2.3240509e-07 | 1.2789769243681803e-13
SpeciesConc_RB3P : 5.536681e-07 | 5.536684e-07 | 2.8421709430404007e-13
SpeciesConc_RCHO : 1.0937521e-06 | 1.0937524e-06 | 3.410605131648481e-13
SpeciesConc_RCO3 : 2.0894324e-09 | 2.0894328e-09 | 4.440892098500626e-16
SpeciesConc_RCOOH : 1.9540794e-15 | 1.9540794e-15 | 0.0
SpeciesConc_RIPA : 6.1311675e-07 | 6.131165e-07 | -2.2737367544323206e-13
SpeciesConc_RIPB : 1.2943111e-07 | 1.294311e-07 | -1.4210854715202004e-14
SpeciesConc_RIPC : 1.8536946e-08 | 1.8536932e-08 | -1.4210854715202004e-14
SpeciesConc_RIPD : 7.929148e-09 | 7.929144e-09 | -4.440892098500626e-15
SpeciesConc_ROH : 7.780372e-08 | 7.78037e-08 | -2.1316282072803006e-14
SpeciesConc_RP : 4.449378e-07 | 4.4493785e-07 | 5.684341886080802e-14
SpeciesConc_SALA : 1.1958645e-05 | 1.1958645e-05 | 0.0
SpeciesConc_SALAAL : 9.312363e-08 | 9.312533e-08 | 1.6981971384666394e-12
SpeciesConc_SALACL : 2.8998e-06 | 2.8997981e-06 | -1.8189894035458565e-12
SpeciesConc_SALC : 0.00013495881 | 0.00013495881 | 0.0
SpeciesConc_SALCAL : 2.7237016e-05 | 2.7237009e-05 | -7.275957614183426e-12
SpeciesConc_SALCCL : 6.350825e-05 | 6.3508254e-05 | 7.275957614183426e-12
SpeciesConc_SO2 : 8.231267e-06 | 8.231267e-06 | 0.0
SpeciesConc_SO4 : 2.9016528e-05 | 2.9016524e-05 | -3.637978807091713e-12
SpeciesConc_SO4H1 : 4.3056e-26 | 4.3056e-26 | 0.0
SpeciesConc_SO4H2 : 4.3056e-26 | 4.3056e-26 | 0.0
SpeciesConc_SO4H3 : 4.3056e-26 | 4.3056e-26 | 0.0
SpeciesConc_SO4H4 : 4.3056e-26 | 4.3056e-26 | 0.0
SpeciesConc_SO4s : 3.3033018e-08 | 3.3033075e-08 | 5.684341886080802e-14
SpeciesConc_SOAGX : 3.8212812e-07 | 3.8212823e-07 | 1.1368683772161603e-13
SpeciesConc_SOAIE : 1.7106585e-06 | 1.7106582e-06 | -2.2737367544323206e-13
SpeciesConc_SOAP : 1.0303932e-06 | 1.0303932e-06 | 0.0
SpeciesConc_SOAS : 6.4829173e-06 | 6.4829173e-06 | 0.0
SpeciesConc_TOLU : 1.0836219e-06 | 1.083622e-06 | 1.1368683772161603e-13
SpeciesConc_TRO2 : 9.080887e-10 | 9.0808744e-10 | -1.27675647831893e-15
SpeciesConc_XRO2 : 4.183484e-10 | 4.183482e-10 | -2.220446049250313e-16
SpeciesConc_XYLE : 1.6607714e-07 | 1.660772e-07 | 5.684341886080802e-14
SpeciesConc_pFe : 1.3768651e-08 | 1.3768651e-08 | 0.0
Differences indicate numerical noise (1e-6 or 1e-7 times lower than values) caused by the forcing of REAL*8 precision in KPP rate law functions. This is to be expected
7-day mean KPP statistics, summed over all grid boxes
Variable Ref=refrun_7d_omp3 Dev=devrun_7d_omp3 Dev-Ref
AREA : 510065600000000.0 | 510065600000000.0 | 0.0
KppAccSteps : 502781.97 | 502790.22 | 8.25
KppIntCounts : 1723891.1 | 1723937.6 | 46.5
KppJacCounts : 502781.97 | 502790.22 | 8.25
KppLuDecomps : 610554.44 | 610573.8 | 19.375
KppRejSteps : 728.83923 | 730.14294 | 1.3037109375
KppSubsts : 2442217.8 | 2442295.2 | 77.5
KppTotSteps : 610554.44 | 610573.8 | 19.375
Thoughts? @msulprizio, @WilliamDowns, @sdeastham, @kilcolmu, @wporter, @davenhenze, @kelvinhb, @vshah, @LiamBindle
I think going forward I should restore the original rate law functions, as incurring a 9 minute/day penalty is not sustainable. Then using that we can start bringing other reactions from sulfate_mod etc. into KPP.
Tagging @kilicomu again for any thoughts (the username tagged above is incorrect)
Wow! This was a real surprise. I guess though that RCONST
was not taking the majority of the integration time, so fixing the rate law calculations (and it does look like this substantially improved that part) was only going to be able to yield a minor improvement at best. Is it possible that the slowdown in integration is due to the forcing to double precision? Presumably there is potentially some knock-on effect for situations where all of the inputs, internal calculations, or even outputs are REAL*4, even if the calculation at the end is nominally REAL*8.
I also have a ticket into FASRC to see if it is something to do with hyperthreading on the huce_cascade cluster. If you notice in Dev the RCONST timer is almost exactly a factor of 2 smaller than in Ref. That might have something to do with the hyperthreading.
@sdeastham: Even in the old code the output of the rate law functions was double precision (i.e.REAL(kind=dp) FUNCTION GCARR( A0,B0,C0 )
. So I don't think that's it.
For reference here are some articles about hyperthreading and OpenMP that I found online:
Also from Reference 4 above:
OpenMP and hyperthreading really don't go together. Your OpenMP compiler and runtime scheduler is highly unlikely to know enough about the sharing characteristics to do a good job of managing affinity and sharing. It's almost certainly treating your hyperthreaded CPU as a dual processor SMP; and that's a losing proposition because it's simply not.
This is interesting, and the hyperthreading does appear to be the most likely culprit. Could you rerun both cases with HT turned off?
Plamen Krastev suggests:
Hi Bob,
I have your request and I will answer your previous questions separately. Regarding the OpenMP hyperthreading questions, you can try exporting the OMP_CPU_AFFINITY environment variable, e.g.,
export OMP_CPU_AFFINITY="0-23"
This should restrict the OMP execution to the available physical cores only. You also need to include in the job submission script:
srun -c 24 ...
and also export the OMP_NUM_THREADS environment variable, i.e.,
export OMP_NUM_THREADS=24
This should work, but please let me know if you continue to have problems with this. Have a nice weekend.
Best, Plamen
Interesting! If all else fails, compiling without OpenMP and doing serial tests may shine some light too.
Rerunning with the new commands now. I wonder if this affects other studies that were done on huce_cascade, like the recent benchmark output (not for results but for timings).
Also more info from Plamen at FAS Research computing:
You may also try
export OMP_PLACES=cores export OMP_NUM_THREADS=24
instead of
export OMP_CPU_AFFINITY="0-23" export OMP_NUM_THREADS=24>
to pin the threads to physical cores only, as this is a change in the OpenMP 5.0 spec.
Also tagging @viral211 on this thread.
I ran a couple more profiling runs with hyperthreading turned off on the huce_cascade partition on slurm (see discussion thread #614). These are the results from the GEOS-Chem timers, which simply measure the time elapsed across certain blocks of code:
Ref (dev/13.0.0-final development branch)
---------------------------------------------------------------------
All chemistry : 00-00:32:04.015 1924.016
=> FlexChem : 00-00:23:13.593 1393.594
-> FlexChem loop : 00-00:22:42.015 1362.016
-> Init KPP : 00-00:00:07.282 7.283
-> Het chem rates : 00-00:01:40.944 100.944
-> Photolysis rates : 00-00:00:10.691 10.691
-> KPP : 00-00:16:31.617 991.618
RCONST : 00-00:02:15.709 135.710 <-- update reaction rates
Integrate 1 : 00-00:13:39.219 819.219 <-- call Rosenbrock solver
-> Prod/loss diags : 00-00:00:07.559 7.559
Dev (useless computations removed)
----------------------------------------------------------------------
All chemistry : 00-00:44:23.209 2663.210
=> FlexChem : 00-00:35:51.850 2151.850
-> FlexChem loop : 00-00:35:22.292 2122.293
-> Init KPP : 00-00:00:14.033 14.034
-> Het chem rates : 00-00:01:38.072 98.072
-> Photolysis rates : 00-00:00:11.080 11.080
-> KPP : 00-00:27:24.559 1644.560
RCONST : 00-00:01:10.356 70.357 <-- update reaction rates
Integrate 1 : 00-00:25:39.289 1539.290 <-- call Rosenbrock solver
-> Prod/loss diags : 00-00:00:07.639 7.640
As you can see, we drastically reduce the time spent in computing reaction rates but basically double the time spent in the Rosenbrock solver. This is a consistent result that has happened across many of my tests. I will investigate what happens if I restore the original rate law functions.
Perhaps it is an issue in parallelization (in the Dev code there are more THREADPRIVATE variables included in the gckpp_Global code. That might be causing this issue.
I was able to run simulations on our private node (Intel Haswell cores, no hyperthreading). In both cases the jobs were the only one running on the node.
===============================================================================
G E O S - C H E M T I M E R S
Timer name DD-hh:mm:ss.SSS Total Seconds
-------------------------------------------------------------------------------
GEOS-Chem : 00-01:19:50.632 4790.633
Initialization : 00-00:00:55.480 55.480
HEMCO : 00-00:09:15.957 555.957
All chemistry : 00-00:40:35.640 2435.641
=> FlexChem : 00-00:29:02.304 1742.305
-> FlexChem loop : 00-00:28:22.804 1702.805
-> Init KPP : 00-00:00:07.275 7.275
-> Het chem rates : 00-00:02:07.709 127.710
-> Photolysis rates : 00-00:00:08.967 8.967
-> KPP : 00-00:20:55.183 1255.184
RCONST : 00-00:02:41.072 161.072 <--- compute rxn rates
Integrate 1 : 00-00:17:24.631 1044.631 <--- Rosenbrock solver
Integrate 2 : >>>>> THE TIMER DID NOT RUN <<<<<
-> Prod/loss diags : 00-00:00:07.357 7.357
-> OH reactivity diag : >>>>> THE TIMER DID NOT RUN <<<<<
=> FAST-JX photolysis : 00-00:02:34.269 154.270
=> Aerosol chem : 00-00:08:26.617 506.617
=> Linearized strat chem : 00-00:00:09.480 9.480
Transport : 00-00:07:22.527 442.527
Convection : 00-00:04:15.332 255.332
Boundary layer mixing : 00-00:07:39.187 459.188
Dry deposition : 00-00:00:28.855 28.855
Wet deposition : 00-00:02:27.917 147.918
All diagnostics : 00-00:06:03.878 363.879
=> HEMCO diagnostics : 00-00:00:00.023 0.023
=> ObsPack diagnostics : >>>>> THE TIMER DID NOT RUN <<<<<
=> History (netCDF diags) : 00-00:06:03.859 363.859
Unit conversions : 00-00:03:24.289 204.289
Input : 00-00:00:35.640 35.641
Output : 00-00:06:03.535 363.535
Finalization : 00-00:00:09.796 9.797
===============================================================================
G E O S - C H E M T I M E R S
Timer name DD-hh:mm:ss.SSS Total Seconds
-------------------------------------------------------------------------------
GEOS-Chem : 00-01:36:55.019 5815.020
Initialization : 00-00:00:59.238 59.238
HEMCO : 00-00:09:20.628 560.629
All chemistry : 00-00:57:25.511 3445.512
=> FlexChem : 00-00:45:54.480 2754.480
-> FlexChem loop : 00-00:45:14.343 2714.344
-> Init KPP : 00-00:00:16.211 16.211
-> Het chem rates : 00-00:02:04.679 124.680
-> Photolysis rates : 00-00:00:09.288 9.288
-> KPP : 00-00:35:26.214 2126.214
RCONST : 00-00:01:33.282 93.282 <--- compute rxn rates
Integrate 1 : 00-00:33:07.628 1987.629 <--- Rosenbrock solver
Integrate 2 : >>>>> THE TIMER DID NOT RUN <<<<<
-> Prod/loss diags : 00-00:00:07.625 7.626
-> OH reactivity diag : >>>>> THE TIMER DID NOT RUN <<<<<
=> FAST-JX photolysis : 00-00:02:34.835 154.836
=> Aerosol chem : 00-00:08:22.753 502.754
=> Linearized strat chem : 00-00:00:10.128 10.129
Transport : 00-00:07:19.382 439.383
Convection : 00-00:04:16.531 256.531
Boundary layer mixing : 00-00:07:43.855 463.855
Dry deposition : 00-00:00:28.511 28.512
Wet deposition : 00-00:02:28.191 148.191
All diagnostics : 00-00:06:12.210 372.211
=> HEMCO diagnostics : 00-00:00:00.007 0.008
=> ObsPack diagnostics : >>>>> THE TIMER DID NOT RUN <<<<<
=> History (netCDF diags) : 00-00:06:12.218 372.219
Unit conversions : 00-00:03:30.085 210.086
Input : 00-00:00:37.035 37.035
Output : 00-00:06:11.894 371.895
Finalization : 00-00:00:08.039 8.039
The Dev code is faster in computing the reaction rates, but this somehow causes the integration to take almost 2x as long. I am still investigating.
I just discovered that the KPP code that I had built had an excess call to UPDATE_RCONST()
within the gckpp_Integrator.F90
module. This should not have been there. We call UPDATE_RCONST()
outside of the internal integration timestep loop... this is done from within the main loop in DO_FLEXCHEM
before the Integrator is called. These excess calls to UPDATE_RCONST
(one per internal integration iteration were slowing down the code and masking the gains obtained by streamlining the rate law functions.
I am doing a new profiling run with corrected code that should show the real impact of streamlining the rate law functions. I'll post the results here.
Here are the latest timing results (7 model-day simulations) The Dev run has the excess calls to UPDATE_RCONST() removed:
SLURM JobID # : 16918083
Job Name : geoschem.profile.run
Submit time : 2021-02-09 17:53:37
Start time : 2021-02-09 17:55:01
End time : 2021-02-09 19:03:43
Partition : huce_casc+
Node : holy7c20202
CPUs : 24
Memory : 0.0357 GB
CPU Time : 1-00:53:51 ( 89631 s)
Wall Time : 01:08:42 ( 4122 s)
CPU Time / Wall Time : 21.7445 ( 90.60% ideal)
===============================================================================
G E O S - C H E M T I M E R S
Timer name DD-hh:mm:ss.SSS Total Seconds
-------------------------------------------------------------------------------
GEOS-Chem : 00-01:06:59.000 4019.000
Initialization : 00-00:00:48.953 48.953
HEMCO : 00-00:08:04.515 484.516
All chemistry : 00-00:32:04.015 1924.016
=> FlexChem : 00-00:23:13.593 1393.594
-> FlexChem loop : 00-00:22:42.015 1362.016
-> Init KPP : 00-00:00:07.282 7.283
-> Het chem rates : 00-00:01:40.944 100.944
-> Photolysis rates : 00-00:00:10.691 10.691
-> KPP : 00-00:16:31.617 991.618
RCONST : 00-00:02:15.709 135.710 <-- Update rxn rates
Integrate 1 : 00-00:13:39.219 819.219 <-- Call Rosenbrock solver
Integrate 2 : >>>>> THE TIMER DID NOT RUN <<<<<
-> Prod/loss diags : 00-00:00:07.559 7.559
-> OH reactivity diag : >>>>> THE TIMER DID NOT RUN <<<<<
=> FAST-JX photolysis : 00-00:01:59.093 119.094
=> Aerosol chem : 00-00:06:20.843 380.844
=> Linearized strat chem : 00-00:00:05.609 5.609
Transport : 00-00:06:23.078 383.078
Convection : 00-00:03:59.750 239.750
Boundary layer mixing : 00-00:06:21.828 381.828
Dry deposition : 00-00:00:11.671 11.672
Wet deposition : 00-00:02:09.437 129.438
All diagnostics : 00-00:06:07.656 367.656
=> HEMCO diagnostics : 00-00:00:00.031 0.031
=> ObsPack diagnostics : >>>>> THE TIMER DID NOT RUN <<<<<
=> History (netCDF diags) : 00-00:06:07.625 367.625
Unit conversions : 00-00:03:54.375 234.375
Input : 00-00:00:31.187 31.188
Output : 00-00:06:07.484 367.484
Finalization : 00-00:00:07.703 7.703
SLURM JobID # : 17043612
Job Name : geoschem.profile.run
Submit time : 2021-02-11 17:22:55
Start time : 2021-02-11 17:23:43
End time : 2021-02-11 18:27:53
Partition : huce_casc+
Node : holy7c20612
CPUs : 24
Memory : 0.0346 GB
CPU Time : 23:16:13 ( 83773 s)
Wall Time : 01:04:10 ( 3850 s)
CPU Time / Wall Time : 21.7592 ( 90.66% ideal)
===============================================================================
G E O S - C H E M T I M E R S
Timer name DD-hh:mm:ss.SSS Total Seconds
-------------------------------------------------------------------------------
GEOS-Chem : 00-01:02:23.320 3743.320
Initialization : 00-00:00:58.148 58.148
HEMCO : 00-00:07:55.570 475.570
All chemistry : 00-00:30:04.664 1804.664
=> FlexChem : 00-00:22:08.570 1328.570
-> FlexChem loop : 00-00:21:40.867 1300.867
-> Init KPP : 00-00:00:12.893 12.893
-> Het chem rates : 00-00:01:34.672 94.673
-> Photolysis rates : 00-00:00:09.426 9.426
-> KPP : 00-00:15:46.220 946.220
RCONST : 00-00:01:15.690 75.691 <--- Update rxn rates
Integrate 1 : 00-00:13:57.386 837.387 <--- Call Rosenbrock solver
Integrate 2 : >>>>> THE TIMER DID NOT RUN <<<<<
-> Prod/loss diags : 00-00:00:07.652 7.652
-> OH reactivity diag : >>>>> THE TIMER DID NOT RUN <<<<<
=> FAST-JX photolysis : 00-00:01:57.351 117.352
=> Aerosol chem : 00-00:05:31.062 331.062
=> Linearized strat chem : 00-00:00:05.460 5.461
Transport : 00-00:06:03.039 363.039
Convection : 00-00:03:40.093 220.094
Boundary layer mixing : 00-00:05:23.539 323.539
Dry deposition : 00-00:00:11.656 11.656
Wet deposition : 00-00:02:02.140 122.141
All diagnostics : 00-00:05:34.148 334.148
=> HEMCO diagnostics : >>>>> THE TIMER DID NOT RUN <<<<<
=> ObsPack diagnostics : >>>>> THE TIMER DID NOT RUN <<<<<
=> History (netCDF diags) : 00-00:05:34.132 334.133
Unit conversions : 00-00:03:22.804 202.805
Input : 00-00:00:28.593 28.594
Output : 00-00:05:34.054 334.055
The updates that were made to remove the useless computations in the KPP rate law library take about half as long as before (75 seconds vs. 135 seconds). But the overall time spent in the Rosenbrock solver does not change very much. This results in ~4.5 minutes of wall clock time savings for a 7-day run. This would be almost 20 minutes of time savings for a 1-month (31-day) benchmark simulation.
This feature request was addressed by https://github.com/geoschem/geos-chem/pull/663, which has been merged into the dev
branch. This update will be included in 13.1.0.
The problem
This topic was introduced in discussion thread #567. Long story short, the existing KPP equations and rate-law libraries use
REAL*4
constants (e.g. 1.0 or 1.0E+00) that are cast to double precision. Not only does this involve a loss of precision, but this involves a lot of useless CPU cycles.For example, this equation (in
fullchem.eqn
) calls function GCARR (ingckpp.kpp
) to compute the reaction rate according tot he Arrhenius law:where function GCARR looks like this:
Problems with this approach:
REAL
, which on most compilers default toREAL*4
unless you specify otherwise. Those are casted up to double-precision with theDBLE
function, which incurs extra CPU cycles that are not needed.(300/TEMP)**b0
term will evaluate to 1. Even so, the code must do a division and then a raising to a power, which are among the more expensive operations in terms of CPU cycles.The solution
We have rewritten the rate law functions so that
REAL*8
) constants in the equation definition infullchem.eqn
, which eliminates the need for explicit castingGCARR_ab
when C0=0 orGCARR_ac
when B0=0.So we can now write the reaction rate like this:
where:
We have also renamed the function to denote which parameters are being passed:
_ac
means that it accept the Arrhenius A0 and C0 terms, and B0 is zero.