Open Shixionghu opened 2 years ago
I will probably get the same debug info as in my formal case from my current run with ice_ic=' '... Next, try running a month with CICE5+Mushy... Also try increasing dt_count, although I doubt its effectiveness...
After setting ice_ic = '', the case has been running for hours without aborting...
Same for the CICE5+Mushy plan: the case has been running for hours with no relevant debug info... But the previous error has indeed disappeared.
Now I am trying to run the case with debug=false, because the case is running way too slowly.
It works. I do not know why, but after setting debug=false, the case could run for a month.
Below is the output from the timing folder:
/glade/work/shixiongh/cases/kpg_waccm_test_v1/timing/cesm_timing.kpg_waccm_test_v1.6927424.chadmin1.ib0.cheyenne.ucar.edu.221021-194802
---------------- TIMING PROFILE ---------------------
Case : kpg_waccm_test_v1
LID : 6927424.chadmin1.ib0.cheyenne.ucar.edu.221021-194802
Machine : cheyenne
Caseroot : /glade/work/shixiongh/cases/kpg_waccm_test_v1
Timeroot : /glade/work/shixiongh/cases/kpg_waccm_test_v1/Tools
User : shixiongh
Curr Date : Fri Oct 21 20:27:41 2022
grid : a%1.9x2.5_l%1.9x2.5_oi%gx1v6_r%r05_g%null_w%null_m%gx1v6
compset : 1850_CAM60%WCCM_CLM40%CN_CICE_POP2_RTM_SGLC_SWAV
run_type : hybrid, continue_run = FALSE (inittype = TRUE)
stop_option : nmonths, stop_n = 1
run_length : 31 days (30 for ocean)
component comp_pes root_pe tasks x threads instances (stride)
--------- ------ ------- ------ ------ --------- ------
cpl = cpl 288 0 288 x 1 1 (1 )
atm = cam 288 0 288 x 1 1 (1 )
lnd = clm 144 0 144 x 1 1 (1 )
ice = cice 108 144 108 x 1 1 (1 )
ocn = pop 288 288 288 x 1 1 (1 )
rof = rtm 40 0 40 x 1 1 (1 )
glc = sglc 36 0 36 x 1 1 (1 )
wav = swav 36 252 36 x 1 1 (1 )
esp = sesp 1 0 1 x 1 1 (1 )
total pes active : 576
mpi tasks per node : 36
pe count for cost estimate : 576
Overall Metrics:
Model Cost: 4429.66 pe-hrs/simulated_year
Model Throughput: 3.12 simulated_years/day
Init Time : 20.580 seconds
Run Time : 2351.362 seconds 75.850 seconds/day
Final Time : 0.035 seconds
Actual Ocn Init Wait Time : 2008.531 seconds
Estimated Ocn Init Run Time : 6.214 seconds
Estimated Run Time Correction : 0.000 seconds
(This correction has been applied to the ocean and total run times)
Runs Time in total seconds, seconds/model-day, and model-years/wall-day
CPL Run Time represents time in CPL pes alone, not including time associated with data exchange with other components
TOT Run Time: 2351.362 seconds 75.850 seconds/mday 3.12 myears/wday
CPL Run Time: 15.079 seconds 0.486 seconds/mday 486.64 myears/wday
CPL COMM Time: 71.951 seconds 2.321 seconds/mday 101.99 myears/wday
ATM Run Time: 2285.402 seconds 73.723 seconds/mday 3.21 myears/wday
CPL COMM Time: 71.951 seconds 2.321 seconds/mday 101.99 myears/wday
LND Run Time: 9.381 seconds 0.303 seconds/mday 782.23 myears/wday
CPL COMM Time: 71.951 seconds 2.321 seconds/mday 101.99 myears/wday
ICE Run Time: 48.730 seconds 1.572 seconds/mday 150.59 myears/wday
CPL COMM Time: 71.951 seconds 2.321 seconds/mday 101.99 myears/wday
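As a rough check, the Overall Metrics in this profile can be reproduced from the run time, the run length, and the pe count. This is a minimal sketch, assuming a 365-day model year; the function names are illustrative and not part of CESM.

```python
SECONDS_PER_WALL_DAY = 86_400
DAYS_PER_MODEL_YEAR = 365  # assumption: no-leap calendar


def seconds_per_model_day(run_time_s: float, run_days: int) -> float:
    """Wall-clock seconds spent per simulated day."""
    return run_time_s / run_days


def throughput_years_per_day(sec_per_mday: float) -> float:
    """Simulated years completed per wall-clock day."""
    return SECONDS_PER_WALL_DAY / (sec_per_mday * DAYS_PER_MODEL_YEAR)


def cost_pe_hours_per_year(sec_per_mday: float, pe_count: int) -> float:
    """pe-hours consumed per simulated year."""
    return pe_count * sec_per_mday * DAYS_PER_MODEL_YEAR / 3600.0


# Values copied from the one-month timing profile above.
sec_per_mday = seconds_per_model_day(run_time_s=2351.362, run_days=31)
throughput = throughput_years_per_day(sec_per_mday)
cost = cost_pe_hours_per_year(sec_per_mday, pe_count=576)

print(f"{sec_per_mday:.3f} seconds/day")
print(f"{throughput:.2f} simulated_years/day")
print(f"{cost:.2f} pe-hrs/simulated_year")
```

The same functions can be applied to the later profiles by swapping in their run time and run length.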
Now I am starting over and trying to run a year to see whether it will be stable...
Success. I was able to run for a year. Below is the output from the timing folder:
---------------- TIMING PROFILE ---------------------
Case : kpg_waccm_test_v1
LID : 6932893.chadmin1.ib0.cheyenne.ucar.edu.221022-063802
Machine : cheyenne
Caseroot : /glade/work/shixiongh/cases/kpg_waccm_test_v1
Timeroot : /glade/work/shixiongh/cases/kpg_waccm_test_v1/Tools
User : shixiongh
Curr Date : Sat Oct 22 14:36:41 2022
grid : a%1.9x2.5_l%1.9x2.5_oi%gx1v6_r%r05_g%null_w%null_m%gx1v6
compset : 1850_CAM60%WCCM_CLM40%CN_CICE_POP2_RTM_SGLC_SWAV
run_type : hybrid, continue_run = FALSE (inittype = TRUE)
stop_option : nyears, stop_n = 1
run_length : 365 days (364 for ocean)
component comp_pes root_pe tasks x threads instances (stride)
--------- ------ ------- ------ ------ --------- ------
cpl = cpl 288 0 288 x 1 1 (1 )
atm = cam 288 0 288 x 1 1 (1 )
lnd = clm 144 0 144 x 1 1 (1 )
ice = cice 108 144 108 x 1 1 (1 )
ocn = pop 288 288 288 x 1 1 (1 )
rof = rtm 40 0 40 x 1 1 (1 )
glc = sglc 36 0 36 x 1 1 (1 )
wav = swav 36 252 36 x 1 1 (1 )
esp = sesp 1 0 1 x 1 1 (1 )
total pes active : 576
mpi tasks per node : 36
pe count for cost estimate : 576
Overall Metrics:
Model Cost: 4590.84 pe-hrs/simulated_year
Model Throughput: 3.01 simulated_years/day
Init Time : 19.886 seconds
Run Time : 28692.781 seconds 78.610 seconds/day
Final Time : 0.005 seconds
Actual Ocn Init Wait Time : 26293.683 seconds
Estimated Ocn Init Run Time : 6.160 seconds
Estimated Run Time Correction : 0.000 seconds
(This correction has been applied to the ocean and total run times)
Runs Time in total seconds, seconds/model-day, and model-years/wall-day
CPL Run Time represents time in CPL pes alone, not including time associated with data exchange with other components
TOT Run Time: 28692.781 seconds 78.610 seconds/mday 3.01 myears/wday
CPL Run Time: 208.631 seconds 0.572 seconds/mday 414.13 myears/wday
CPL COMM Time: 845.760 seconds 2.317 seconds/mday 102.16 myears/wday
ATM Run Time: 27857.801 seconds 76.323 seconds/mday 3.10 myears/wday
CPL COMM Time: 845.760 seconds 2.317 seconds/mday 102.16 myears/wday
LND Run Time: 103.731 seconds 0.284 seconds/mday 832.92 myears/wday
CPL COMM Time: 845.760 seconds 2.317 seconds/mday 102.16 myears/wday
ICE Run Time: 628.410 seconds 1.722 seconds/mday 137.49 myears/wday
CPL COMM Time: 845.760 seconds 2.317 seconds/mday 102.16 myears/wday
OCN Run Time: 2248.302 seconds 6.160 seconds/mday 38.43 myears/wday
CPL COMM Time: 845.760 seconds 2.317 seconds/mday 102.16 myears/wday
ROF Run Time: 37.962 seconds 0.104 seconds/mday 2275.96 myears/wday
CPL COMM Time: 845.760 seconds 2.317 seconds/mday 102.16 myears/wday
GLC Run Time: 0.000 seconds 0.000 seconds/mday 0.00 myears/wday
CPL COMM Time: 845.760 seconds 2.317 seconds/mday 102.16 myears/wday
WAV Run Time: 0.000 seconds 0.000 seconds/mday 0.00 myears/wday
CPL COMM Time: 845.760 seconds 2.317 seconds/mday 102.16 myears/wday
ESP Run Time: 0.000 seconds 0.000 seconds/mday 0.00 myears/wday
CPL COMM Time: 845.760 seconds 2.317 seconds/mday 102.16 myears/wday
---------------- DRIVER TIMING FLOWCHART ---------------------
NOTE: min:max driver timers (seconds/day):
CPL (pes 0 to 287)
OCN (pes 288 to 575)
LND (pes 0 to 143)
ROF (pes 0 to 39)
ICE (pes 144 to 251)
ATM (pes 0 to 287)
GLC (pes 0 to 35)
WAV (pes 252 to 287)
CPL:CLOCK_ADVANCE 0.004: 0.005
CPL:OCNPRE1 0.027: 0.198
CPL:ATMOCN1 0.022: 0.050
CPL:OCNPREP 0.000: 0.000
CPL:C2O <----> 0.000: 0.001
CPL:LNDPREP 0.001: 0.012
CPL:C2L <----> 0.007: 0.220
CPL:ICEPREP 0.010: 0.021
CPL:C2I <----> 0.003: 0.079
CPL:ROFPREP 0.005: 0.022
CPL:C2R <----> 0.001: 0.012
CPL:ICE_RUN 1.592: 1.722
CPL:LND_RUN 0.257: 0.284
CPL:ROF_RUN 0.074: 0.104
CPL:L2C 0.994: 141.717
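To see which component dominates the wall-clock time in the one-year profile above, the "Run Time" lines can be ranked by their share of the total. This is a small sketch that parses lines copied from that profile; the helper names are illustrative. The shares do not sum to 100% because components on different pes (for example OCN on pes 288 to 575) run concurrently with the others.

```python
import re

# "Run Time" lines copied from the one-year timing profile above.
PROFILE = """\
TOT Run Time: 28692.781 seconds 78.610 seconds/mday 3.01 myears/wday
CPL Run Time: 208.631 seconds 0.572 seconds/mday 414.13 myears/wday
ATM Run Time: 27857.801 seconds 76.323 seconds/mday 3.10 myears/wday
LND Run Time: 103.731 seconds 0.284 seconds/mday 832.92 myears/wday
ICE Run Time: 628.410 seconds 1.722 seconds/mday 137.49 myears/wday
OCN Run Time: 2248.302 seconds 6.160 seconds/mday 38.43 myears/wday
ROF Run Time: 37.962 seconds 0.104 seconds/mday 2275.96 myears/wday
"""

LINE_RE = re.compile(r"^(\w+) Run Time:\s+([\d.]+) seconds", re.MULTILINE)


def run_times(text: str) -> dict[str, float]:
    """Map component name (TOT, ATM, ...) to its run time in seconds."""
    return {name: float(sec) for name, sec in LINE_RE.findall(text)}


def shares_of_total(times: dict[str, float]) -> list[tuple[str, float]]:
    """Each component's run time as a fraction of TOT, largest first."""
    total = times["TOT"]
    shares = {k: v / total for k, v in times.items() if k != "TOT"}
    return sorted(shares.items(), key=lambda kv: kv[1], reverse=True)


ranked = shares_of_total(run_times(PROFILE))
for name, share in ranked:
    print(f"{name}: {share:6.1%}")
```

In this profile ATM accounts for nearly all of the total run time, which matches the large "Actual Ocn Init Wait Time" reported above.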
After deleting the comment on the tropopause_clim file, the case could run for a month.
Case directory: /glade/work/shixiongh/cases/kpg_waccm_test_clone/
Corresponding timing info:
---------------- TIMING PROFILE ---------------------
Case : kpg_waccm_test_clone
LID : 6944746.chadmin1.ib0.cheyenne.ucar.edu.221023-145803
Machine : cheyenne
Caseroot : /glade/work/shixiongh/cases/kpg_waccm_test_clone
Timeroot : /glade/work/shixiongh/cases/kpg_waccm_test_clone/Tools
User : shixiongh
Curr Date : Sun Oct 23 15:39:10 2022
grid : a%1.9x2.5_l%1.9x2.5_oi%gx1v6_r%r05_g%null_w%null_m%gx1v6
compset : 1850_CAM60%WCCM_CLM40%CN_CICE_POP2_RTM_SGLC_SWAV
run_type : hybrid, continue_run = FALSE (inittype = TRUE)
stop_option : nmonths, stop_n = 1
run_length : 31 days (30 for ocean)
component comp_pes root_pe tasks x threads instances (stride)
--------- ------ ------- ------ ------ --------- ------
cpl = cpl 288 0 288 x 1 1 (1 )
atm = cam 288 0 288 x 1 1 (1 )
lnd = clm 144 0 144 x 1 1 (1 )
ice = cice 108 144 108 x 1 1 (1 )
ocn = pop 288 288 288 x 1 1 (1 )
rof = rtm 40 0 40 x 1 1 (1 )
glc = sglc 36 0 36 x 1 1 (1 )
wav = swav 36 252 36 x 1 1 (1 )
esp = sesp 1 0 1 x 1 1 (1 )
total pes active : 576
mpi tasks per node : 36
pe count for cost estimate : 576
Overall Metrics:
Model Cost: 4596.08 pe-hrs/simulated_year
Model Throughput: 3.01 simulated_years/day
Init Time : 19.989 seconds
Run Time : 2439.700 seconds 78.700 seconds/day
Final Time : 0.006 seconds
Actual Ocn Init Wait Time : 2091.415 seconds
Estimated Ocn Init Run Time : 6.219 seconds
Estimated Run Time Correction : 0.000 seconds
(This correction has been applied to the ocean and total run times)
Runs Time in total seconds, seconds/model-day, and model-years/wall-day
CPL Run Time represents time in CPL pes alone, not including time associated with data exchange with other components
TOT Run Time: 2439.700 seconds 78.700 seconds/mday 3.01 myears/wday
CPL Run Time: 19.236 seconds 0.621 seconds/mday 381.48 myears/wday
CPL COMM Time: 79.954 seconds 2.579 seconds/mday 91.78 myears/wday
ATM Run Time: 2372.105 seconds 76.520 seconds/mday 3.09 myears/wday
CPL COMM Time: 79.954 seconds 2.579 seconds/mday 91.78 myears/wday
LND Run Time: 9.502 seconds 0.307 seconds/mday 772.27 myears/wday
CPL COMM Time: 79.954 seconds 2.579 seconds/mday 91.78 myears/wday
ICE Run Time: 49.826 seconds 1.607 seconds/mday 147.27 myears/wday
CPL COMM Time: 79.954 seconds 2.579 seconds/mday 91.78 myears/wday
OCN Run Time: 192.791 seconds 6.219 seconds/mday 38.06 myears/wday
CPL COMM Time: 79.954 seconds 2.579 seconds/mday 91.78 myears/wday
ROF Run Time: 3.373 seconds 0.109 seconds/mday 2175.54 myears/wday
CPL COMM Time: 79.954 seconds 2.579 seconds/mday 91.78 myears/wday
GLC Run Time: 0.000 seconds 0.000 seconds/mday 0.00 myears/wday
CPL COMM Time: 79.954 seconds 2.579 seconds/mday 91.78 myears/wday
WAV Run Time: 0.000 seconds 0.000 seconds/mday 0.00 myears/wday
CPL COMM Time: 79.954 seconds 2.579 seconds/mday 91.78 myears/wday
ESP Run Time: 0.000 seconds 0.000 seconds/mday 0.00 myears/wday
CPL COMM Time: 79.954 seconds 2.579 seconds/mday 91.78 myears/wday
---------------- DRIVER TIMING FLOWCHART ---------------------
NOTE: min:max driver timers (seconds/day):
CPL (pes 0 to 287)
OCN (pes 288 to 575)
LND (pes 0 to 143)
ROF (pes 0 to 39)
ICE (pes 144 to 251)
ATM (pes 0 to 287)
GLC (pes 0 to 35)
WAV (pes 252 to 287)
CPL:CLOCK_ADVANCE 0.004: 0.005
CPL:OCNPRE1 0.027: 0.285
CPL:ATMOCN1 0.022: 0.050
CPL:OCNPREP 0.000: 0.000
CPL:C2O <----> 0.000: 0.001
CPL:LNDPREP 0.001: 0.010
CPL:C2L <----> 0.006: 0.295
CPL:ICEPREP 0.010: 0.021
CPL:C2I <----> 0.003: 0.086
CPL:ROFPREP 0.005: 0.028
CPL:C2R <----> 0.002: 0.011
CPL:ICE_RUN 1.556: 1.607
CPL:LND_RUN 0.271: 0.307
CPL:ROF_RUN 0.075: 0.109
CPL:L2C 0.105: 12.851
While running the case to test K-Pg boundary conditions, we had issues with the ice component:
ERROR: ice: Vertical thermo error
One potential fix is to switch to mushy for calculating the freezing point of salt water: change TFREEZE_SALTWATER_OPTION to mushy. Note that you probably need to use CICE5, since this option is not provided in CICE4. Another way is to increase dt_count (see more in the ice_in file). You could also try setting ice_ic = ''.
Still testing... Wait for updates to see how it goes. I am running for a month to see how it works...
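For the ice_ic change, one way to apply it reproducibly is a small script that edits the case's namelist override file. This is only a sketch: it assumes the override lives in a user_nl_cice file in the case directory, the helper set_namelist_value is hypothetical, and the demo writes to a temporary directory rather than a real case.

```python
import re
import tempfile
from pathlib import Path


def set_namelist_value(nl_file: Path, key: str, value: str) -> None:
    """Set `key = value` in a user_nl-style file.

    Replaces the first existing assignment of `key`, or appends a new line.
    """
    lines = nl_file.read_text().splitlines() if nl_file.exists() else []
    pattern = re.compile(rf"^\s*{re.escape(key)}\s*=")
    new_line = f"{key} = {value}"
    for i, line in enumerate(lines):
        if pattern.match(line):
            lines[i] = new_line
            break
    else:
        # No existing assignment was found, so append one.
        lines.append(new_line)
    nl_file.write_text("\n".join(lines) + "\n")


# Demo in a temporary directory standing in for the case directory.
with tempfile.TemporaryDirectory() as case_dir:
    nl = Path(case_dir) / "user_nl_cice"  # assumed file name
    set_namelist_value(nl, "ice_ic", "''")
    demo_contents = nl.read_text()
    print(demo_contents, end="")
```

In a real case the path would point at the case directory instead, and the namelists would need to be regenerated before resubmitting.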