Shixionghu / K-pg_Setup

K-Pg boundary condition setup

ERROR: Ice Thermo Error #1

Open Shixionghu opened 1 year ago

Shixionghu commented 1 year ago

While running the case to test the K-Pg boundary conditions, we hit a problem in the ice component: ERROR: ice: Vertical thermo error

One potential fix is to switch to the mushy option for calculating the freezing point of salt water: change TFREEZE_SALTWATER_OPTION to mushy. Note that you probably need to use CICE5, since this option is not provided in CICE4.

Another option is to increase dt_count (see the ice_in file for more). You could also try setting ice_ic = ''.
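
For reference, here is a minimal Python sketch of how the two settings above might be written out. It only builds the strings and does not touch a case. It assumes that TFREEZE_SALTWATER_OPTION is changed with CIME's `./xmlchange` from the case root and that `ice_ic` goes into `user_nl_cice`; the thread does not say where either one was set, and the helper names `xmlchange_command` and `namelist_line` are made up for illustration.

```python
def xmlchange_command(name, value):
    """Build the argv list for ./xmlchange without running it (assumed usage)."""
    return ["./xmlchange", f"{name}={value}"]


def namelist_line(name, value):
    """Format one Fortran namelist assignment, quoting string values."""
    if isinstance(value, str):
        return f"{name} = '{value}'"
    return f"{name} = {value}"


# Option 1: mushy freezing-point calculation (assumed to be an XML variable).
print(" ".join(xmlchange_command("TFREEZE_SALTWATER_OPTION", "mushy")))

# Option 2: empty ice initial condition (assumed to go in user_nl_cice).
print(namelist_line("ice_ic", ""))
```

Printing the commands first makes it easy to check them before applying anything to the real case directory.
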

Still testing. I am running for a month to see how it goes; updates to follow.

Shixionghu commented 1 year ago

In my current run with ice_ic=' ', I will probably get the same debug info as in my formal case. I will try to run a month with CICE5+Mushy, and also try increasing dt_count, although I doubt its effectiveness.

Shixionghu commented 1 year ago

After setting ice_ic = '', the case has been running for hours without aborting.
The same goes for the CICE5+Mushy plan: the case has been running for hours with no relevant debug info, and the previous error has indeed disappeared.
Now I am trying to run the case with debug=false, since it is running far too slowly.

Shixionghu commented 1 year ago

It works. I do not know why, but after setting debug=false the case could run for a month.

Below is the output from the timing folder: /glade/work/shixiongh/cases/kpg_waccm_test_v1/timing/cesm_timing.kpg_waccm_test_v1.6927424.chadmin1.ib0.cheyenne.ucar.edu.221021-194802

---------------- TIMING PROFILE ---------------------
  Case        : kpg_waccm_test_v1
  LID         : 6927424.chadmin1.ib0.cheyenne.ucar.edu.221021-194802
  Machine     : cheyenne
  Caseroot    : /glade/work/shixiongh/cases/kpg_waccm_test_v1
  Timeroot    : /glade/work/shixiongh/cases/kpg_waccm_test_v1/Tools
  User        : shixiongh
  Curr Date   : Fri Oct 21 20:27:41 2022
  grid        : a%1.9x2.5_l%1.9x2.5_oi%gx1v6_r%r05_g%null_w%null_m%gx1v6
  compset     : 1850_CAM60%WCCM_CLM40%CN_CICE_POP2_RTM_SGLC_SWAV
  run_type    : hybrid, continue_run = FALSE (inittype = TRUE)
  stop_option : nmonths, stop_n = 1
  run_length  : 31 days (30 for ocean)

  component       comp_pes    root_pe   tasks  x threads instances (stride)
  ---------        ------     -------   ------   ------  ---------  ------
  cpl = cpl        288         0        288    x 1       1      (1     )
  atm = cam        288         0        288    x 1       1      (1     )
  lnd = clm        144         0        144    x 1       1      (1     )
  ice = cice       108         144      108    x 1       1      (1     )
  ocn = pop        288         288      288    x 1       1      (1     )
  rof = rtm        40          0        40     x 1       1      (1     )
  glc = sglc       36          0        36     x 1       1      (1     )
  wav = swav       36          252      36     x 1       1      (1     )
  esp = sesp       1           0        1      x 1       1      (1     )

  total pes active           : 576
  mpi tasks per node               : 36
  pe count for cost estimate : 576

  Overall Metrics:
    Model Cost:            4429.66   pe-hrs/simulated_year
    Model Throughput:         3.12   simulated_years/day

    Init Time   :      20.580 seconds
    Run Time    :    2351.362 seconds       75.850 seconds/day
    Final Time  :       0.035 seconds

    Actual Ocn Init Wait Time     :    2008.531 seconds
    Estimated Ocn Init Run Time   :       6.214 seconds
    Estimated Run Time Correction :       0.000 seconds
      (This correction has been applied to the ocean and total run times)

Runs Time in total seconds, seconds/model-day, and model-years/wall-day
CPL Run Time represents time in CPL pes alone, not including time associated with data exchange with other components

    TOT Run Time:    2351.362 seconds       75.850 seconds/mday         3.12 myears/wday
    CPL Run Time:      15.079 seconds        0.486 seconds/mday       486.64 myears/wday
    CPL COMM Time:     71.951 seconds        2.321 seconds/mday       101.99 myears/wday
    ATM Run Time:    2285.402 seconds       73.723 seconds/mday         3.21 myears/wday
    CPL COMM Time:     71.951 seconds        2.321 seconds/mday       101.99 myears/wday
    LND Run Time:       9.381 seconds        0.303 seconds/mday       782.23 myears/wday
    CPL COMM Time:     71.951 seconds        2.321 seconds/mday       101.99 myears/wday
    ICE Run Time:      48.730 seconds        1.572 seconds/mday       150.59 myears/wday
    CPL COMM Time:     71.951 seconds        2.321 seconds/mday       101.99 myears/wday
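
The Overall Metrics in the profile above can be reproduced from the run time, the run length, and the pe count. The sketch below is a rough check, not the timing script itself: it assumes a 365-day model year and that the cost is simply pes times wall-clock hours per simulated year, which appears consistent with the reported numbers.

```python
SECONDS_PER_HOUR = 3600.0
SECONDS_PER_DAY = 86400.0
DAYS_PER_YEAR = 365  # assumed no-leap model calendar


def model_cost(pe_count, seconds_per_model_day):
    """pe-hours needed per simulated year."""
    return pe_count * seconds_per_model_day * DAYS_PER_YEAR / SECONDS_PER_HOUR


def model_throughput(seconds_per_model_day):
    """Simulated years completed per wall-clock day."""
    return SECONDS_PER_DAY / (seconds_per_model_day * DAYS_PER_YEAR)


# Values from the one-month profile: 2351.362 s run time over 31 days, 576 pes.
seconds_per_model_day = 2351.362 / 31
cost = model_cost(576, seconds_per_model_day)
throughput = model_throughput(seconds_per_model_day)

print(f"{cost:.2f} pe-hrs/simulated_year")
print(f"{throughput:.2f} simulated_years/day")
```

With these inputs the results land on the reported 4429.66 pe-hrs/simulated_year and 3.12 simulated_years/day.
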

Shixionghu commented 1 year ago

Now I am starting over and trying to run a year to see whether it will be stable.

Shixionghu commented 1 year ago

Success. I was able to run for a year. Below is the output from the timing folder:

---------------- TIMING PROFILE ---------------------
  Case        : kpg_waccm_test_v1
  LID         : 6932893.chadmin1.ib0.cheyenne.ucar.edu.221022-063802
  Machine     : cheyenne
  Caseroot    : /glade/work/shixiongh/cases/kpg_waccm_test_v1
  Timeroot    : /glade/work/shixiongh/cases/kpg_waccm_test_v1/Tools
  User        : shixiongh
  Curr Date   : Sat Oct 22 14:36:41 2022
  grid        : a%1.9x2.5_l%1.9x2.5_oi%gx1v6_r%r05_g%null_w%null_m%gx1v6
  compset     : 1850_CAM60%WCCM_CLM40%CN_CICE_POP2_RTM_SGLC_SWAV
  run_type    : hybrid, continue_run = FALSE (inittype = TRUE)
  stop_option : nyears, stop_n = 1
  run_length  : 365 days (364 for ocean)

  component       comp_pes    root_pe   tasks  x threads instances (stride)
  ---------        ------     -------   ------   ------  ---------  ------
  cpl = cpl        288         0        288    x 1       1      (1     )
  atm = cam        288         0        288    x 1       1      (1     )
  lnd = clm        144         0        144    x 1       1      (1     )
  ice = cice       108         144      108    x 1       1      (1     )
  ocn = pop        288         288      288    x 1       1      (1     )
  rof = rtm        40          0        40     x 1       1      (1     )
  glc = sglc       36          0        36     x 1       1      (1     )
  wav = swav       36          252      36     x 1       1      (1     )
  esp = sesp       1           0        1      x 1       1      (1     )

  total pes active           : 576
  mpi tasks per node               : 36
  pe count for cost estimate : 576

  Overall Metrics:
    Model Cost:            4590.84   pe-hrs/simulated_year
    Model Throughput:         3.01   simulated_years/day

    Init Time   :      19.886 seconds
    Run Time    :   28692.781 seconds       78.610 seconds/day
    Final Time  :       0.005 seconds

    Actual Ocn Init Wait Time     :   26293.683 seconds
    Estimated Ocn Init Run Time   :       6.160 seconds
    Estimated Run Time Correction :       0.000 seconds
      (This correction has been applied to the ocean and total run times)

Runs Time in total seconds, seconds/model-day, and model-years/wall-day
CPL Run Time represents time in CPL pes alone, not including time associated with data exchange with other components

    TOT Run Time:   28692.781 seconds       78.610 seconds/mday         3.01 myears/wday
    CPL Run Time:     208.631 seconds        0.572 seconds/mday       414.13 myears/wday
    CPL COMM Time:    845.760 seconds        2.317 seconds/mday       102.16 myears/wday
    ATM Run Time:   27857.801 seconds       76.323 seconds/mday         3.10 myears/wday
    CPL COMM Time:    845.760 seconds        2.317 seconds/mday       102.16 myears/wday
    LND Run Time:     103.731 seconds        0.284 seconds/mday       832.92 myears/wday
    CPL COMM Time:    845.760 seconds        2.317 seconds/mday       102.16 myears/wday
    ICE Run Time:     628.410 seconds        1.722 seconds/mday       137.49 myears/wday
    CPL COMM Time:    845.760 seconds        2.317 seconds/mday       102.16 myears/wday
    OCN Run Time:    2248.302 seconds        6.160 seconds/mday        38.43 myears/wday
    CPL COMM Time:    845.760 seconds        2.317 seconds/mday       102.16 myears/wday
    ROF Run Time:      37.962 seconds        0.104 seconds/mday      2275.96 myears/wday
    CPL COMM Time:    845.760 seconds        2.317 seconds/mday       102.16 myears/wday
    GLC Run Time:       0.000 seconds        0.000 seconds/mday         0.00 myears/wday
    CPL COMM Time:    845.760 seconds        2.317 seconds/mday       102.16 myears/wday
    WAV Run Time:       0.000 seconds        0.000 seconds/mday         0.00 myears/wday
    CPL COMM Time:    845.760 seconds        2.317 seconds/mday       102.16 myears/wday
    ESP Run Time:       0.000 seconds        0.000 seconds/mday         0.00 myears/wday
    CPL COMM Time:    845.760 seconds        2.317 seconds/mday       102.16 myears/wday

---------------- DRIVER TIMING FLOWCHART ---------------------

   NOTE: min:max driver timers (seconds/day):
                            CPL (pes 0 to 287)
                                                                                       OCN (pes 288 to 575)
                                                LND (pes 0 to 143)
                                                ROF (pes 0 to 39)
                                                                   ICE (pes 144 to 251)
                                                ATM (pes 0 to 287)
                                                GLC (pes 0 to 35)
                                                                                  WAV (pes 252 to 287)

  CPL:CLOCK_ADVANCE           0.004:   0.005
  CPL:OCNPRE1                 0.027:   0.198
  CPL:ATMOCN1                 0.022:   0.050
  CPL:OCNPREP                 0.000:   0.000
  CPL:C2O                        <---->                                                  0.000:   0.001
  CPL:LNDPREP                 0.001:   0.012
  CPL:C2L                        <---->           0.007:   0.220
  CPL:ICEPREP                 0.010:   0.021
  CPL:C2I                        <---->                              0.003:   0.079
  CPL:ROFPREP                 0.005:   0.022
  CPL:C2R                        <---->           0.001:   0.012
  CPL:ICE_RUN                                                        1.592:   1.722
  CPL:LND_RUN                                     0.257:   0.284
  CPL:ROF_RUN                                     0.074:   0.104
  CPL:L2C                                         0.994: 141.717

Shixionghu commented 1 year ago

After deleting the comment on the tropopause_clim file, the case could run for a month.
Case directory: /glade/work/shixiongh/cases/kpg_waccm_test_clone/

Shixionghu commented 1 year ago

Corresponding timing info:

---------------- TIMING PROFILE ---------------------
  Case        : kpg_waccm_test_clone
  LID         : 6944746.chadmin1.ib0.cheyenne.ucar.edu.221023-145803
  Machine     : cheyenne
  Caseroot    : /glade/work/shixiongh/cases/kpg_waccm_test_clone
  Timeroot    : /glade/work/shixiongh/cases/kpg_waccm_test_clone/Tools
  User        : shixiongh
  Curr Date   : Sun Oct 23 15:39:10 2022
  grid        : a%1.9x2.5_l%1.9x2.5_oi%gx1v6_r%r05_g%null_w%null_m%gx1v6
  compset     : 1850_CAM60%WCCM_CLM40%CN_CICE_POP2_RTM_SGLC_SWAV
  run_type    : hybrid, continue_run = FALSE (inittype = TRUE)
  stop_option : nmonths, stop_n = 1
  run_length  : 31 days (30 for ocean)

  component       comp_pes    root_pe   tasks  x threads instances (stride)
  ---------        ------     -------   ------   ------  ---------  ------
  cpl = cpl        288         0        288    x 1       1      (1     )
  atm = cam        288         0        288    x 1       1      (1     )
  lnd = clm        144         0        144    x 1       1      (1     )
  ice = cice       108         144      108    x 1       1      (1     )
  ocn = pop        288         288      288    x 1       1      (1     )
  rof = rtm        40          0        40     x 1       1      (1     )
  glc = sglc       36          0        36     x 1       1      (1     )
  wav = swav       36          252      36     x 1       1      (1     )
  esp = sesp       1           0        1      x 1       1      (1     )

  total pes active           : 576
  mpi tasks per node               : 36
  pe count for cost estimate : 576

  Overall Metrics:
    Model Cost:            4596.08   pe-hrs/simulated_year
    Model Throughput:         3.01   simulated_years/day

    Init Time   :      19.989 seconds
    Run Time    :    2439.700 seconds       78.700 seconds/day
    Final Time  :       0.006 seconds

    Actual Ocn Init Wait Time     :    2091.415 seconds
    Estimated Ocn Init Run Time   :       6.219 seconds
    Estimated Run Time Correction :       0.000 seconds
      (This correction has been applied to the ocean and total run times)

Runs Time in total seconds, seconds/model-day, and model-years/wall-day
CPL Run Time represents time in CPL pes alone, not including time associated with data exchange with other components

    TOT Run Time:    2439.700 seconds       78.700 seconds/mday         3.01 myears/wday
    CPL Run Time:      19.236 seconds        0.621 seconds/mday       381.48 myears/wday
    CPL COMM Time:     79.954 seconds        2.579 seconds/mday        91.78 myears/wday
    ATM Run Time:    2372.105 seconds       76.520 seconds/mday         3.09 myears/wday
    CPL COMM Time:     79.954 seconds        2.579 seconds/mday        91.78 myears/wday
    LND Run Time:       9.502 seconds        0.307 seconds/mday       772.27 myears/wday
    CPL COMM Time:     79.954 seconds        2.579 seconds/mday        91.78 myears/wday
    ICE Run Time:      49.826 seconds        1.607 seconds/mday       147.27 myears/wday
    CPL COMM Time:     79.954 seconds        2.579 seconds/mday        91.78 myears/wday
    OCN Run Time:     192.791 seconds        6.219 seconds/mday        38.06 myears/wday
    CPL COMM Time:     79.954 seconds        2.579 seconds/mday        91.78 myears/wday
    ROF Run Time:       3.373 seconds        0.109 seconds/mday      2175.54 myears/wday
    CPL COMM Time:     79.954 seconds        2.579 seconds/mday        91.78 myears/wday
    GLC Run Time:       0.000 seconds        0.000 seconds/mday         0.00 myears/wday
    CPL COMM Time:     79.954 seconds        2.579 seconds/mday        91.78 myears/wday
    WAV Run Time:       0.000 seconds        0.000 seconds/mday         0.00 myears/wday
    CPL COMM Time:     79.954 seconds        2.579 seconds/mday        91.78 myears/wday
    ESP Run Time:       0.000 seconds        0.000 seconds/mday         0.00 myears/wday
    CPL COMM Time:     79.954 seconds        2.579 seconds/mday        91.78 myears/wday

---------------- DRIVER TIMING FLOWCHART ---------------------

   NOTE: min:max driver timers (seconds/day):
                            CPL (pes 0 to 287)
                                                                                       OCN (pes 288 to 575)
                                                LND (pes 0 to 143)
                                                ROF (pes 0 to 39)
                                                                   ICE (pes 144 to 251)
                                                ATM (pes 0 to 287)
                                                GLC (pes 0 to 35)
                                                                                  WAV (pes 252 to 287)

  CPL:CLOCK_ADVANCE           0.004:   0.005
  CPL:OCNPRE1                 0.027:   0.285
  CPL:ATMOCN1                 0.022:   0.050
  CPL:OCNPREP                 0.000:   0.000
  CPL:C2O                        <---->                                                  0.000:   0.001
  CPL:LNDPREP                 0.001:   0.010
  CPL:C2L                        <---->           0.006:   0.295
  CPL:ICEPREP                 0.010:   0.021
  CPL:C2I                        <---->                              0.003:   0.086
  CPL:ROFPREP                 0.005:   0.028
  CPL:C2R                        <---->           0.002:   0.011
  CPL:ICE_RUN                                                        1.556:   1.607
  CPL:LND_RUN                                     0.271:   0.307
  CPL:ROF_RUN                                     0.075:   0.109
  CPL:L2C                                         0.105:  12.851
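
To see where the wall-clock time goes in the kpg_waccm_test_clone run, each component's run time can be compared with the total. The sketch below uses the values from the timing profile above; the helper name `run_time_shares` is made up for illustration. The shares are not expected to sum to 100%, presumably because components on different pes (for example OCN on pes 288 to 575) run concurrently with the others.

```python
# Run times in seconds, copied from the kpg_waccm_test_clone profile.
run_seconds = {
    "CPL": 19.236,
    "ATM": 2372.105,
    "LND": 9.502,
    "ICE": 49.826,
    "OCN": 192.791,
    "ROF": 3.373,
}
total_seconds = 2439.700  # TOT Run Time


def run_time_shares(component_seconds, total):
    """Return each component's run time as a fraction of the total run time."""
    return {name: secs / total for name, secs in component_seconds.items()}


shares = run_time_shares(run_seconds, total_seconds)
slowest = max(shares, key=shares.get)

# List components from largest to smallest share.
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {share:.1%}")
print(f"slowest component: {slowest}")
```

On these numbers the atmosphere accounts for roughly 97% of the total run time, so any effort to speed up this layout would likely start with the ATM pe count.
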