NCAR / amwg_dev

Repo to store model sandboxes and cases used for CAM development

b.cesm3_cam041_mom.B1850WcMOM.ne30_L58_t061.005 #115

Open cecilehannay opened 2 years ago

cecilehannay commented 2 years ago

Description:

Coupled with CAM-SE-L58 + MOM + CICE6 with BWsc1850 compset

Includes a sea-ice bugfix for excessive fresh water (see https://github.com/NCAR/amwg_dev/discussions/109)

Case directory: Locally (if still available): /glade/p/cesmdata/cseg/runs/cesm2_0/b.cesm3_cam041_mom.B1850WcMOM.ne30_L58_t061.005

On github: https://github.com/NCAR/amwg_dev/tree/b.cesm3_cam041_mom.B1850WcMOM.ne30_L58_t061.005

Sandbox: Locally (if still available): /glade/work/hannay/cesm_tags/cesm3_cam6_3_041_MOM3

On github: https://github.com/NCAR/amwg_dev/tree/cesm3_cam6_3_041_MOM3 hash: 5a56ad3

Diagnostics: AMWG diags (if available) https://webext.cgd.ucar.edu/BWsc1850MOM/b.cesm3_cam041_mom.B1850WcMOM.ne30_L58_t061.005/atm/

Contacts: @cecilehannay and @gustavo-marques @dabail10

cecilehannay commented 2 years ago

After consulting with @tilmes and @drmikemills, we use the 1850_CAM60%WCSC_CLM50%BGC-CROP_CICE_MOM6_MOSART_CISM2%GRIS-NOEVOLVE_SWAV_SESP_BGC%BDRD compset for the current run.

Note that there is no need to add H2O external forcings to the WACCM-SC run. WACCM-SC has CH4 in its chemistry mechanism, so it calculates H2O production from CH4 oxidation. Therefore, we don't want to add it as an external forcing as well.
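As a rough back-of-the-envelope sketch (not WACCM code; the function and yield factor are illustrative assumptions), the net oxidation chain CH4 + 2 O2 → CO2 + 2 H2O produces about two H2O molecules per CH4 molecule destroyed, which is why a separate H2O external forcing would double-count the chemical source:

```python
# Stoichiometric sketch only (not the WACCM-SC chemistry code): each CH4
# molecule that is oxidized ultimately yields ~2 H2O molecules, so the
# H2O produced is ~2x the CH4 destroyed (in mixing-ratio terms).

def h2o_source_from_ch4(dch4_vmr: float, yield_per_ch4: float = 2.0) -> float:
    """H2O production (volume mixing ratio) implied by a CH4 loss dch4_vmr (> 0)."""
    if dch4_vmr < 0:
        raise ValueError("expected CH4 loss as a positive number")
    return yield_per_ch4 * dch4_vmr

# e.g. a 0.1 ppmv CH4 loss implies ~0.2 ppmv of chemically produced H2O
print(h2o_source_from_ch4(0.1e-6))
```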

cecilehannay commented 2 years ago

The run crashed in year 21 with the error:

4389:FATAL from PE   789: mpp_sync_self: size_recv does not match of data received
4389:
4389:Image              PC                Routine            Line        Source             
4389:cesm.exe           00000000042F5606  Unknown               Unknown  Unknown
4389:cesm.exe           0000000003DC632D  mpp_mod_mp_mpp_er          68  mpp_util_mpi.inc
4389:cesm.exe           0000000003DA1C8D  mpp_mod_mp_mpp_sy         224  mpp_util_mpi.inc
4389:cesm.exe           0000000003F22640  mpp_domains_mod_m         513  mpp_group_update.h
4389:cesm.exe           0000000002A608D8  mom_domain_infra_        1152  MOM_domain_infra.F90
4389:cesm.exe           000000000312A066  mom_barotropic_mp        1772  MOM_barotropic.F90
4389:cesm.exe           0000000002A847C1  mom_dynamics_spli         808  MOM_dynamics_split_RK2.F90
4389:cesm.exe           000000000285FD0F  mom_mp_step_mom_d        1114  MOM.F90
4389:cesm.exe           0000000002857759  mom_mp_step_mom_          830  MOM.F90
4389:cesm.exe           000000000282B4C0  mom_ocean_model_n         627  mom_ocean_model_nuopc.F90
4389:cesm.exe           00000000027F4616  mom_cap_mod_mp_mo        1684  mom_cap.F90
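For what it's worth, stack frames like these can be pulled out of the log mechanically. A minimal sketch (the regex and field layout are my assumptions, based only on the Intel-style traceback lines above):

```python
import re

# Parse Intel-style traceback lines of the form:
#   <rank>:cesm.exe  <PC>  <routine>  <line>  <source>
# Purely illustrative; field widths in real logs may vary.
TRACE_RE = re.compile(r"cesm\.exe\s+([0-9A-F]+)\s+(\S+)\s+(\d+|Unknown)\s+(\S+)")

def parse_traceback(lines):
    """Return (routine, line, source) tuples for each recognizable frame."""
    frames = []
    for line in lines:
        m = TRACE_RE.search(line)
        if m:
            _pc, routine, lineno, source = m.groups()
            frames.append((routine, lineno, source))
    return frames

trace = [
    "4389:cesm.exe           0000000003F22640  mpp_domains_mod_m         513  mpp_group_update.h",
    "4389:cesm.exe           0000000002A608D8  mom_domain_infra_        1152  MOM_domain_infra.F90",
    "4389:cesm.exe           000000000312A066  mom_barotropic_mp        1772  MOM_barotropic.F90",
]
for routine, lineno, source in parse_traceback(trace):
    print(f"{source}:{lineno} in {routine}")
```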

I am not sure whether mpp_sync_self: size_recv does not match of data received was a problem with Cheyenne or with MOM, so I restarted the run.

@gustavo-marques: could you have a look too?

gustavo-marques commented 2 years ago

Not sure if this is related to the crash, but I see an issue with salt conservation starting at 20/01/21 00:00:00.

gustavo-marques commented 2 years ago

Actually, the issue with salt conservation starts in year 11.
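A hedged sketch of the kind of check behind this statement (the data, tolerance, and function are made up for illustration; MOM6's actual conservation diagnostics come from its stats output):

```python
# Toy check: given a yearly time series of the global salt budget residual,
# report the first year where conservation is violated beyond a tolerance.
# All values below are synthetic, not from this run's stats files.

def first_nonconserving_year(years, salt_residual, tol=1e-10):
    """Return the first year whose |residual| exceeds tol, or None."""
    for year, resid in zip(years, salt_residual):
        if abs(resid) > tol:
            return year
    return None

# Synthetic example where conservation holds until year 11, as in this run.
years = list(range(1, 16))
residuals = [0.0] * 10 + [3e-9, 5e-9, 8e-9, 1e-8, 2e-8]
print(first_nonconserving_year(years, residuals))  # 11 in this synthetic case
```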

cecilehannay commented 2 years ago

Should I continue the run? Or is it a show stopper?


gustavo-marques commented 2 years ago

The model has been restarted and it has not reached the point where it blew up before. Previously, it blew up during the corrector step call to btstep, in subroutine step_MOM_dyn_split_RK2. The error traces back to a call to do_group_pass (halo updates).

I have not seen this issue before and nothing obvious comes to mind. Let's see if the model will stop again at the same point.
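For readers unfamiliar with do_group_pass: it performs grouped halo (ghost-cell) updates between decomposed subdomains. A toy single-process illustration of what a halo update does (plain Python lists standing in for MPI ranks; not MOM6/FMS code):

```python
# Toy 1-D periodic halo update. Each "rank" owns an interior slice plus one
# ghost cell on each side, filled from its neighbors' interior edges. In
# MOM6 this exchange happens via MPI inside FMS's grouped halo passes.

def halo_update(blocks, halo=1):
    """Fill each block's ghost cells from its periodic neighbors' interiors."""
    n = len(blocks)
    for i, b in enumerate(blocks):
        left = blocks[(i - 1) % n]
        right = blocks[(i + 1) % n]
        b[:halo] = left[-2 * halo:-halo]   # left ghosts <- left neighbor's interior edge
        b[-halo:] = right[halo:2 * halo]   # right ghosts <- right neighbor's interior edge
    return blocks

# Two blocks with interiors [1, 2] and [3, 4]; ghost cells start as 0.
blocks = [[0, 1, 2, 0], [0, 3, 4, 0]]
halo_update(blocks)
print(blocks[0])  # ghosts now hold neighbor values: [4, 1, 2, 3]
```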

gustavo-marques commented 2 years ago

Let's continue the run.

gustavo-marques commented 2 years ago

@dabail10: can you please check the ice.log to see if there are any conservation issues in CICE6?

dabail10 commented 2 years ago

The salt, heat, and water are fine, but the shortwave looks off.

arwt incoming sw (W) = 6.85618158455401250E+13  1.43538638248901375E+14
arwt absorbed sw (W) = 6.94156657060417812E+13  1.43187650989964625E+14
arwt swdn error      = 1.23005355032898781E-02 -2.45123972989368718E-03

These errors should be closer to zero. Is there something going on with nextsw_cday and when the radiation is computed in the atmosphere?
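The printed errors look like the relative mismatch between absorbed and incoming shortwave; assuming the definition (absorbed − incoming) / absorbed, a quick check reproduces both values:

```python
# Sanity check of the numbers above, assuming "swdn error" means the
# relative mismatch (absorbed - incoming) / absorbed; that assumption
# reproduces both printed error values.
incoming = [6.85618158455401250e13, 1.43538638248901375e14]
absorbed = [6.94156657060417812e13, 1.43187650989964625e14]

errors = [(a - i) / a for i, a in zip(incoming, absorbed)]
print(errors)  # ~[1.23005355e-02, -2.45123973e-03]
```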

cecilehannay commented 2 years ago

@JulioTBacmeister: could there be a problem with the atmosphere?

@dabail10: did we look at these numbers with CICE5? Were they fine? Have you run a G case with MOM6-CICE6 yet?

dabail10 commented 2 years ago

I have not run the G case yet. I will try to set that up tomorrow. This was fine in a B case I did with CICE6 in cesm2.1.

adamrher commented 2 years ago

These errors should be closer to zero. Is there something going on with nextsw_cday and when the radiation is computed in the atmosphere?

I hope there's not a problem with nextsw_cday. I spent a lot of time making sure it's correct under the new physics ordering; by correct, I mean that it reflects the zenith angle of the radiative fields actually being sent to the coupler. However, the radiation in the new ordering is less accurate than in the old one: we no longer compute radiation just before sending the radiative fluxes to the coupler, but at the end of the physics time-loop in the prior time-step (in both cases it is only computed every other time-step). So I might expect some increase in errors, but I wouldn't expect the near-zero sw errors of the prior ordering to go to E-2 errors like @dabail10 is showing with the new ordering. It could also be something else I did wrong that I hadn't thought about.

dabail10 commented 2 years ago

It may not be nextsw_cday. It could be the area corrections. Is the radiation timestep the same (1-hour)? We could have a problem with syncing the albedo and the incoming radiation.

adamrher commented 2 years ago

Yes, the radiation time-steps are still 1 hour. nextsw_cday is used to sync up the albedos in CLM, so hopefully if that's correct, everything falls in line.

gustavo-marques commented 2 years ago

This run has passed the previous crashing point (year 21) and it's now at year 27. I still see the salt conservation issue in MOM. This is happening because:

The Labrador Sea is freezing during winter (I looked at year 26 only), but the sea ice retreats during summer. As a consequence, mixed-layer depths in the Lab Sea are shallow (no convection), which should be affecting AMOC.

Given these and the issue with nextsw_cday, I think we should stop at year 30.

cecilehannay commented 2 years ago

John Fasullo would like to add a few variables for future runs.

gustavo-marques commented 2 years ago

When possible, we use the Climate Model Output Rewriter (CMOR) convention for the MOM variables.

The netCDF files listed below are located at /glade/scratch/hannay/archive/b.cesm3_cam041_mom.B1850WcMOM.ne30_L58_t061.005/ocn/hist/
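As a small illustration of what adopting the CMOR convention means in practice, the sketch below renames native history variables to CMOR names. The specific name pairs are assumptions for the sake of the example, not the project's actual diag_table:

```python
# Illustrative mapping from assumed native MOM6 history names to their
# CMOR equivalents; only an example, not the run's actual configuration.
MOM_TO_CMOR = {
    "temp": "thetao",  # sea water potential temperature
    "salt": "so",      # sea water salinity
    "SSH":  "zos",     # sea surface height above geoid
}

def to_cmor(varnames):
    """Map native names to CMOR names, leaving unknown names unchanged."""
    return [MOM_TO_CMOR.get(v, v) for v in varnames]

print(to_cmor(["temp", "salt", "SSH", "MLD_003"]))
# ['thetao', 'so', 'zos', 'MLD_003']
```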