ufs-community / ufs-weather-model

UFS Weather Model

Coupled run using benchmark at C384 crashed when frac_grid=T #268

Closed ShanSunNOAA closed 3 years ago

ShanSunNOAA commented 3 years ago

Description

The coupled model runs well in the benchmark case at C384 with frac_grid=F & frac_grid_input=T. It crashed with frac_grid=T, on line 778 in module_gfdl_cloud_microphys.F90, where dz (i, j, k) becomes zero somewhere:

776 dz0 (k) = dz (i, j, k)
777
778 den0 (k) = - dp1 (k) / (grav * dz0 (k)) ! density of dry air

To Reproduce:

This can be reproduced with branch frac_bm_20201108 of https://github.com/shansun6/ufs-weather-model. To run it, do rt.sh -l rt.conf_bmark

Hera has been frozen today, so I don't have output yet. When it is back to normal, I will add the output dir here.

yangfanglin commented 3 years ago

Shan, can you repeat this run in debug mode to get more information? How soon did the model crash?

ShanSunNOAA commented 3 years ago

Hi Fanglin,

Thanks for your email. The model crashed right away during the 1st time step. The error message of dz=0 on Line 778 in GFDL microphysics is from the debug mode. I will let you know when I have more info.

Thanks, Shan


ShanSunNOAA commented 3 years ago

I made one modification today: slmsk=floor(landfrac) when frac_grid=T, to be consistent with ICs.

However, the model still crashed at the same place (Line 778 of module_gfdl_cloud_microphys.F90) during the 1st time step. The error output using debug=Y is at /scratch2/BMC/gsd-fv3-dev/Shan.Sun/FV3_RT/rt_94187/cpld_bmark_frac_prod/ on hera.

junwang-noaa commented 3 years ago

Shan, the dz is computed from the interface geopotential phii in module_gfdl_cloud_microphys.F90. The phii is updated in get_phi_fv3, and it should not have the same value at two consecutive layers unless the temperature tmp (gt0) is zero or there are two levels with the same pressure in the model physics state. I'd suggest finding the (i,j) location of dz=0 in dz(i,k) = (phii(i,kk)-phii(i,kk+1))*onebg in module_gfdl_cloud_microphys.F90, then checking it in get_phi_fv3_run to see where tmp becomes 0.

ShanSunNOAA commented 3 years ago

Jun, thanks for your suggestion. I inserted a print statement after dz(i,k) = (phii(i,kk)-phii(i,kk+1))*onebg in gfdl_cloud_microphys.F90:

if (abs(dz(i,k))<1.e-12) write(*,'(a,2i4,a,2es10.2,2(a,es10.2))') 'warning1 dz=0 at i,k=',i,k,' phii=',phii(i,kk),phii(i,kk+1),' dz=',dz(i,k)

Here is the output: phii went bad at many points and different k:

109: warning1 dz=0 at i,k= 8 1 phii= 2.55E+71 2.55E+71 dz= 0.00E+00
109: warning1 dz=0 at i,k= 10 1 phii= 1.16E+73 1.16E+73 dz= 0.00E+00
109: warning1 dz=0 at i,k= 14 1 phii= 1.98E+72 1.98E+72 dz= 0.00E+00
109: warning1 dz=0 at i,k= 15 1 phii= 3.17E+72 3.17E+72 dz= 0.00E+00
109: warning1 dz=0 at i,k= 8 2 phii= 2.55E+71 2.55E+71 dz= 0.00E+00
109: warning1 dz=0 at i,k= 10 2 phii= 1.16E+73 1.16E+73 dz= 0.00E+00
109: warning1 dz=0 at i,k= 14 2 phii= 1.98E+72 1.98E+72 dz= 0.00E+00
109: warning1 dz=0 at i,k= 15 2 phii= 3.17E+72 3.17E+72 dz= 0.00E+00
109: warning1 dz=0 at i,k= 8 3 phii= 2.55E+71 2.55E+71 dz= 0.00E+00
109: warning1 dz=0 at i,k= 10 3 phii= 1.16E+73 1.16E+73 dz= 0.00E+00
. . .

see /scratch2/BMC/gsd-fv3-dev/Shan.Sun/FV3_RT/rt_73625/cpld_bmark_frac_prod/out
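One possible reading of this output, offered as a sketch rather than a diagnosis: once phii reaches values near 1e71, the spacing between adjacent double-precision numbers is far larger than any physical geopotential increment, so stepping from one interface to the next leaves phii unchanged and the difference is exactly zero. The Python snippet below illustrates the arithmetic only. The increment values, the `ONEBG` constant and the helper names are assumptions made for the example and are not taken from the model code.

```python
# Illustration only: floating-point absorption makes dz exactly zero
# when the interface values are unphysically large.

ONEBG = 1.0 / 9.80665  # assumed value of 1/g, for illustration


def build_interfaces(phi_bottom, increments):
    """Accumulate interface values upward from a bottom value (hypothetical helper)."""
    phii = [phi_bottom]
    for inc in increments:
        phii.append(phii[-1] + inc)
    return phii


def thickness(phii):
    """Mirror the quoted formula dz = (phii(kk) - phii(kk+1)) * onebg."""
    return [(phii[kk] - phii[kk + 1]) * ONEBG for kk in range(len(phii) - 1)]


increments = [500.0, 800.0, 1200.0]  # made-up per-layer geopotential increments

dz_sane = thickness(build_interfaces(0.0, increments))
dz_huge = thickness(build_interfaces(2.55e71, increments))

print("sane:", dz_sane)  # non-zero thicknesses
print("huge:", dz_huge)  # every entry is 0.0
```

If this is what is happening, the zero dz is a symptom, and the place to look is wherever the huge value first enters phii or the fields it is built from.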

Also the error in "err" has switched to a different routine (it is no longer Line 778 of module_gfdl_cloud_microphys.F90):

139: forrtl: error (72): floating overflow
139: Image PC Routine Line Source
139: fv3.exe 000000000D25E6BF Unknown Unknown Unknown
139: libpthread-2.17.s 00002B7DB5698630 Unknown Unknown Unknown
139: fv3.exe 0000000005EE862F samfshalcnv_mp_sa 776 samfshalcnv.f

where Line 776 in samfshalcnv.f is the calculation of eta:

774 dz = zi(i,k) - zi(i,k-1)
775 ptem = 0.5*(xlamue(i,k)+xlamue(i,k-1))-xlamud(i)
776 eta(i,k) = eta(i,k-1) * (1 + ptem * dz)

Any suggestions where to go from here? Thanks,

SMoorthi-emc commented 3 years ago

This problem is related to a “huge” value sneaking in the fractional grid version, at least that was true in my case. I fixed it in my branch and I have no GFDL MP issue. I still have restart issue, so I am not pushing my PR yet. Moorthi


ShanSunNOAA commented 3 years ago

Thank you, Moorthi, for your info. Should I try your ccpp-physics branch SM/SM_Oct102020, or is there one routine that I can cherry pick? I just want to get the "frac+gfdl" run to complete first. Please advise. Thanks!

SMoorthi-emc commented 3 years ago

I think my branch does have the fix; I don't remember precisely what line. You can try my branch to see if it works for you. I have been running with GFDL MP and it works fine. The only problem I have is that restart is not reproducing - still trying to chase but CCPP is not that easy (everything works fine under IPD). Moorthi


SMoorthi-emc commented 3 years ago

Shan, My branch does things slightly differently from what you are doing. I am using slmsk=1 if landfrac > 0.0. So, using my branch might cause some other issues for you. Moorthi
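The two mask conventions mentioned in this thread disagree only at partially covered points. A minimal Python sketch, assuming landfrac lies in [0, 1] and looking only at the land/non-land choice (the ice value of slmsk is ignored here). The function names are hypothetical:

```python
import math


def slmsk_floor(landfrac):
    """slmsk = floor(landfrac): land only where the cell is entirely land."""
    return int(math.floor(landfrac))


def slmsk_any_land(landfrac):
    """slmsk = 1 if landfrac > 0.0: land wherever any land is present."""
    return 1 if landfrac > 0.0 else 0


# Compare the two conventions for a few sample land fractions.
for frac in (0.0, 0.3, 0.999, 1.0):
    print(frac, slmsk_floor(frac), slmsk_any_land(frac))
```

For any cell with 0 < landfrac < 1 the first rule gives 0 and the second gives 1, so mixing initial conditions built with one rule and model code using the other would change the surface type at every coastal or lake-shore point.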


junwang-noaa commented 3 years ago

I guess Moorthi might have made some changes in get_phi_fv3_run or to the gt0 passed to get_phi_fv3_run.


SMoorthi-emc commented 3 years ago

No, I did not. The cause of the problem is in merging with "frac_grid=.true." I think I made changes in sfc_composit routine. The problem is that surface T (and probably q) was becoming too large. Moorthi


ShanSunNOAA commented 3 years ago

Hi Moorthi and Jun,

Thanks for the information. I will keep digging.

Shan


ShanSunNOAA commented 3 years ago

It appears that the coupled model benchmark case can run successfully with the combination of "frac_grid=T and gfdl MP", with a minimal change: taking GFS_surface_composites.F90 & GFS_surface_composites.meta from Moorthi's ccpp-physics branch SM_Oct102020!

Moorthi, thank you so much for your help! If you don't plan to create a PR just for these two routines, may I do one for you, and what comments do you want to go with this PR? Thanks again.

SMoorthi-emc commented 3 years ago

Shan, My draft PR is already there, but I am trying to figure out the restart reproducibility with frac_grid=.true. in the standalone FV3. Can you reproduce? Moorthi


ShanSunNOAA commented 3 years ago

I only ran 1 benchmark case with frac_grid=T, which finally completed 24hrs. I will try restart next. Thanks. -Shan


ShanSunNOAA commented 3 years ago

I ran GFS_surface_composites.F90/meta & sfc_sice.f/meta from SMoorthi-emc/ccpp-physics with the latest develop of ufs-weather-model in the coupled model set up by Denise, and the restart failed at a few lake points on tile3 and 1 lake point on tile2. Most of these lake points are covered with ice to begin with, and have no ice left by the time of restart (hr12). I use slmsk=floor(landfrac) consistently in the ICs and in FV3, thus these lake points have slmsk of either 0 or 2. However, it still cannot restart reproducibly. Will keep looking. Thanks.

SMoorthi-emc commented 3 years ago

Shan, There is another bug that I fixed some time ago that eliminates this error. Since people cherry picked from my code that fix did not make it to develop. Moorthi


ShanSunNOAA commented 3 years ago

Moorthi, thanks for your info. Let me see if I understand you correctly.

(1) I checked out your branch SM_Oct102020 of ccpp-physics. It has 6 files modified since Dom's commit of f3e6761 on Oct. 9:

physics/GFS_surface_composites.F90
physics/GFS_surface_composites.meta
physics/micro_mg3_0.F90
physics/sfc_sice.f
physics/sfc_sice.meta
physics/GFS_surface_generic.F90

(2) I checked out your branch SM_Oct102020 of FV3. Changes in FV3GFS_io.F90 & GFS_typedefs.F90 seem unrelated to restart, and the rest are IPD related, since Dom's commit on Oct. 9. So I skipped this.

(3) I used these 6 files from (1) above to run with the develop branch of ufs-weather-model in coupled mode. It won't reproduce after restart, and the differences remain at the icy lake points.

Any suggestions? Thanks, Shan

SMoorthi-emc commented 3 years ago

Shan, Some of my changes on the FV3 side are IPD related, as I moved some updates from inside "#ifdef CCPP" to outside so that the code underneath is not limited to CCPP. By doing so, IPD has the same code as CCPP for the standard global physics. Having said that, you may have noticed that in the coupled model (fractional grid or not) the ice fraction over lakes is lost even if the initial condition has it. The reason is that there is a bug in atmos_model.F90, which I have fixed. This is not just a fractional grid issue or restart issue. I still have the restart issue with fractional grid and CCPP (I do not have this issue with IPD - fractional grid restart reproduces in IPD). Moorthi


ShanSunNOAA commented 3 years ago

Thanks, Moorthi, for fixing the bug of setting lake ice to zero in atmos_model.F90. I still suspect this restart issue with fractional grid and CCPP has something to do with slmsk, as most of these failed points (icy lake points) are the ones that failed in the nonfrac case before slmsk was updated, as shown by Denise, except that now we have 1 failed lake point on tile 2 which has no ice to begin with. But the chase after slmsk went nowhere. Need to find some new clues. Thanks, Shan

SMoorthi-emc commented 3 years ago

Shan, As I wrote, I still have issues with restart with frac_grid=T and CCPP. I don't have this issue with IPD. I am also debugging still - have not given up yet. I also am aware of other inconsistencies related to fractional grid - but that is for the future. Moorthi


ShanSunNOAA commented 3 years ago

Hi Moorthi,

I found one problem for the restart with frac_grid=T: tsfco on those failed lake points is very low, much lower than freezing temperature. Upon restart, line 1350 of FV3GFS_io.F90 would reset its value to be "con_tice", which breaks the reproducibility.

1346 if(Model%frac_grid) then ! 3-way composite
1347 !$omp parallel do default(shared) private(nb, ix, tem, tem1)
1348 do nb = 1, Atm_block%nblks
1349 do ix = 1, Atm_block%blksz(nb)
1350 Sfcprop(nb)%tsfco(ix) = max(con_tice, Sfcprop(nb)%tsfco(ix))

I suggest we comment out line 1350 above for now. Once I did that, I was able to reproduce after restart. Let me know what you think.
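Why the clamp would break reproducibility can be sketched in a few lines of Python. This is an illustration under assumptions, not the FV3GFS_io.F90 logic: the value of `CON_TICE`, the sample temperatures and the function names are placeholders. A continuous run carries the stored tsfco forward unchanged, while a restarted run reads the same value back through max(con_tice, tsfco), so any point below con_tice starts from a different state.

```python
CON_TICE = 271.2  # assumed value in K, for illustration only


def read_restart(tsfco, frac_grid, clamp=True):
    """Return tsfco as a restarted run would see it (hypothetical helper)."""
    if frac_grid and clamp:
        return [max(CON_TICE, t) for t in tsfco]
    return list(tsfco)


# Made-up values at restart time: one cold lake point, one warm point.
tsfco_at_hr12 = [250.0, 285.0]

continuous = list(tsfco_at_hr12)  # run that never stopped
restarted = read_restart(tsfco_at_hr12, frac_grid=True)
unclamped = read_restart(tsfco_at_hr12, frac_grid=True, clamp=False)

print(restarted == continuous)  # False: the cold point was reset
print(unclamped == continuous)  # True: without the clamp the states match
```

Removing the clamp restores bit-for-bit agreement in this toy case, but it does not explain why tsfco was so far below freezing at those points in the first place.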

Thanks, Shan


SMoorthi-emc commented 3 years ago

Shan, I guess I have other changes. Yesterday I found that I can reproduce in REPRO=Y mode. Moorthi


DeniseWorthen commented 3 years ago

Shouldn't we try to figure out why the tsfco temperatures are below freezing?

ShanSunNOAA commented 3 years ago

Denise, good point. These below-freezing water temperatures occurred at lake points that started without ice. Without a lake model, no new lake ice can form at lake points with 100% open water, since sfc_sice.f skips points without ice. Maybe gcycle can introduce ice at these cold lake points. How about clamping the water temperature to freezing or above only at the initial time, and not at restart, to guarantee restart reproducibility?

junwang-noaa commented 3 years ago

Shan, thanks for finding this issue! I think it would be good to check two things: 1) the initial value of tsfco at this lake point, to confirm that it decreases to below freezing during the forecast before the restart; and 2) whether gcycle later changes the open-water temperature and puts ice information on this point. Otherwise we may still have issues at below-freezing open-water points.


SMoorthi-emc commented 3 years ago

I am adding a potential fix in gcycle.F90:

if (slifcs(len) > 1.9_kind_phys) then
  Sfcprop(nb)%tsfco(ix) = con_tice
endif

Moorthi
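As a rough illustration of what such a reset does, the sketch below applies the same test to three toy points. The arrays, their values, the value of con_tice, and the mask convention (0 = open water, 1 = land, 2 = ice) are assumptions for this example, not taken from gcycle.F90.

```fortran
program gcycle_ice_reset_demo
  implicit none
  integer, parameter :: kind_phys = selected_real_kind(12)
  ! Assumed value for illustration; the name mirrors the constant in the thread.
  real(kind_phys), parameter :: con_tice = 271.2_kind_phys
  integer, parameter :: npts = 3
  real(kind_phys) :: slifcs(npts), tsfco(npts)
  integer :: i

  ! Hypothetical mask values: 0 = open water, 1 = land, 2 = ice (assumed convention).
  slifcs = (/ 0.0_kind_phys, 2.0_kind_phys, 0.0_kind_phys /)
  ! Hypothetical open-water temperatures (K).
  tsfco  = (/ 268.0_kind_phys, 265.0_kind_phys, 275.0_kind_phys /)

  ! Reset the open-water temperature only where the mask flags ice.
  do i = 1, npts
    if (slifcs(i) > 1.9_kind_phys) then
      tsfco(i) = con_tice
    end if
  end do

  do i = 1, npts
    print '(a,i0,a,f4.1,a,f7.2)', 'point ', i, ': slifcs = ', slifcs(i), &
                                  ', tsfco = ', tsfco(i)
  end do
end program gcycle_ice_reset_demo
```

Only the point flagged as ice is reset. The ice-free point that starts at 268 K keeps its below-freezing value, because the test depends on the mask and not on the temperature.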


ShanSunNOAA commented 3 years ago

Moorthi, good that you added this potential fix. However, it won't help the three lake points on tile 3 that failed restart reproducibility, since none of them has ice, either initially or at any time during the 24 hours.

[image: Screen Shot 2020-11-14 at 8.49.25 PM.png]

I have added an "if" to this statement in FV3GFS_io.F90, so that the guard only applies at the initial time:

if( Model%phour < 1.e-7) Sfcprop(nb)%tsfco(ix) = max(con_tice, Sfcprop(nb)%tsfco(ix))

If the model generates a water temperature far below freezing, it will no longer be reset to con_tice at restart, so the run can reproduce after restart. Make sense?
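A small stand-alone sketch of how the guard behaves at a cold start versus a restart is given below. The subroutine read_sfc and the value of con_tice are hypothetical stand-ins for the read step in FV3GFS_io.F90. Only the condition on phour is taken from the statement above.

```fortran
program initial_only_clamp_demo
  implicit none
  ! Assumed value for illustration; the name mirrors the constant in the thread.
  real, parameter :: con_tice = 271.2
  real :: tsfco

  ! Cold start (forecast hour 0): the clamp is applied.
  tsfco = 268.0
  call read_sfc(tsfco, phour=0.0)
  print *, 'cold start, phour =  0:', tsfco

  ! Restart (forecast hour 12): the value is left as the model produced it.
  tsfco = 268.0
  call read_sfc(tsfco, phour=12.0)
  print *, 'restart,    phour = 12:', tsfco

contains

  ! Hypothetical stand-in for the surface read step: clamp the open-water
  ! temperature to con_tice only at the initial time.
  subroutine read_sfc(t, phour)
    real, intent(inout) :: t
    real, intent(in)    :: phour
    if (phour < 1.e-7) t = max(con_tice, t)
  end subroutine read_sfc

end program initial_only_clamp_demo
```

With the guard in place, a restart passes the sub-freezing value through unchanged, which is what bit-for-bit restart reproducibility requires. The cold start still gets the safety clamp.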

Thanks, Shan


SMoorthi-emc commented 3 years ago

Well, part of the problem here is that sfccycle is yet to be fixed for the fractional grid. Moorthi


ShanSunNOAA commented 3 years ago

This issue is resolved in today's commit. Thanks, Moorthi, for fixing this bug.