ufs-community / ufs-weather-model

UFS Weather Model

aerosol fields do not reproduce when fhmax=4,fhzero=2 #1190

Open DeniseWorthen opened 2 years ago

DeniseWorthen commented 2 years ago

Description

To reduce the time required by the updated cpld_bmark_p8 test with the mesh cap for PR https://github.com/ufs-community/ufs-weather-model/pull/1131, I've tried to reduce fhmax to 4 and restart the model from hour 2.

All files reproduce except for the atmf004.tile[1-6].nc forecast files and the fv_tracer.res.tile[1-6].nc restart files. These files differ only in the following fields: nh3, nh4a, no3an1, no3an2, no3an3, pm25, pm10.
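The kind of field-by-field comparison described above can be sketched as follows. This is a minimal illustration, not the comparison tool used by the regression tests: the helper name `differing_fields` is hypothetical, the fields are assumed to have already been read from the NetCDF files into plain lists, and the numbers are made up purely to show the mechanics.

```python
def differing_fields(control, restart, tol=0.0):
    """Return the sorted names of fields whose values differ between two runs.

    `control` and `restart` map a field name to a flat list of values.
    A field also counts as differing if it is missing or has a different size.
    """
    changed = []
    for name in sorted(set(control) | set(restart)):
        a = control.get(name)
        b = restart.get(name)
        if a is None or b is None or len(a) != len(b):
            changed.append(name)
            continue
        if any(abs(x - y) > tol for x, y in zip(a, b)):
            changed.append(name)
    return changed


# Made-up values for illustration only (not model output).
control = {"sphum": [1.0, 2.0], "pm25": [0.5, 0.25]}
restart = {"sphum": [1.0, 2.0], "pm25": [0.5, 0.26]}
print(differing_fields(control, restart))
```

With `tol=0.0` the check is bit-for-bit on the values held in memory, which matches the spirit of a restart reproducibility test.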

To Reproduce:

A test branch using the current cpld_control_c96_p8 modified to run for fhmax=4 is here: branch. This test produces the same field differences as those in the updated cpld_bmark_p8 test.

The control and restart cases in the test branch can be run using the oRT command:

./opnReqTest -n cpld_control_c96_p8 -c rst -ek

This will use ecflow and keep the run directory.

junwang-noaa commented 2 years ago

@weiyuan-jiang May I ask if there is any restriction on the restart intervals for the species of nh and no3?

bbakernoaa commented 2 years ago

@DeniseWorthen Does this include the updated compiler flags? I can't find that issue/PR right now, but I believe that @rmontuoro had fixed this issue.

DeniseWorthen commented 2 years ago

I can get restart reproducibility with the current configuration which uses fhmax in intervals of 6 (depending on the test). It is when reducing the fhmax to 4 (and fhzero to either 1 or 2) that the aerosol fields are not reproducing.

JessicaMeixner-NOAA commented 2 years ago

If you use Dusan's PR: https://github.com/ufs-community/ufs-weather-model/pull/1171 does it help? I believe that's the issue/fix Barry is referring to.

weiyuan-jiang commented 2 years ago

@weiyuan-jiang May I ask if there is any restriction on the restart intervals for the species of nh and no3?

Sorry I cannot answer the question. But I can ask around for you

DeniseWorthen commented 2 years ago

Thanks @JessicaMeixner-NOAA, I understood which fix Barry was referring to.

I can test Dusan's compile options. However, since aerosols reproduce using fhmax=6,fhzero=6, I would be surprised if that explains why it is not reproducing at fhmax=4, fhzero=2.

DeniseWorthen commented 2 years ago

I tested using oRT after merging Dusan's release_flags branch and obtained the same non-reproducing aerosol fields.

DeniseWorthen commented 2 years ago

I've updated the test branch to try a 3/1/4 restart test. The oRT enforces the restart time at FHMAX/2, so the 3/1/4 test cannot be run with the oRT. Also, because of MOM6 Issue #90, the comparison of MOM6 restarts will need to be removed if the 3/1/4 test otherwise reproduces.

mathomp4 commented 2 years ago

Query for someone in GEOS-land, do you have a "descriptive" explanation for the variables in play here? I've never run UFS so I'm a bit in the dark. 😄 I'm sort of guessing they are like our DT (time steps)?

weiyuan-jiang commented 2 years ago

Are the restart files the only input files? Are there any Extdata in the tests? @junwang-noaa

junwang-noaa commented 2 years ago

@mathomp4 The test case is a C96 global coupled forecast case. The time step for the atmosphere is 720s, and it does not change between the control (fh0->4hr from a cold start) and the restart test (fh0->2 from a cold start, then fh2->4 with restart). In the restart test, the forecast restarts from the current time at fh=2 using the restart files and continues to run for 2 hrs to reach fh=4hr.
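As a quick arithmetic check of the setup described above (a sketch only; the function name `n_steps` is hypothetical and the 720 s time step is taken from this comment), both the control and the restart test take the same number of atmosphere steps:

```python
DT_ATMOS = 720  # atmosphere time step in seconds, as stated above


def n_steps(start_hr, end_hr, dt=DT_ATMOS):
    """Number of time steps needed to run from start_hr to end_hr."""
    seconds = (end_hr - start_hr) * 3600
    if seconds % dt != 0:
        raise ValueError("segment length is not a whole number of time steps")
    return seconds // dt


control = n_steps(0, 4)                  # one run, fh0 -> fh4
restart = n_steps(0, 2) + n_steps(2, 4)  # cold start to fh2, then restart to fh4
print(control, restart)                  # prints "20 20"
```

So any difference at fh=4 has to come from state that is not carried across the restart, not from a different number of steps.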

DeniseWorthen commented 2 years ago

@mathomp4 The fhmax is the forecast length. In this case, we are running the model forward 4 hours and writing a restart for the components at hour 2. Using the restarts at hour 2, the model is run from hour=2 to hour=4. What I am comparing are the FV3 tracer restart files and the model forecast files between the initial run (hr 0:4) and the restart run (hr 2:4).

I can get the aerosol fields to reproduce if I do the same test using a restart at hour 3. In this case I'm still running the model 4 hours but I'm using a restart from hour 3 to restart to run the final 1 hour.

fhzero is the interval when accumulated fields are re-zeroed. I've actually tested w/ both fhzero=1 and 2, so I think it is not really a fhzero issue.

SMoorthi-emc commented 2 years ago

I think fhzero should not be lower than fhout.



DeniseWorthen commented 2 years ago

Thanks @SMoorthi-emc. I think I did have fhout set to either 2 (for fhzero=2) or 1 (for fhzero=1) but I will recheck.

@weiyuan-jiang I'm not sure how to answer your question. I have a run directory on hera here:

/scratch1/NCEPDEV/stmp2/Denise.Worthen/FV3_OPNREQ_TEST/opnReqTest_14673/cpld_control_c96_p8_std_base

mathomp4 commented 2 years ago

Okay. I can confirm this on the GEOS end it seems. I ran a start-stop run of 4 hours vs 2+2 and I'm getting restart failures as well. I guess my nightly tests never picked up on this because my 'default' regression start-stop test is 24 vs 18+6...and there's a lot of 3s in that.

I've pinged @bena-nasa about this as well as @weiyuan-jiang and @tclune from our group knowing this.

bena-nasa commented 2 years ago

Hi All,
there appears to be a hard-coded 3-hourly frequency in gocart2g here: https://github.com/GEOS-ESM/GOCART/blob/v2.0.6/ESMF/GOCART2G_GridComp/NI2G_GridComp/NI2G_GridCompMod.F90#L393 and here: https://github.com/GEOS-ESM/GOCART/blob/v2.0.6/ESMF/GOCART2G_GridComp/SU2G_GridComp/SU2G_GridCompMod.F90#L480

If I change this to a 2-hour frequency, then a run of 4 hours vs 2 + 2 passes our start-stop regression. So this just seems suspicious and could explain why something involving 2 hours is misbehaving (just speculation for UFS since I can't test, but it certainly explains why our own regression failed if the run length was not a multiple of 3 hours). It seems like this needs to be an even interval of the run segment length, or perhaps something needs to be saved in a checkpoint that is not being saved now, and the logic for this needs to be tightened. I'll open an issue in the gocart repository.

weiyuan-jiang commented 2 years ago

Here are Arlindo's comments. Quote: "There was a reason why the 3 hour alarm was hardwired, as not to give the user the illusion that they could specify any other value. An easier solution may involve changing the way we handle these oxidants. The way this oxidant is "recycled" always appeared contrived in my opinion. So, stop trying to find a way to address this in code. There is no deep mandate to keep this algorithm. Let us discuss this in our aerosol group meeting."

tclune commented 2 years ago

@weiyuan-jiang when did he make that comment?

bena-nasa commented 2 years ago

https://github.com/GEOS-ESM/GOCART/issues/146#issuecomment-1115201582

junwang-noaa commented 2 years ago

@bena-nasa @weiyuan-jiang May I ask if there is any update on this issue? Thanks

weiyuan-jiang commented 2 years ago

I am not aware of any update on this issue. @junwang-noaa

bena-nasa commented 2 years ago

@junwang-noaa Our best thought is that the issue is this 3-hourly frequency hard coded in gocart (see the issue linked above in the gocart repo). I think the issue is two-fold: the alarm needs to be created with a fixed reference time, and an extra field needs to be in the checkpoint file. Unfortunately I was having some misbehaviour with the ESMF alarms when I tried to fix this. In that issue Arlindo commented that perhaps the algorithm itself needs to be changed altogether, but I have not heard anything more on that. I was on vacation the last several days. I can take a second look at fixing the current algorithm as is; maybe I did something wrong in my first attempt.
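The alarm part of this explanation can be illustrated with a toy model. This is only a sketch under stated assumptions, not GOCART's or ESMF's actual alarm logic: it assumes an alarm that rings when it is created and then every `interval` hours, and that without a fixed reference time the alarm is created at the start of each run segment. The names `ring_times` and `run` are hypothetical, and the missing checkpoint field mentioned above is not modelled.

```python
def ring_times(start, end, interval, fixed_reference=None):
    """Hours in [start, end) at which a periodic alarm rings.

    Without a fixed reference, the alarm is anchored at the segment start.
    With one, it rings only at fixed_reference + k * interval.
    """
    if fixed_reference is None:
        first = start
    else:
        offset = (start - fixed_reference) % interval
        first = start if offset == 0 else start + (interval - offset)
    return list(range(first, end, interval))


def run(segments, interval, fixed_reference=None):
    """Collect alarm ring times over a sequence of (start, end) run segments."""
    times = []
    for start, end in segments:
        times += ring_times(start, end, interval, fixed_reference)
    return times


print(run([(0, 4)], 3))                             # continuous 4 h run
print(run([(0, 2), (2, 4)], 3))                     # 2 + 2: alarm re-anchored at hour 2
print(run([(0, 3), (3, 4)], 3))                     # 3 + 1: same as continuous
print(run([(0, 2), (2, 4)], 3, fixed_reference=0))  # 2 + 2 with a fixed reference
```

In this toy model the 2 + 2 case rings at different hours than the continuous run, while 3 + 1, 18 + 6, and the fixed-reference variant all match it, which is consistent with the behaviour reported in this thread.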

junwang-noaa commented 2 years ago

A related issue, #1207, was created to allow the model to restart at fh=3hr and write out restart files at the end of the forecast time, fh=4.

junwang-noaa commented 1 year ago

I am curious how PR #1171 is related to this restart reproducibility, as we currently have restart reproducibility when using the 3hr restart interval.

On Wed, Apr 27, 2022 at 12:32 PM Jessica Meixner @.***> wrote:

If you use Dusan's PR: #1171 https://github.com/ufs-community/ufs-weather-model/pull/1171 does it help? I believe that's the issue/fix Barry is referring to.


junwang-noaa commented 1 year ago

Since the 3-hourly frequency is hardcoded in GOCART, some code changes are required on the GOCART side to allow this capability. I will close the issue at this time.

mathomp4 commented 1 year ago

Since the 3-hourly frequency is hardcoded in GOCART, some code changes are required on the GOCART side to allow this capability. I will close the issue at this time.

@junwang-noaa I think this was fixed by @bena-nasa in https://github.com/GEOS-ESM/GOCART/pull/224 (or at least partially)? This PR got into GOCART v2.2.0

junwang-noaa commented 1 year ago

@mathomp4 That is great! Currently we have a PR with GOCART pointing to the develop branch as of 5/4 ("Ensure GOCART2G can run without the NI component"). Do we need to make additional changes in the GOCART configuration when switching to GOCART v2.2.0?

mathomp4 commented 1 year ago

@junwang-noaa what hash are you pointing to? I can look at what's different.

Also, I suppose I'd say use v2.2.1 as that has a bug fix on 2.2.0.

junwang-noaa commented 1 year ago

It is this version.

mathomp4 commented 1 year ago

Okay. So v2.1.4 essentially. I think you should be able to go to v2.2.1 without any big issues that I can see (famous last words).

junwang-noaa commented 1 year ago

Thanks for checking. I will update and test, will let you know if I run into any issues.

zach1221 commented 7 months ago

Thanks for checking. I will update and test, will let you know if I run into any issues.

Hi, @junwang-noaa . Can this issue be closed or is there further work required here?

junwang-noaa commented 7 months ago

I haven't had a chance to finalize it. Will EPIC test it?

zach1221 commented 7 months ago

I haven't had a chance to finalize it. Will EPIC test it?

Yes, I can test.