In coupled configs, SW exported by ATM has odd time-dependence

DeniseWorthen commented 1 year ago

Description

The SW direct components exported by FV3 have odd "blips" over the course of a day. This is shown by setting mediator history files to write the instantaneous fields every time through the coupling loop.

Note that "instantaneous" in this case refers to the type of mediator history file that CMEPS is writing out and not the type of field (instantaneous or mean) field that ATM is providing.

For the C96 case on Tile 1, at i=50,j=70 (6E, 0.4N) the visible direct SW component is shown below (near ir shows similar). Every instantaneous value is shown, with values on the hours called out.

The visible diffuse component shows

I do not believe this is due to any cloud interaction, since the blips occur on the hour and are nearly symmetric during the diurnal cycle.

To Reproduce:

Mediator history files for the atmosphere can be added using the following in MED_attributes in nems.configure:

      history_tile_atm = 96
      history_n_atm_inst = 1
      history_option_atm_inst = nsteps

Files are produced for every coupling timestep and can be cat'd together to form a timeseries using nco. The SW fields imported by the mediator are named

near-ir diffuse: atmImp_Faxa_swndf visible diffuse: atmImp_Faxa_swvdf near-ir direct: atmImp_Faxa_swndr visible direct: atmImp_Faxa_swvdr

Additional context

Output

yangfanglin commented 1 year ago

@DeniseWorthen Denise, are these visible direct and diffuse downward SW fluxes at the surface ? Would be interesting to see if the current operational model (GFSv16) has the same issue. @Qingfu-Liu @dustinswales

DeniseWorthen commented 1 year ago

Yes, these are down, not net sw components (or they should be). They are in terms of the export fields

  ! MEAN sfc downward nir direct flux (W/m**2)
  case ('mean_down_sw_ir_dir_flx')
   call block_data_copy(datar82d, GFS_data(nb)%coupling%dnirbm_cpl, Atm_block, nb, scale_factor=rtime, rc=localrc)

I did a test swapping these with the "instantaneous" values GFS_data(nb)%coupling%dnirbmi_cpl and removing rtime and got an identical result.

yangfanglin commented 1 year ago

@DeniseWorthen I understand solar zenith is not a standard output, would it be possible to write out solar zenith angle at each time step to a text file and inspect if there is any odd features. A short run for 24 hours should suffice. I understand you are busy. Let me know if this will take too much of your time. I can ask physics developers to check. Thanks for uncovering this "bug".

DeniseWorthen commented 1 year ago

@yangfanglin If I knew how to translate the i,j location on a tile to the loop in the code where it is calculating cosz then I could do that. I think even just the single location above would show if that is the issue. Is that easy to do?

yangfanglin commented 1 year ago

@DeniseWorthen I think you can pick a random single point (i,j) in the cosz loop and print out the values, assuming you will have the luck of not picking a polar night point.

junwang-noaa commented 1 year ago

l think we have looked the zenith angle issue with the inst sw radiation and fixed it in GFSv16. I checked the code updates on zenith angle, this one comes out:

https://github.com/ufs-community/ccpp-physics/commit/7644b0c46f17c97a865b5e4231a29970d6bbd8ae#diff-18710705427e7b03b02adfa6bc3969fa8a5484bd18d6c08192fe7001e489f885R890

@grantfirl Could this be related to the zenith angle change Denise showed?

yangfanglin commented 1 year ago

@junwang-noaa See slide 5 of this ppt for the bug fix we made to IPD physics in 2019. The pattern looks to be slightly different from what Denise showed.

The change made to https://github.com/ufs-community/ccpp-physics/commit/7644b0c46f17c97a865b5e4231a29970d6bbd8ae#diff-18710705427e7b03b02adfa6bc3969fa8a5484bd18d6c08192fe7001e489f885R890 does beg for questions. coszen=0 is physical.

DeniseWorthen commented 1 year ago

That figure looks to be the combined diffuse + direct, so to me it looks like that is the basic same pattern.

junwang-noaa commented 1 year ago

Thanks, Fanglin. What I mean is that the bug fix in GFSv16 resolved the zigzag pattern shown in sw flux. The fixes in GFSv16 are in the current develop branch. I am trying to find changes coming to the model since GFSv16 that may cause the issue in the develop branch now.

DeniseWorthen commented 1 year ago

I was able to produce a cosz plot using a trick to substitute the value in the dnirbm_cpl array

diff --git a/physics/GFS_surface_generic_post.F90 b/physics/GFS_surface_generic_post.F90
index 76d3f570..c7a732cd 100644
--- a/physics/GFS_surface_generic_post.F90
+++ b/physics/GFS_surface_generic_post.F90
@@ -143,7 +143,12 @@
             dnirdfi_cpl (i) = adjnirdfd(i)
             dvisbmi_cpl (i) = adjvisbmd(i)
             dvisdfi_cpl (i) = adjvisdfd(i)
-            dnirbm_cpl  (i) = dnirbm_cpl(i) + adjnirbmd(i)*dtf
+            !dnirbm_cpl  (i) = dnirbm_cpl(i) + adjnirbmd(i)*dtf
+!            if (wet(i)) then                    ! some open water
+!  ---  compute open water albedo
+              xcosz_loc = max( zero, min( one, xcosz(i) ))
+              dnirbm_cpl  (i) = xcosz_loc
+!            endif

I'm not sure of the units, but a plot of the values makes sense in terms of shape for it to be cosz in the export field (I mulitiplied by 1e3 in this plot)

Plotting this field as a time series at the same i=50,j=70 on tile 1 gives

So I don't think it is a cosz issue.

Also, the original point I plotted on tile1 turns out to be over land. But an adjacent point on the ocean (i=48,j=70) shows the same zig-zag effect

junwang-noaa commented 1 year ago

@DeniseWorthen Is the zenith angle plot from every time step? The sw is called every hour, and the radiation heating is then adjusted at every time step with the zenith angle. From the first two plots in the issue, it looks to me the sw radiation flux values coming out of sw radiation are correct, but the value interpolated to each time step is not. Can you print out each time step value and the hourly values? Also can you make a run without the change in:

https://github.com/ufs-community/ccpp-physics/commit/7644b0c46f17c97a865b5e4231a29970d6bbd8ae#diff-18710705427e7b03b02adfa6bc3969fa8a5484bd18d6c08192fe7001e489f885R890

to see if there is any impact on xcosz?

DeniseWorthen commented 1 year ago

@junwang-noaa This is the cosz every time the ATM passes fields to the mediator. In the C96 configuration, that is every 720s.

I wasn't sure if radiation was calculated every hour or every 30mins, but I also had the same thought. The actual calculation of the radiation on the hour appears correct, but the interpolation to each time step is not.

I'll try to make the test you suggest.

junwang-noaa commented 1 year ago

I just checked the code, the change will impact coszen (average of cosine of zenith angle over daytime shortwave call time interval), not the xcosz, but it will impact the interpolation factor xmu. So you may want to plot a SW field such as the visible direct SW to see the impact.

DeniseWorthen commented 1 year ago

@junwang-noaa I made the following change to radiation_astronomy.f

!  --- ...  compute time averages

      do i = 1, IM
        coszdg(i) = coszen(i) * rstp
        if (istsun(i) > 0) coszen(i) = coszen(i) / istsun(i)
        !if (istsun(i) > 0 .and. coszen(i) /= 0.0_kind_phys) then
        !  coszen(i) = coszen(i) / istsun(i)
        !endif
      enddo

But I still see the same effect as on the original plots (both direct and diffuse)

junwang-noaa commented 1 year ago

The plots you show are from C96 or C384?

DeniseWorthen commented 1 year ago

C96.

Qingfu-Liu commented 1 year ago

I am not sure following test can fix the problem, but I think you can try it. This problem you showed might only impact this low resolution run. I have the following reasoning: In the code dcyc2t3.f line 285-286: tem1 = fhswr / deltim nstp = max(6, nint(tem1)) In this low resolution run: tem1 = fhswr / deltim =3600/720=5 If nint(tem1) is less than 6, the xcosz(i) may not calculate correctly. You can change: nstp = max(5, nint(tem1)) Try to run it again. Not sure if this can fix this problem.

DeniseWorthen commented 1 year ago

@Qingfu-Liu Would it also work to switch dt_atmos=600 to test your dcyc2t3 change? I can do at run time w/o recompile.

Qingfu-Liu commented 1 year ago

Yes. It should work

yangfanglin commented 1 year ago

@Qingfu-Liu what happens if dt_atmos is 1800s ?

Qingfu-Liu commented 1 year ago

@yangfanglin I looked the code again, I am not sure if the calculations of nstp and nstl cause this problem. But the problem related to the calculation of xcosz(i) and coszen(i)

junwang-noaa commented 1 year ago

I was looking at the same code as Qingfu. I think the problem is coszen, which uses at least the 6 substeps within a SW call interval.

DeniseWorthen commented 1 year ago

Setting dt_atmos=600 (and also dt for ice and updating the coupling freq) results in the same odd pattern of both diffuse and direct. Diffuse N-IR shown here:

Qingfu-Liu commented 1 year ago

In our case, nstl=1, so only the first part of the code is used: if (nstl == 1) then cns = pid12 (solhr + deltimf7200 - hour12) + slag do i = 1, IM xcosz(i) = sdecsinlat(i) + cdeccoslat(i)*cos(cns+xlon(i)) enddo elseif (nstl == nstp) then

Qingfu-Liu commented 1 year ago

I am curious why the visible direct SW component and visible diffuse SW component show two different patterns? They are supposed to be multipled by the same weight

DeniseWorthen commented 1 year ago

@Qingfu-Liu That's a good question. I've shown just the visible, but the n-ir show the same difference between dir and dif.

Qingfu-Liu commented 1 year ago

So I think the problem may not come from the weighting xmu(i). I searched two variables sfcnirdfd and sfcvisbmd, there are no other physics programs using them (except in program dcyc2t3.f). Those two variables are carried around from other part of the code, and might be changed between the dynamic steps.

junwang-noaa commented 1 year ago

Do we see the same issue on higher resolution test? Does this also show up in model history tile file field such as sfc down sw?

Qingfu-Liu commented 1 year ago

GFS_surface_generic_post.F90

@DeniseWorthen You are running a couple test (the code you changed need to set the parameter cplflx=.true.)? Can you run an uncoupled test (set the parameter cplflx=.false.) to see if we still have the same problem? (I am still learning how to output those results you showed here)

DeniseWorthen commented 1 year ago

@Qingfu-Liu I have been working with @junwang-noaa off-line and I have a control_p8 test which shows the behavior in the sfc files.

The run is /scratch1/NCEPDEV/stmp2/Denise.Worthen/FV3_RT/rt_31222/cp8tiled. In this case, I've made a couple of changes from default:

write_dopost:            .false.
output_grid:             'cubed_sphere_grid'
output_fh:               0.2 -1

I've catted the tile4 files together using NCO: ncrcat sfcf*.tile4.nc sfc.tile4.nc

At i=49,j=90 (209.8E, -0.4S) I can plot the tcdc_aveclm along w/ some of the SW components. I add 100 to the tcdc_aveclm to make it show up on the plot

Just looking at the diffuse visible component and marking each hour:

Qingfu-Liu commented 1 year ago

@DeniseWorthen Thank you very much. Great to know about this

yangfanglin commented 1 year ago

@DeniseWorthen @junwang-noaa In the latest plot, is the purple line instantaneous total SW flux and the others time averaged ? How to read this plot ? Does it mean the instantaneous SW is fine, and the time averaged fluxes have issues ? Since there is a bucket for flux accumulation, conversions are needed to get correct "mean" flux for a given time period.

Can we also conclude C384 does not have the zigzag issue found instantaneous fluxes shown in C96 runs ?

DeniseWorthen commented 1 year ago

I have not run the c384 case, but I can try with either the benchmark or the control_c384 test. I'm not sure if the control_c384 test is using the same physics schemes as the p8 tests.

If I interpret the surface field in the file dwsrf as the instantaneous, then I think the blue (purple?) line shows that it does not have the issue, and the _ave fields do. I don't think we have component instantaneous values in the sfc files, but since the total is just the sum of the 4, I think we'd see it in dswrf if they were there.

dustinswales commented 1 year ago

@DeniseWorthen I haven't had a chance to take a deep look at this, but it sure seems like that there is something wrong with bucket variables, since the instantaneous sw fluxes are as expected. These are accumulated in GFS_rrtmg_post.F90, using the instantaneous values. I will trace the interstitial with the radiation diag fluxes and try to see where things are going wrong.

junwang-noaa commented 1 year ago

@yangfanglin I want to add that we are looking at the clear sky sw (TCDC=0), so my understanding is that we expect the mean field to increase in the morning. Also, we have fhzero=6, the averaged field (DSWRF_AVE) does not look like the average of inst DSWRF at time steps within fh: 18-19,

Qingfu-Liu commented 1 year ago

I think I understand what cause the "blips" in the sw dnward flux. The radiation scheme calculated the radiation flux every hours (fhswr=3600s) in my test, and is interpolated to the dynamic time step (deltim=720s). The 'interpolation" is in fact "extrapolation" which has large errors. I made small changes to the following code (dcyc2t3.f and rrtmg_sw_post.F90 and their associated meta files) so that I can print out the data at particular time and location (5.9E, 0.5N). From the printout, there is no changes in the hourly value. The SW Downward Flux from the radiation code at every hour is correctly passed to the code dcyc2t3.f, but the extrapolation produce "blips" in the SW downwars flux at the dynamic time steps.
I collect some data from the printout, and plotted the variable "adjvisdfd". Here is the results:

Book1.pdf

I want to clarify that the test is an uncoupled run. I will run a coupled case late

junwang-noaa commented 1 year ago

@Qingfu-Liu can you also print the xmu(i) in dcyc2t3.f to the confirm it's the extrapolation that causes the issue. A related question is that why dswrf does not have the "blips" while it is also interpolated using the same weights?

You can also try to set dt_atms:600 and output_fh:0.166667 -1 to test with interpolation.

Qingfu-Liu commented 1 year ago

@junwang-noaa I did printout the xmu(i), and I think the xmu(i) has problem. I will rerun the tests using 600s dynamic time step

Qingfu-Liu commented 1 year ago

@junwang-noaa Your question: "A related question is that why dswrf does not have the "blips" while it is also interpolated using the same weights?" I only printout the values inside the code: dcyc2t3.f and rrtmg_sw_post.F90, not sure what happens after the variables outside the code dcyc2t3.f

Qingfu-Liu commented 1 year ago

I rerun the uncoupled case using dt_atmos=600s, the results are similar to the run with dt_atmos=720s. I also run the coupled case with dt_atmos=720s, conclusion is the same. But I can't run the coupled case with dt_atmos=600s. The run failed

DeniseWorthen commented 1 year ago

@Qingfu-Liu I've wondered how the interpolation gets done. You are calling it extrapolation. I had assumed that what was happening is that the swrad at the hour would be scaled somehow by the solar zenith angle change. So at +720s, the cosz would be found and the swrad at the previous hour would be scaled (somehow) using some ratio of the cosz at the two times. Is that how it works?

DeniseWorthen commented 1 year ago

I rerun the uncoupled case using dt_atmos=600s, the results are similar to the run with dt_atmos=700s. I also run the coupled case with dt_atmos=700s, conclusion is the same. But I can't run the coupled case with dt_atmos=600s. The run failed

Since we see the issue in the standalone, I'm not sure you need to try the coupled case. There are two places you'd need to change the timestep (ice_in and nems.configure) in addition to the atm changes.

Qingfu-Liu commented 1 year ago

@Qingfu-Liu I've wondered how the interpolation gets done. You are calling it extrapolation. I had assumed that what was happening is that the swrad at the hour would be scaled somehow by the solar zenith angle change. So at +720s, the cosz would be found and the swrad at the previous hour would be scaled (somehow) using some ratio of the cosz at the two times. Is that how it works?

Yes, that is how it works. For example, the xmu(i) has the following values between solhr=7h and solhr=8h

solhr=7.0h, xmu(i)=0.7887 solhr=7.2h, xmu(i)=0.8970 solhr=7.4h, xmu(i)=1.0028 solhr=7.6h, xmu(i)=1.1058 solhr=7.8h, xmu(i)=1.2059 solhr=8.0h, xmu(i)=0.8796

So at each radiation time step, the radiation flux passed to the coupling is actually reduced when sun is rising.

junwang-noaa commented 1 year ago

@Qingfu-Liu Is this problematic? Since the radiation is called at every hour, is it correct we should have weight 1.0 at solhr=7hr/8hr? Should the weights be always >1.0?

Qingfu-Liu commented 1 year ago

@junwang-noaa I thought about this, but I do not have a better extrapolation method.

Qingfu-Liu commented 1 year ago

@junwang-noaa @DeniseWorthen I do not have a better extrapolation method to calculate the downward shortwave fluxes at each dynamic time step. In the current extrapolation method, there is a constraint that "sum adjvisdfd (for all time steps)=int(fhswr / deltim) sfcvisdfd", this is better than keeping adjvisdfd to be a constant (which has the same constraint). In the current extrapolation, sum the donwward SW flux over dynamic time step, we have (note that i is horizontal dimension), sum (adjvisdfd(i) ) = sum (sfcvisdfd(i) xmu(i) ) = sfcvisdfd(i) sum (xmu(i)) = sfcvisdfd(i) int(fhswr / deltim) In the second method that we can keep xmu(i)=1 which is worse than the current method. In order to reduce the "blips", the radiation time step can be reduced. Anyway, to improve the radiation calculation at each dynamic steps, more research is needed.

Qingfu-Liu commented 1 year ago

@junwang-noaa @DeniseWorthen I searched online and found a paper: Manners, J., Thelen, J.-C., Petch, J., Hill, P. and Edwards, J. M. (2009). Two fast radiative transfer methods to improve the temporal sampling of clouds in numerical weather prediction and climate models. Q. J. R. Meteorol. Soc., 135, 457–468. The paper described a method to correct the downward SW surface flux for UK MET model. I have not read it in detail yet. I have an upcoming trip to Denver from June 9-18. I will read the paper in detail once I have time

yangfanglin commented 1 year ago

@Qingfu-Liu @junwang-noaa @DeniseWorthen I am a bit lost now. Is there a bug or is the current solution not accurate ? If it is a bug, we need to fix it the sooner the better. If it is not accurate, we can take the time to improve it. @dustinswales Dustin, when you get a chance please take a look at this issue as well. Thanks.

DeniseWorthen commented 1 year ago

I was also looking at this paper which seems somewhat related (?) since it is talking about how to estimate cosz between radiation timesteps.

Qingfu-Liu commented 1 year ago

@yangfanglin This is not a bug, the method we currently using is not accurate

ufs-community / ufs-weather-model