ESCOMP / CESM

The Community Earth System Model
http://www.cesm.ucar.edu/

Advice wanted: DATM variables needed to force all other components #71

Closed kdraeder closed 6 years ago

kdraeder commented 6 years ago

As part of an NSC proposal to generate CAM6 ensemble re-analysis forcing for all of the other CESM2 components, I need to figure out which variables should be written by the CAM assimilation into cpl history files. This will be a 10-20 year series, saving every 3 hours, so I should pare the list to the minimum, but complete, set. I need to run some tests before writing the proposal, which is due in mid-September, so I need to figure this out soon, in order to calculate the data volume.
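
A rough sketch of the data-volume arithmetic for a series like this. The per-sample file size and ensemble size used in the usage note are hypothetical placeholders, not numbers from this thread, and a 365-day no-leap calendar is assumed.

```python
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365  # assumption: no-leap calendar


def n_samples(years, interval_hours):
    """Number of output times in `years` years, one every `interval_hours` hours."""
    return years * DAYS_PER_YEAR * (HOURS_PER_DAY // interval_hours)


def volume_tb(years, interval_hours, mb_per_sample, n_members=1):
    """Total volume in TB (taken as 10**6 MB) for a hypothetical per-sample size."""
    return n_samples(years, interval_hours) * mb_per_sample * n_members / 1.0e6


for years in (10, 20):
    print(years, "years at 3-hourly output:", n_samples(years, 3), "samples per member")
```

For example, `volume_tb(20, 3, mb_per_sample=10.0, n_members=80)` would give the total for made-up values of 10 MB per sample and 80 members; substitute sizes measured from a test run.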

My understanding of the DATM mode in CESM2 is that each component might want a distinct set of variables, parcelled out into 1 or more streams. The variables in the streams might be derived from variables in the cpl history files. We won't calculate those, but want to provide the input variables.

The 2 degree, CAM4, forcing files we've been using for CLM, POP, and CICE have a minimal set of variables:
a2x_Faxa_{rain,snow,lw,sw}*
a2x_Sa_{dens,pbot,pslv,ptem,shum,tbot,topo,u,v,z}

At the other end of the spectrum, @mvertens recommended saving instantaneous fields, which I interpreted to mean
./xmlchange HIST_OPTION='nhours'
./xmlchange HIST_N=3
which generates a cpl.hi file every 3 hours with ~450 variables:
a2x_* i2x_* l2x_* o2x_* r2x_* x2a_* x2i_* x2l_* x2r_* x2oacc_* xaoa_* xaoo_* dom[ailor]_* frac[aior]_*

I'm guessing and hoping that I don't need all of those in CESM2, but I have questions about, for example, running an ocean (+CICE?) only case, which should(?) be forced by river run-off, which comes from CLM, which is not represented by just a2x fields. Would I need to keep the l2x and r2x fields? Only some of the variables?

@ekluzek says that for forcing CLM I just want histaux_a2x3hr = .true., which "turns on coupler history stream for 3-hour average atm to coupler fields" according to the cesm1.2 web site. This generated a cpl.ha2x3h file, with the variables in the CAM4 set, above, plus 3 more variables: a2x_Sa_{co2,topo}*. But I don't know if this is sufficient in CESM2 for all the other components. And are averaged fields desirable for other components?

Erik also pointed me to a recent CLM spinup case, which used files in /glade/p/cesm/bgcwg_dev/forcing/b.e20.B1850.f09_g17.pi_control.all.297. They have the same variables as the histaux_a2x3hr = .true. case, plus 14 a2x_Faxa_{bc,dst,oc}* variables. I haven't figured out how the additional variables were added, or whether they're actually needed.

I dug up a POP case and see some correspondence between the variables in its stream files and those in the CLM stream files, but I can't tell whether all the POP variables can be derived from the variables needed by CLM.

I haven't looked into the river, land ice, and wave models. I hope that I can define a sufficient set of variables without doing that.

Thanks for any (more) guidance on any of this. Kevin

billsacks commented 6 years ago

@kdraeder If I understand this correctly, I think that this list of variables would be similar to the list needed to force the J1850G compset. That compset has a datm but all active surface components. It was set up for the sake of spinning up CISM, by going back and forth between (a) a B compset that produces cpl hist output from atm to cpl, and (b) this J compset that cycles over this cpl hist output in datm, running all surface components.

@lofverstrom or @JeremyFyke - I forget what output you get from the coupler for this (i.e., what namelist flags you turn on). Can you please let us know?

lofverstrom commented 6 years ago

These namelist flags are turned on in the cpl namelist (in a B compset) to produce the atm forcing data for the JG simulation.

cat >> user_nl_cpl <<EOF
histaux_a2x3hr = .true.
histaux_a2x24hr = .true.
histaux_a2x1hri = .true.
histaux_a2x1hr = .true.
EOF

kdraeder commented 6 years ago

@billsacks @lofverstrom This looks like just what I need to know! If we can't afford the hourly output, can the components make do with the 3-hourly and daily? Thanks!

billsacks commented 6 years ago

@kdraeder these are the six fields that are written hourly (four instantaneous in a2x1hri, two averaged in a2x1hr):

  character(CL) :: hist_a2x1hri_flds = &
       'Faxa_swndr:Faxa_swvdr:Faxa_swndf:Faxa_swvdf'

  character(CL) :: hist_a2x1hr_flds  = &
       'Sa_u:Sa_v'

If I remember correctly, POP folks are the ones who suggest using those rather than 3-hourly. You may want to talk to @klindsay28 about this if you haven't already.

ekluzek commented 6 years ago

I talked with @klindsay28 and this is the user_nl_cpl that was used to generate the most recent CPLHIST data:

histaux_a2x24hr = .true.
histaux_a2x3hr = .true.
histaux_a2x1hri= .true.
histaux_a2x1hr = .true.
histaux_r2x = .true.

That list is the same as @lofverstrom above, with the exception that it also includes histaux_r2x, which would be needed if you want to force a POP/CICE configuration, but NOT for CLM standalone.

mvertens commented 6 years ago

@kdraeder - sorry if I confused the issue. I think you should use Erik's settings.

klindsay28 commented 6 years ago

I think the answer to the question "What forcings should I generate" depends on the use cases for the forcings.

I came up with the settings mentioned by @lofverstrom and @ekluzek. The use case for the forcings that led to those choices was ocean and land BGC spinup. For the ocean BGC spinup, I wanted the ocean circulation in the forced run to be very close to the circulation of the coupled model. Based on some experimentation, I have found that having some forcings at a higher temporal frequency reduces the mismatch between the forced run using this forcing and the coupled run generating the forcing. For instance, lower frequency winds (more temporal smoothing) tend to clip the high winds. This reduces the energy going into the ocean and leads to shallower mixed layers, which impacts BGC. So I opted to generate bottom winds more frequently than 3-hourly.
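
The clipping effect can be illustrated with a toy calculation. This is only a sketch: the wind values are synthetic, and the mean of speed squared is used as a stand-in for wind stress (an assumption based on the usual quadratic bulk formula with a constant drag coefficient), not anything computed by POP or the coupler.

```python
def block_mean(values, n):
    """Average consecutive groups of n values (len(values) must be a multiple of n)."""
    return [sum(values[i:i + n]) / n for i in range(0, len(values), n)]


# Synthetic hourly wind speeds (m/s) for one day: steady 6 m/s with a one-hour burst.
hourly = [6.0] * 24
hourly[13] = 18.0

three_hourly = block_mean(hourly, 3)

# Proxy for the momentum flux into the ocean: time mean of speed squared.
stress_hourly = sum(u ** 2 for u in hourly) / len(hourly)
stress_3h = sum(u ** 2 for u in three_hourly) / len(three_hourly)

print("peak wind:", max(hourly), "vs", max(three_hourly))
print("mean u^2 :", stress_hourly, "vs", stress_3h)
```

The 3-hourly series keeps the same mean wind but lowers both the peak and the mean of the squared speed, which is the direction of the bias described above.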

I suspect that the use case(s) for the forcing you are generating is different. So it might not make sense to use the same settings that I developed. It might be the case that 3-hourly averages are reasonable. I'll point out that this is the frequency of all forcings used in our ocean/ice hindcast forcing. But I am not clear on the use cases you have in mind.

kdraeder commented 6 years ago

@klindsay28 Those are excellent points, which clarify why I'm trying to be inclusive when choosing which forcing variables and frequencies to save, while minimizing the data volume. I'm in the position of not knowing all of the use cases for which I'd like this data set to be useful.
But your use case and the CLM spinup are a great place to start, and may be sufficient.
Am I right in thinking that the BGC spinup is probably one of the more demanding use cases?
Are you aware of other, equally demanding use cases that required different high frequency variables?

The data volume of the cpl history files, even including the hourly forcing, looks manageable. I need to figure out whether the variables at different frequencies are redundant: does a2x1h_Sa_u essentially contain all of the information that's in a2x3h_Sa_u? Are any components unable to use hourly data, so the redundant 3-hourly should be included?
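
On the redundancy question, a minimal sketch of the arithmetic: if both streams are plain time averages over aligned, equal-length windows (an assumption, not something confirmed in this thread), the 3-hourly averages can be rebuilt exactly from the hourly ones. The values below are made up.

```python
def coarsen(hourly_means, n=3):
    """Combine equal-length hourly averages into n-hourly averages."""
    if len(hourly_means) % n:
        raise ValueError("need a whole number of n-hour blocks")
    return [sum(hourly_means[i:i + n]) / n
            for i in range(0, len(hourly_means), n)]


u_1h = [2.0, 4.0, 6.0, 1.0, 1.0, 4.0]  # hypothetical hourly-mean Sa_u values
u_3h = coarsen(u_1h)                    # two 3-hour averages
print(u_3h)
```

The same does not hold for instantaneous hourly fields, whose 3-sample mean only approximates a true 3-hour average, and it says nothing about whether a given component can read hourly streams.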

Thanks again for the advice!

kdraeder commented 6 years ago

I've run into one other wrinkle, which I need to understand better, and maybe fix. When I use histaux_r2x = .true. I get the cpl .hr2x. files only at hour 00, even though my forecasts end at hours 06, 12, 18, and 00.
Am I missing a way to write the hr2x files from every forecast?

This river forcing is described as instantaneous, but the time that's associated with the data in that file is half way between hour 18 and 00. If it had been a day-long forecast, the time would have been hour 12 (half way between hours 00 of the initial and final dates). In both cases the file name has ...00000.nc in it, so it looks like the data is for the day boundary, but comes from(?) different times of day. Do I need to worry about this?
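
The two time stamps described above are consistent with the file time being the midpoint of the averaging window. A small sketch, using hypothetical dates:

```python
from datetime import datetime


def midpoint(start, end):
    """Time half way between the start and end of an averaging window."""
    return start + (end - start) / 2


# 6-hour forecast ending at the day boundary (dates are placeholders).
six_hour = midpoint(datetime(2018, 8, 14, 18), datetime(2018, 8, 15, 0))
# Day-long forecast ending at the same day boundary.
one_day = midpoint(datetime(2018, 8, 14, 0), datetime(2018, 8, 15, 0))

print(six_hour)
print(one_day)
```

Both windows end at the same day boundary, so both files would carry 00000 in the name, while the data represent hour 21 and hour 12 respectively.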

billsacks commented 6 years ago

Yes, I think that's a problem. According to this code block:

https://github.com/ESMCI/cime/blob/c4f02062ee6cd0fb1e671513c28eb07582f7d4ed/src/drivers/mct/main/cime_comp_mod.F90#L3054-L3063

it looks like histaux_r2x is hard-wired to just write output every 24 hours. Are you restarting the model every 6 hours? If so, an important point is that these histaux files do not restart properly - so I think what you'd get is r2x files written every 24 hours, but just with averages from the last 6 hours (i.e., since the last restart).

One possible fix would be to write these files every 6 hours rather than every 24 hours. I'm not sure what implications that would have.

kdraeder commented 6 years ago

Thanks @billsacks Each forecast is a 'startup', and I don't anticipate that these files will be needed in a 'restart' context.

The documentation of histaux_r2x says (http://www.cesm.ucar.edu/models/cesm2/component_settings/drv_nml.html) "turns on coupler history stream for instantaneous runoff to coupler fields." If it actually is instantaneous data, then it seems like hour 21 (from my 6 hour forecast) would be just as valid as hour 12 (the default), so my once-per-day data would be good enough.
But if the coupler or the ocean makes an assumption that it is hour 12 data, that would introduce a bias that I probably don't want.

If it actually writes out averaged data, the 6 hour forecast forcing will definitely be biased compared to the 24 hour forecast.

billsacks commented 6 years ago

I'm pretty sure that documentation is wrong, and that histaux_r2x is actually 24-hour-average fields.

kdraeder commented 6 years ago

@klindsay28 confirmed that it writes average files from the forecast: 24-hour if the forecast is that long or longer, or only the length of the forecast, if less than 24 hours. So I'll need/want some sort of fix before using histaux_r2x in a data assimilation context.

If a file is written at the end of every forecast, I can average them once/day. That looks much easier than making the restart of histaux_r2x files work correctly (@billsacks thinks this is broken). The file names would need to have SSSSS added to the date stamp YYYY-MM-DD.
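
A sketch of that post-processing idea, with hypothetical helper names: build a YYYY-MM-DD-SSSSS stamp (SSSSS being seconds since midnight) and combine per-forecast averages into a daily average weighted by forecast length.

```python
def date_stamp(year, month, day, hour=0, minute=0, second=0):
    """Return YYYY-MM-DD-SSSSS, where SSSSS is seconds since midnight."""
    sssss = hour * 3600 + minute * 60 + second
    return f"{year:04d}-{month:02d}-{day:02d}-{sssss:05d}"


def daily_mean(segment_means, segment_hours):
    """Duration-weighted mean of the per-forecast averages that cover one day."""
    total_hours = sum(segment_hours)
    weighted = sum(m * h for m, h in zip(segment_means, segment_hours))
    return weighted / total_hours


print(date_stamp(2018, 8, 14, hour=6))
print(daily_mean([1.0, 2.0, 3.0, 6.0], [6, 6, 6, 6]))
```

With four equal 6-hour forecasts the weights cancel and this reduces to a simple mean; the weighting only matters if forecast lengths differ.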

kdraeder commented 6 years ago

@billsacks wrote

One possible fix would be to write these files every 6 hours rather than every 24 hours. I'm not sure what implications that would have.

I tested replacing write_now=t24hr_alarm with write_now=t6hr_alarm in the src.drv/cime_comp_mod.F90 call to seq_hist_writeaux(...'r2x'...). It wrote a file every 6 hours, as hoped. I had to change the file names to include seconds. So what seems to be needed here is a way to set write_now = min(forecast_length,24_hours). It might take me a lot of digging to figure out how to access the forecast length in this module, so I'm hoping that someone can at least point me to the right variable(s) and mechanism for importing them.

I don't have permission to assign this issue to anyone, or give it labels, so it would be great if someone would do that.

Should I open a new issue about the frequency of r2x history writes?

In CIME or ESCOMP?

Does this look possible by the CESM2_1 tag/release?

Thanks, Kevin

billsacks commented 6 years ago

@kdraeder what you're asking for feels like it warrants a CIME issue. I can't address whether this is feasible for CESM2.1, but maybe @mvertens or @jedwards4b could.

kdraeder commented 6 years ago

I opened CIME #2831 and provided a fix in CIME #2832. The hr2x files are now written at time-of-day = 00000 and at the end of the forecast, no matter what time it is. The averaging is over whatever time span is in each file; 1 day for all but the last (partial) day, and the partial day for the last file (which could be 24 h).
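
As a reading aid, the write schedule described here can be sketched as follows. This is an illustration of the behaviour as stated in this comment, not the code in the CIME pull request; the function name is hypothetical and times are whole hours counted from midnight of the start day.

```python
def r2x_writes(start_hour, length_hours):
    """Return (write_time, averaging_span) pairs, in hours since midnight of the start day."""
    end = start_hour + length_hours
    writes = []
    last = start_hour
    t = (start_hour // 24 + 1) * 24  # first day boundary after the start
    while t < end:
        writes.append((t, t - last))  # file at time-of-day 00000
        last = t
        t += 24
    writes.append((end, end - last))  # final file at the end of the forecast
    return writes


print(r2x_writes(18, 6))   # 6-hour forecast ending at a day boundary
print(r2x_writes(0, 30))   # one full day plus a 6-hour partial day
```

In this sketch a forecast that starts mid-day gets a shorter span in its first file as well, which is one more reason to read the averaging window from each file rather than assume 24 hours.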

billsacks commented 6 years ago

Thanks, @kdraeder . I've lost track of what remains in this issue; can this escomp/cesm issue be closed now?

kdraeder commented 6 years ago

There's still inconsistency between the filetype of the cpl.hr2x filenames and the contents, which are averaged. But my understanding is that that will be resolved later in a way that's least disruptive to users. Other than that, the problems I ran into have been resolved. Thanks for all the help!