aekiss commented 4 years ago

I'd like to settle on a standard diag_table for the default configurations at each resolution.

Some basic objectives and constraints:

identical diag_table for IAF and RYF at each resolution
as close to identical as possible across resolutions to facilitate resolution-dependence studies
one file per 2d or 3d field, with descriptive filename including date e.g. ocean-3d-temp-1-monthly-mean-ym_1959_01.nc, as discussed in https://github.com/COSIMA/access-om2/issues/185; scalar fields share one file to reduce file count
3d field sampling: annual at 1 and 0.25 degrees (should this be monthly, at least for some fields?), monthly at 0.1 deg (reduced to 3-monthly in 0.1deg spinup); NB: annual sampling is impossible at 0.1 deg because these runs are 3 months long
2d field sampling: mostly monthly at all resolutions, but a few 2d fields have daily sampling
files produced annually at 1 and 0.25 degrees and 3-monthly at 0.1 degrees (except for daily 3d data, which have one file per month to keep the size reasonable)
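The filename pattern in the example above could be sketched as follows. This is only an illustration that reproduces the one example filename given here; the function name and all argument names are hypothetical, the meaning of the "ym" tag is not interpreted, and the real naming rules are whatever make_diag_table implements.

```python
def diag_filename(model, ndims, field, interval, unit, reduction, date_tag, year, month):
    """Compose an output filename following the pattern of the example in this thread.

    All argument names are hypothetical; they simply label the dash-separated
    pieces of 'ocean-3d-temp-1-monthly-mean-ym_1959_01.nc'.
    """
    stem = "-".join(
        [model, f"{ndims}d", field, str(interval), unit, reduction, date_tag]
    )
    # Date suffix: four-digit year and two-digit month, underscore-separated.
    return f"{stem}_{year:04d}_{month:02d}.nc"


example = diag_filename("ocean", 3, "temp", 1, "monthly", "mean", "ym", 1959, 1)
print(example)
```

Keeping the field name, sampling interval and reduction in the filename is what makes it practical to have one variable per file and still find or delete outputs selectively.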

I'm defining these via diag_table_source.yaml files using make_diag_table. The outputs are specified in the diag_table section of the yaml file.

Here are the files I'm proposing we use:

1 deg: diag_table_source.yaml and resulting diag_table
0.25 deg (same as 1 deg): diag_table_source.yaml and resulting diag_table
0.1 deg: diag_table_source.yaml and resulting diag_table

Some questions:

1. would monthly 3d outputs be preferable to annual at 1deg and/or 0.25 deg? At least for some fields that might be used for seasonal studies?
2. I've added some squared fields (u, v, eta_t) to support calculating EKE and SSH variance, including all timescales. This may be less useful than it seems as it will include very short-timescale (non-geostrophic) variability and inertial oscillations etc. Has anyone tried this and is it worthwhile?
3. Should we use ssh instead of eta_t? I think I recall @StephenGriffies saying the latter is better if the ice is non-levitating. (We have levitating ice, i.e. max_ice_thickness=0, but maybe this will change at some point).
4. Am I missing anything important? Remember I'm only looking for defaults that almost all runs should have, not specialised things that only you would have an interest in (we need to be sensibly economical with storage, both GB and inodes).
5. Is there anything that should be removed?
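For context on the squared fields: with time means of both a field and its square, the variance over the averaging period follows from var(x) = mean(x^2) - mean(x)^2, and EKE is half the sum of the u and v variances. A minimal sketch, assuming the means are already available at a single grid point (the function names and toy numbers are hypothetical):

```python
def variance_from_means(mean_x, mean_x_sq):
    """Variance over the averaging period from the mean and the mean of the square."""
    return mean_x_sq - mean_x * mean_x


def eke_from_means(mean_u, mean_u_sq, mean_v, mean_v_sq):
    """Eddy kinetic energy per unit mass: half the sum of the u and v variances."""
    return 0.5 * (
        variance_from_means(mean_u, mean_u_sq)
        + variance_from_means(mean_v, mean_v_sq)
    )


def mean(values):
    return sum(values) / len(values)


# Toy time series at one grid point (made-up numbers, m/s).
u = [0.1, 0.3, -0.2, 0.4]
v = [0.0, -0.1, 0.2, 0.1]

eke = eke_from_means(
    mean(u), mean([x * x for x in u]),
    mean(v), mean([x * x for x in v]),
)
print(eke)
```

Because the squares are accumulated every time step, this variance includes all timescales shorter than the averaging period, which is the caveat raised in question 2.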

AndyHoggANU commented 4 years ago

A couple of quick thoughts:

1. Yep, I think we can afford 3D monthly variables as standard at 0.25deg and 1deg.
2. Am not convinced, but I doubt it hurts.
3. Do you mean sea_level? That is what I tend to use. I think it is the same as eta_t right now, so I would stick with that, unless we switch to non-levitating sea ice, which is not a priority.
4. I think we have fixed the bug with net_surface_heating, right? If yes, we should save that again. Also, maybe we could go with daily net_surface_heating to replace all the daily hflux variables? Finally, I think pme misses the restoring contribution to freshwater, and perhaps the ice contributions -- is there a single FW flux term we can use there? One more thing - is there an appetite to add daily Ekman pumping? This might be very interesting to look at, but not sure if we have a diagnostic for that.

aekiss commented 4 years ago

1. ok I'll make that change, with 12 months per output file to reduce file count
2. harmless if we can spare the storage, particularly with the 3d fields at 0.1deg
3. yes, I meant sea_level
4. the net_surface_heating bug is still an open issue (https://github.com/COSIMA/access-om2/issues/139) so presumably it hasn't been fixed?

I've also added

maximum MLD in addition to the mean, as the mean is affected by occasional spurious low values due to rainfall
min SST (for comparison with foundation temperature)

not sure if they're worthwhile but they're 2d so not much of a cost

StephenGriffies commented 4 years ago

A/ There might be some interest with the extremes folks for the following fields computed daily:

daily 2d fields

surface_temp_max bottom_temp_max bottom_temp sea_level_max

B/ There are eta_t and sea_level diagnostics sprinkled throughout. As noted earlier in this thread, they are equal when sea ice fully levitates. More generally, the sea_level field is what folks find more useful as you will not see the abrupt depression due to sea ice pushing down on eta_t. Furthermore, upon removing the global area integrated mean,

zos = sea_level-areamean(sea_level) = dynamic sea level requested for CMIP
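The relation above can be sketched as an area-weighted mean removal. This is a minimal illustration on a flattened list of wet cells; the function names are hypothetical, and a real calculation would use the model's tracer cell areas (e.g. area_t) with land points masked out.

```python
def area_mean(field, area):
    """Area-weighted mean of a field given matching cell areas (wet cells only)."""
    total_area = sum(area)
    return sum(f * a for f, a in zip(field, area)) / total_area


def dynamic_sea_level(sea_level, area):
    """zos = sea_level minus its global area-weighted mean."""
    offset = area_mean(sea_level, area)
    return [eta - offset for eta in sea_level]


# Toy example: three wet cells with unequal areas (made-up numbers).
sea_level = [0.5, -0.2, 0.1]   # metres
area = [1.0, 2.0, 1.0]         # arbitrary area units

zos = dynamic_sea_level(sea_level, area)
print(zos)
```

By construction the area-weighted mean of the result is zero at every time level.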

C/ You might consider saving snapshots of surface velocity and vorticity if wishing to make pretty pictures. But that perhaps is better done for special runs rather than for all runs...

aekiss commented 4 years ago

OK thanks, I've made those changes - see diffs above.

net_surface_heating is omitted, awaiting a fix for issue #139

C/ I've put the suggested fields as comments in the snapshots section so they can be easily enabled, but as 3d fields, since in practice we often prefer fields from ~30m depth to avoid Ekman currents. A regional section at just this depth level is probably more sensible though.

rmholmes commented 4 years ago

Thanks Andrew. A few comments:

At 1-degree and 1/4-degree you've got temp_xflux_adv_int_z, temp_xflux_submeso_int_z but not temp_xflux_ndiffuse_int_z and temp_xflux_gm_int_z - any reason?
Isn't sw_heat listed in the monthly 2d fields section a 3D heat budget diagnostic? You already have swflx?
I can look at the net_surface_heating bug, but I won't get to it until next week.

aekiss commented 4 years ago

Thanks @rmholmes, glad you're casting an eye over the heat diags - I'd just copied things from the old configs without giving it any real thought.

there's no reason that temp_xflux_ndiffuse_int_z and temp_xflux_gm_int_z are missing - should I add them? And temp_yflux_ndiffuse_int_z and temp_yflux_gm_int_z too? Or anything else?
yes, sw_heat is 3d, well spotted. Is it valuable enough to save as a monthly output?
thanks for offering to look at the net_surface_heating bug, that would be great

rmholmes commented 4 years ago

Yes I'd add all four of those.
I would remove sw_heat as it's not particularly useful unless you have all the other 3D heat budget diagnostics (unless someone else specifically requested it?). I would keep monthly swflx as it is a 2D field.

abhisheksavita commented 4 years ago

Thanks Andrew... In the diag_table you included ty_trans_gm and ty_trans_submeso but excluded tx_trans_gm and tx_trans_submeso. Can we include these two diagnostics as well? At 0.1 degree we would just need tx_trans_submeso.

aekiss commented 4 years ago

thanks @abhisheksavita - I don't have ty_trans_gm but there's ty_trans_rho_gm, so did you mean I should include tx_trans_rho_gm, not tx_trans_gm?

aekiss commented 4 years ago

We have a whole bunch of transport terms in a mix of z, rho and neutral rho coordinates - are they a sensible combination? And do we need all of these in default configs? They're all 3d, so require a lot of storage.

            tx_trans_rho:
            tx_trans:
            ty_trans_nrho_submeso:
            ty_trans_rho_gm:
            ty_trans_rho:
            ty_trans_submeso:
            ty_trans:

abhisheksavita commented 4 years ago

At least we can save the ty components of each, which we use a lot to see the residual meridional overturning circulation, i.e.

ty_trans_rho
ty_trans_rho_gm
ty_trans_nrho_submeso
ty_trans
ty_trans_gm
ty_trans_submeso

But if we can afford it, we could also save the tx components, which we use a lot to see the residual barotropic circulation, i.e.

tx_trans_rho
tx_trans_rho_gm
tx_trans_nrho_submeso
tx_trans
tx_trans_gm
tx_trans_submeso

Note that at 1/10 degree we don't need to save the gm components.

rmholmes commented 4 years ago

My opinion would be:

Keep ty_trans_rho, ty_trans_rho_gm and ty_trans_nrho_submeso since we regularly want to plot the meridional overturning circulation in density coordinates. (However, everyone who uses the gm and submeso ones should realise that they aren't actually real - these parameterizations are implemented through skew-diffusion not advectively). Note that there is no diagnostic ty_trans_rho_submeso - but I can't remember why (@russfiedler?).
Drop all the z-space diagnostics ty_trans, tx_trans, ty_trans_submeso, tx_trans_submeso etc. - does anyone use these? We don't plot depth-space overturning circulations anymore. The transports can be constructed approximately using the velocity field - and if you're making a plot of the vertical structure of the velocity field you want to use velocity (which is grid-thickness independent) rather than transport. @abhisheksavita why do you want to save these?
Make sure we have the vertically-integrated transport diagnostics so that we can calculate BT streamfunction (tx_trans_int_z, ty_trans_int_z etc.). If you're only interested in the BT streamfunction it's much better to save only 2D not 3D variables. I'm pretty sure the GM and submeso schemes don't contribute any BT circulation so no need to save those diagnostics.
Drop the zonal density-space diagnostics - I don't think anyone uses these either?
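To illustrate why the 2D int_z diagnostics are enough for the barotropic streamfunction: it can be built by accumulating the depth-integrated zonal transport northward from the southern boundary. The sketch below assumes rows are ordered south to north, takes psi = 0 south of the first row, and applies no sign flip or unit conversion; the sign convention, reference value and units are all assumptions to be adapted, and the function name is hypothetical.

```python
from itertools import accumulate


def barotropic_streamfunction(tx_trans_int_z):
    """Cumulative northward sum of depth-integrated zonal transport.

    tx_trans_int_z[j][i] is the transport in row j (south to north), column i.
    Returns psi[j][i] = sum of rows 0..j in column i, in the input's units.
    """
    ny = len(tx_trans_int_z)
    nx = len(tx_trans_int_z[0])
    # Accumulate each column from south to north.
    columns = [
        list(accumulate(tx_trans_int_z[j][i] for j in range(ny)))
        for i in range(nx)
    ]
    # Transpose back to [j][i] ordering.
    return [[columns[i][j] for i in range(nx)] for j in range(ny)]


# Toy 3x2 grid (made-up transports).
tx = [[1.0, 2.0],
      [3.0, -1.0],
      [0.5, 0.5]]
psi = barotropic_streamfunction(tx)
print(psi)
```

Only a 2D field per time level is needed, which is a large storage saving compared with the 3D tx_trans and ty_trans.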

Obviously I'm advocating for dropping variables - so people need to speak up if they use them...

abhisheksavita commented 4 years ago

Thanks @rmholmes, I agree with you. I do use all the z-space transport diagnostics to calculate the residual circulation, but I don't know who else uses them. I didn't realise that we can construct the BT streamfunction using tx_trans_int_z and ty_trans_int_z; saving 2D rather than 3D diagnostics is a really good idea.

AndyHoggANU commented 4 years ago

I think some folks still use tx_trans and ty_trans for some things, even though I would argue against their use ...

aekiss commented 4 years ago

tx_trans and ty_trans have been used for strait transports, e.g. https://github.com/COSIMA/ACCESS-OM2-1-025-010deg-report/blob/master/figures/strait_transports/strait_transports.ipynb

For reference, these are the only MOM diags used in the notebooks in https://github.com/COSIMA/ACCESS-OM2-1-025-010deg-report/tree/master/figures (which covers nearly all the figures used in the GMD paper, plus some more)

age_global
aiso_bih
area_t
eta_global
eta_t
geolat_t
geolon_t
ke_tot
kmt
mld
net_sfc_heating
pot_rho_0
pot_rho_2
potrho_edges
salt
salt_global_ave
salt_surface_ave
sea_level
sea_levelsq   <--- should be sea_level_sq - not sure why the old .nc files call it sea_levelsq
temp
temp_global_ave
temp_surface_ave
temp_yflux_adv_int_z
temp_yflux_gm_int_z
temp_yflux_ndiffuse_int_z
temp_yflux_submeso_int_z
total_ocean_salt
tx_trans
tx_trans_int_z
ty_trans
ty_trans_int_z
ty_trans_rho
ty_trans_rho_gm
u
v
vert_pv
vorticity_z

rmholmes commented 4 years ago

As far as I can see, all the results in the strait_transports script can be obtained using the int_z diagnostics, except for the vertical structure plots. For these, shouldn't they be plotted using velocity rather than transport (i.e. so the magnitude is in m/s rather than being dependent on the grid size)?

@abhisheksavita - I see, you need tx_trans, ty_trans etc. for your global binning of transports into depth-temperature coordinates using the monthly temperature field. There's no capability to do this properly online at the moment because it requires a 5D binning code (retaining both depth and temperature information).

I guess it all depends on how much storage we have and what the priorities are.

aekiss commented 4 years ago

I agree the strait_transports script should be using velocity, not transport in the vertical sections.

aekiss commented 4 years ago

What we include by default depends on how much people will be depending on shared output as opposed to running their own simulations with the diags they want.

@abhisheksavita will you be doing your work with your own runs or will you need others to also routinely store the diags you need?

I'm wondering whether this would be OK: all z-coord and all zonal 3d transport diags commented out, available for whoever wants to do their own runs? https://github.com/COSIMA/1deg_jra55_iaf/blob/diag_table/ocean/diag_table_source.yaml

aekiss commented 4 years ago

Other 3d fields we might not need to save routinely:

dzt - @rmholmes do you need this? https://github.com/COSIMA/access-om2/issues/142
buoyfreq2_wt - is it close enough to calculate this from density?
vert_pv
wt
do we need both pot_rho_0 and pot_rho_2?

Note that since we'll be saving one variable per file it will be easy to remove any outputs we don't want, so maybe we don't need to be so careful...

rmholmes commented 4 years ago

Yes you can remove dzt as a standard variable.

abhisheksavita commented 4 years ago

Thanks @aekiss. For my own work I do set up my own runs and diagnostics, but I think a few people still use z-level diagnostics, so it is worth having at least ty_trans, ty_trans_gm and ty_trans_submeso.

russfiedler commented 4 years ago

@rmholmes No idea why there isn't a ty_trans_rho_submeso (and friends). It was never implemented in the original code so I didn't add it when making optimised versions.

The transports take into account partial cells and varying thickness and are also correctly calculated on the faces of tracer cells which is consistent with all other transport calculations. Everything is done as a sum rather than integration. As pointed out most really use the fully integrated form.

For BRAN/OFAM we're testing how well wt can be diagnosed from knowledge of U, V and eta_t. We got caught out with bad packing factors and don't want to rerun things. Matt Chamberlain found that bit grooming wt to 2 sig figs worked pretty well and reduced the compressed size of the fields by quite a bit (well a few bits).
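As a rough illustration of why reducing precision helps compression, the sketch below zeroes the trailing mantissa bits of a float32 value ("bit shaving"). This is a simpler relative of bit grooming, not the procedure used for BRAN/OFAM: bit grooming, as usually described, alternates between zeroing and setting the trailing bits to reduce bias. The function name and the choice of 7 retained bits are assumptions for illustration only.

```python
import struct


def shave_float32(value, keep_bits):
    """Zero all but the leading keep_bits of the 23 explicit float32 mantissa bits."""
    if not 0 <= keep_bits <= 23:
        raise ValueError("keep_bits must be between 0 and 23")
    # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    # Keep sign, exponent and the leading mantissa bits; clear the rest.
    mask = (0xFFFFFFFF << (23 - keep_bits)) & 0xFFFFFFFF
    (shaved,) = struct.unpack("<f", struct.pack("<I", bits & mask))
    return shaved


wt_sample = 1.2345e-5          # made-up vertical velocity, m/s
wt_shaved = shave_float32(wt_sample, keep_bits=7)
print(wt_sample, wt_shaved)
```

The long runs of zero bits this leaves in every value are what a lossless compressor can then exploit; the relative error stays below 2**-keep_bits.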

aekiss commented 4 years ago

ok I've updated the proposed standard: https://github.com/COSIMA/1deg_jra55_iaf/blob/diag_table/ocean/diag_table_source.yaml dzt is now removed and all 3d meridional transport components are now uncommented. All zonal transport components remain commented out. How's that look?

russfiedler commented 4 years ago

Don't you want snapshots at the end of months of 3D global fields like total heat and total salt in order to perform tracer budgets correctly? The double precision storage is pointless otherwise. Also, calculating those 3D integrals every time step murders performance. Even daily snapshots are probably ok and approximate monthly means can be quickly calculated if needed.

aekiss commented 4 years ago

Thanks @russfiedler, that's an interesting thought. Monthly snapshots of some scalar fields might suffer from aliasing of high frequencies, but this could be reduced using monthly means of daily scalar snapshots. @AndyHoggANU how does that sound?

russfiedler commented 4 years ago

This is an issue that cropped up a while back with Fabio (I think). Storing the snapshots of global heat allowed us to close budgets properly and track down problems.

aekiss commented 4 years ago

re. removing dzt - as @russfiedler explained here we would need the static fields depth, dstlo and dstup to calculate dzt from sea_level. These are available in restarts but we don't normally include restarts in the cookbook DB.

We could save the static fields depth, dstlo and dstup once per run, so for each 3-month run at 0.1deg we'd be saving two 3d fields (dstlo and dstup) instead of three 3d fields (monthly dzt), which isn't much of a saving for this amount of hassle.

Also there are diagnostics for dstlo and dstup but what should be used for depth?

It seems silly to save static data (e.g. grid data) for every run when they are always the same. @angus-g is there any reason to do this from the cookbook point of view or could we just save them for the first run and omit them from the rest?

angus-g commented 4 years ago

Usually when people query static data out of the cookbook, they need to limit the search to one file anyway, to prevent concatenation. If the file is present at least once for an experiment and it's picked up in the database, there shouldn't be any problems querying for it.

aekiss commented 4 years ago

ok that's good to hear

rmholmes commented 4 years ago

Re: Saving snapshots. Isn't this only useful if you also save all the tracer budget diagnostics? I use them all the time for this purpose but we don't want to output all the tracer diagnostics for every run.

On the other hand, we are saving all the diagnostics needed to close the vertically-integrated heat budget - so snapshots could be useful for that (but just temp_int_rhodz snapshots would be sufficient).

PS: We are working on a new type of time-averaging which will allow us to precisely relate time-integrated tendencies to the difference in time-averaged quantities (as opposed to just differences in snapshots). Hopefully this will be included in the code at some stage.

aekiss commented 4 years ago

Re. dzt: Several different people use these outputs and have existing workflows using dzt so I'll leave it in for now and we can delete it later if we need the space and have established a way to recreate it from other data.

russfiedler commented 4 years ago

Maybe the way around saving static quantities is to use the external_variable/file feature in the CF standard: http://cfconventions.org/cf-conventions/cf-conventions.html#external-variables There's no standard for the file naming though. CM4 uses something similar but still puts out static fields all the time:

associated_files = "areacello: 20150101.ocean_static.nc" ;

StephenGriffies commented 4 years ago

Sorry to be adding to the discussion so late... the benefits and costs of time zones 14 hours apart. Here are some comments about the above deliberations.

--I have found that dzt is very useful to have around, especially monthlies. As @russfiedler said, it contains partial bottom cell information and sea level. The most critical piece is the partial bottom cells for purposes of budgets. Yea, one can do the calculation offline with static fields + eta_t. But that can be a hassle, as I think others have noted. So I vote for keeping dzt monthly.

--I am NOT a fan of constructing transports using velocity. Again as @russfiedler noted, we should use tx_trans and ty_trans if looking to get transport as a function of depth, or as @rmholmes noted, use their int_z integrated forms for the full depth integrated transports. The face geometry and time dependent thickness make offline calculation of transports very dodgy and difficult. Yes, one might get a sensible approximation, but the fuss is really not worth it when one can get the transports precisely computed online in the model, with sums used for offline integration across a line. In fact, I added the transports to the CMIP6 data request just for these reasons.

Furthermore, if one knows a priori the sections that one wishes to compute transports, such as the ~15 sections in the OMIP data request, then one can enable the diag-table with these straits to save the tx_trans or ty_trans just for the selected lines. This "section" diagnostic can be especially interesting when saving temp, saln, u, tx_trans for a section such as the Drake Passage.

--We tried to compute wt offline using u,v,eta_t when Adele Morrison was a Princeton post-doc. After heaps of fussing around, she got things to work sensibly but it was always a fuss and something that really led to head-aches in the long run. We wanted wt to compute Lagrangian particle trajectories, and not having wt saved online was a near disaster until she got something working sensibly. So please do not think that just because ACCESS-OM is a Boussinesq model that wt is simple to get offline.

The difficulty is that the model uses a time dependent vertical coordinate, so one needs u,v,eta_t as well as the time tendency of eta_t. Basically one needs to diagnose terms in a time-dependent thickness equation much like one would need if doing so in an isopycnal model. Until Adele took into account all of the terms, including the time tendency of eta_t, results were not trustworthy. And even when she did use all of the terms, we still wished we had saved wt online.

So in summary, as these simulations are likely to be used to study upwelling and Lagrangian particles, wt should be saved online. That hassle and uncertainty are not worth the savings in archive space.

aekiss commented 4 years ago

Thanks Steve, it's always good to have the benefit of your experience. I was still saving monthly dzt, ty_trans and wt but I'll now enable monthly tx_trans as well.

It's also good to be reminded of the possibility of output along sections - that could be good for more specialised uses (e.g. higher than monthly temporal resolution), but for these general-purpose configurations I think it will be best to save 3d monthly transports globally as we don't know a priori what the data will be used for.

aekiss commented 4 years ago

@AndyHoggANU - re. parts of your point 4 I hadn't responded to:

@rmholmes' PR https://github.com/mom-ocean/MOM5/pull/320 fixes issue https://github.com/COSIMA/access-om2/issues/139 so I'll add net_surface_heating and total_net_sfc_heating
I don't have a preference either way re. retaining all the component heat fluxes - since they are 2d I'd be inclined to leave them in. They are easily deleted if we need to save space since they're in separate files.
We apply salinity restoring via a salt flux (use_waterflux = true, salt_restore_as_salt_flux = true) so it makes no contribution to pme - at least, that's how I understand this code block (only the else clause applies). @russfiedler - have I got this right?
It appears that sea ice melt/formation is already included in pme here.
the ekman_we diagnostic seems to be what you want, so I'll add that: https://github.com/mom-ocean/MOM5/blob/master/src/mom5/ocean_core/ocean_sbc.F90#L5256-L5280

aekiss commented 4 years ago

Is there a consensus on whether to implement Russ' suggestion of snapshots at the end of months of 3D global fields like total heat and total salt?

Even if we don't want to form budgets it might improve performance. On the other hand, it might be less useful due to aliased high frequencies. If nobody cares either way I'll leave them as monthly averages.

rmholmes commented 4 years ago

If you want to save all the component heat fluxes you'll also need to add mh_flux (as well as reenabling net_sfc_heating).

adele-morrison commented 4 years ago

I think snap shots of temp and salt at the end of each year would be useful. e.g. In Ruth's recent paper she did Antarctic regional heat budgets using just surface fluxes, lateral heat fluxes (temp_yflux_adv/temp_xflux_adv) and heat content change from snapshots (she got these from restarts I think). Even without the full heat budget, this kind of thing can still be useful.

However I don't think there's much benefit of saving at monthly frequency. What about outputting just once at the end of each 3 month run?

rmholmes commented 4 years ago

Yes I agree. As I mentioned above you can close the vertically-integrated budgets. However, you only need to save snapshots of temp_int_rhodz and salt_int_rhodz rather than temp and salt if this is the case. Otherwise, if you save snapshots of temp and salt you should also save snapshots of dzt (or snapshots of eta_t and reconstruct it as Russ has mentioned). They're also in the restarts if you save them frequently enough.

russfiedler commented 4 years ago

The salt restoring shouldn't make a contribution to pme. The saving of info at odd intervals like the end of (3 month say) runs may lead to problems. Say the mighty TWG achieve a breakthrough halfway through the run and 4 month cycles can be run. Things get messy. You want to specify 3 months or whatever no matter what. If you just want the end of the year that's fine too. Some runs spit it out, others don't.

adele-morrison commented 4 years ago

Yes, agreed Ryan, I think the best we can hope to do is vertically integrated budgets. So then there's no point in saving 3D snapshots of temp/salt.

aekiss commented 4 years ago

I think we may have been talking at cross-purposes - @russfiedler can you clarify your suggestion above please?

I understood you were referring to 3d integrals (ie scalar timeseries) such as total_ocean_heat and temp_global_ave.

Is this right, or were you meaning actual 3d fields such as temp?

russfiedler commented 4 years ago

@aekiss Yes, I was only concerned with the 3D integrals.

aekiss commented 4 years ago

Aha, thanks @russfiedler.

@rmholmes, @adele157, @AndyHoggANU - do you have any preferences for whether the scalar timeseries in the monthly scalar timeseries section (such as total_ocean_heat and temp_global_ave) should be monthly means or monthly snapshots?

Russ says monthly means of these scalars murder performance, so we may get some model speedup if we move to snapshots (at the cost of noisier-looking timeseries with aliased high frequencies). Snapshots could also be saved more frequently (e.g. daily), and we'd still get nearly all the speedup. Snapshots are also better for globally-integrated budgets.

I think I'm convinced that snapshots of scalars are a better idea. Any objections? We can also save some as means and some as snapshots if need be. Just let me know.

rmholmes commented 4 years ago

Yes I think snapshots for scalars is a good idea - there are plenty of benefits. Also, since they are global metrics they will not be affected much by high-frequency noise.

The only issue is that the heat fluxes (e.g. total_net_sfc_heating) should really be averages in order to close the global budget properly (e.g. to attribute changes in the total_ocean_heat snapshots). But we can always area-integrate the 2D monthly-average fluxes if that accuracy is required.
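The budget argument here can be written as a simple check: the change in heat content between two snapshots should equal the time-mean net heating multiplied by the elapsed time. A minimal sketch, assuming SI units (joules and watts) and hypothetical function names; the actual units of total_ocean_heat and total_net_sfc_heating in the model output should be checked before applying it.

```python
def budget_residual(heat_start_j, heat_end_j, mean_heating_w, seconds):
    """Snapshot-to-snapshot heat content change minus time-integrated mean heating.

    A residual near zero means the budget closes. This only works if the heating
    is a time MEAN over the interval; a heating snapshot would not integrate
    correctly.
    """
    return (heat_end_j - heat_start_j) - mean_heating_w * seconds


# Toy numbers: 30 days of 1e15 W mean net heating.
seconds = 30 * 86400
heat_start = 1.0e24
heat_end = heat_start + 1.0e15 * seconds

residual = budget_residual(heat_start, heat_end, 1.0e15, seconds)
print(residual)
```

If only snapshots of the flux were saved, the integral on the right-hand side would have to be approximated, which is why the fluxes are better kept as averages or recovered by area-integrating the 2D monthly-mean fields.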

aekiss commented 4 years ago

OK how about I set everything in the scalar timeseries section to daily snapshots? That would give more accurate averages and still have most of the performance benefit of snapshots.
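Recovering approximate monthly means from daily snapshots is a simple group-and-average in post-processing. A minimal sketch using only the standard library; the function name and the (date, value) input layout are assumptions, and in practice this would more likely be done with a resampling call on the saved timeseries.

```python
from collections import defaultdict
from datetime import date, timedelta


def monthly_means(daily):
    """Average (date, value) daily snapshots into {(year, month): mean}."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for day, value in daily:
        key = (day.year, day.month)
        sums[key] += value
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Toy series: value 1.0 every day in January 1959, 3.0 every day in February.
start = date(1959, 1, 1)
series = []
for n in range(31 + 28):
    day = start + timedelta(days=n)
    series.append((day, 1.0 if day.month == 1 else 3.0))

means = monthly_means(series)
print(means)
```

Daily snapshots keep the budget-friendly snapshot values while still allowing a monthly mean that is much less aliased than a single end-of-month snapshot.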

adele-morrison commented 4 years ago

Sounds good to me, I've only really used the scalar output for monitoring runs, not for actual science.

aekiss commented 4 years ago

OK done - all scalar diagnostics are now daily snapshots in all 6 configurations. I believe I've made all the changes discussed above:

1 deg: diag_table_source.yaml and resulting diag_table
0.25 deg (same as 1 deg): diag_table_source.yaml and resulting diag_table
0.1 deg: diag_table_source.yaml and resulting diag_table

(these are IAF configs; RYF configs are identical)

rmholmes commented 4 years ago

@aekiss I was looking through the new 1-degree diag_table just now (from here) and noticed you have missed temp_xflux_gm_int_z and temp_yflux_gm_int_z. Would be good to add them.

aekiss commented 4 years ago

Thanks @rmholmes, good catch. They're also missing from 0.25 so I can add them there too. What about temp_xflux_submeso_int_z and temp_yflux_submeso_int_z? They're also missing.

COSIMA / access-om2

new standard diag_tables for each resolution #203
