Standard model / model data sets for MED recipes

headmetal commented 1 year ago

It would be great if we could identify a standard representative set of models and/or model data that can be used in the MED recipes we are developing (as opposed to each recipe using some semi-random model). There are a few good reasons to do this:

Easier to manage consistent data access / reliable data sources
Makes things much clearer and easier for end-users - both in understanding the recipes, and also with access to required NCI projects etc.
Ensure we use most representative / reliable models
Ensure consistency between recipes

To kick off, can we get some suggestions for the following domains (and please let me know if there is anything I'm missing here):

Atmosphere
Ocean
Land
Ice
Coupled

Cheers!

dougiesquire commented 1 year ago

Possibly unhelpful, but these are the model outputs that are currently included in the catalog:

https://github.com/ACCESS-NRI/access-nri-intake-catalog/tree/main/config

(see the path key under sources in each access-*.yaml)

Even across runs done with the same model, there is frustratingly little consistency in outputs

dougiesquire commented 1 year ago

Hopefully the catalog can deal with some of the inconsistencies for you. But it would be nice if we could somehow improve consistency of model outputs for future runs...

ccarouge commented 1 year ago

For Land, I don't think there is anything published. But in discussions, they typically talk of:

ACCESS-CM2 global simulations or offline simulations on the same grid. For this, CMIP6 would work fine in the first instance. I know someone who runs offline simulations on the ACCESS-CM2 grid but nobody else has seen the results so far. So it might be for later.
high-resolution over Australia. For this, you can contact Anna Ukkola about her data: https://forum.access-hive.org.au/t/high-resolution-5km-australian-future-projections-from-cable/433 . I can send her email to whoever wants it. I don't suggest we need all of her dataset as reference but part of it might be good.
simulations at eddy covariance flux tower sites. But we are already taking care of this within the land group.

headmetal commented 1 year ago

> Possibly unhelpful, but these are the model outputs that are currently included in the catalog:
>
> https://github.com/ACCESS-NRI/access-nri-intake-catalog/tree/main/config
>
> (see the path key under sources in each access-*.yaml)
>
> Even across runs done with the same model, there is frustratingly little consistency in outputs

This is helpful, thanks!

I guess for now, inconsistency between model outputs isn't too much of an issue, as typically a given recipe only works with data from a single model. I absolutely intend to implement intake in the recipes to retrieve the data - but just wanted to nail down (ideally) 1 representative model per domain that we can reuse over time, e.g. choose 1 ocean model/experiment and attempt to always use it for ocean-related recipes.

dougiesquire commented 1 year ago

Unfortunately, I'm not sure we'll be able to come up with one model run output to rule all recipes, since the available variables etc. differ across runs. E.g. only a few of the COSIMA model runs contain biogeochemistry (bgc) variables for use in bgc recipes, and these runs may not contain other variables that are useful for other recipes.

After re-reading your comments, I think it's possible (probable) I'm misunderstanding what you're saying. You refer to "1 representative model per domain". Are you after a single model output per domain (e.g. a specific run of the ACCESS-OM2 model for ocean/sea-ice), or a single model per domain (e.g. ACCESS-OM2 for ocean/sea ice)?

headmetal commented 1 year ago

tbh I'm not 100% sure! haha

Would it be realistic to identify a single model per domain (e.g. ACCESS-OM2 for ocean/sea ice)?

ccarouge commented 1 year ago

> Would it be realistic to identify a single model per domain (e.g. ACCESS-OM2 for ocean/sea ice)?

Nope. Not detailed enough. Currently, all the recipes for ocean are probably using ACCESS-OM2 but different experiments with different outputs, and different resolutions. There might be a few recipes with different models but not many. Some might use CMIP data, but that's simply using ACCESS-OM2 coupled to atmosphere and land.

So I think you need experiments. But I could be wrong.

dougiesquire commented 1 year ago

@headmetal and I just had a chat and decided that only using model experiments that are included in the ACCESS-NRI catalog is an easy/good way to narrow the scope. Some recipes could even include comparisons across different experiments in the catalog.
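
A minimal sketch of how that scoping rule could be checked, assuming each recipe declares the experiments it reads. The experiment names, the recipe names and the `CATALOG_EXPERIMENTS` set are all hypothetical placeholders, not a dump of the real catalog; in practice the set would be read from the ACCESS-NRI catalog itself.

```python
# Hypothetical experiment names; the real list would come from the catalog.
CATALOG_EXPERIMENTS = {"ocean_expt_a", "ocean_expt_b", "coupled_expt_a"}


def experiments_outside_catalog(recipe_experiments, catalog=CATALOG_EXPERIMENTS):
    """Return, sorted, the experiments a recipe uses that are not in the catalog."""
    return sorted(set(recipe_experiments) - set(catalog))


# Hypothetical recipes and the experiments each one reads.
recipes = {
    "sst_bias": ["ocean_expt_a"],
    "resolution_comparison": ["ocean_expt_a", "ocean_expt_b"],
    "legacy_plot": ["ocean_expt_a", "some_private_run"],
}

# Keep only the recipes that reach outside the catalog.
out_of_scope = {
    name: missing
    for name, expts in recipes.items()
    if (missing := experiments_outside_catalog(expts))
}
print(out_of_scope)
```

A recipe that compares several experiments passes as long as every one of them is in the catalog.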

aidanheerdegen commented 1 year ago

Just to throw a spanner in the works slightly: while I don't disagree with the above, I think it is always good to have at least some examples that can be run from anywhere, i.e. that don't depend on being on a particular HPC system.

This may mean uploading a few representative data files to a popular cloud storage repository that can be accessed from anywhere. Or maybe to the object storage we have on ARDC.

If you decide to do this, I'd tag/mark recipes that are world-runnable, maybe with a cute little 🌐 or ☁️
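
One way the tagging could look, as a rough sketch only. The `Recipe` class, the `data_location` labels and the rule that anything off Gadi counts as world-runnable are all assumptions made for illustration, not an existing convention in the recipes.

```python
from dataclasses import dataclass


@dataclass
class Recipe:
    title: str
    data_location: str  # hypothetical labels: "gadi", "cloud", "object_storage"


# Assumption: data held in cloud or object storage can be read from anywhere.
WORLD_RUNNABLE_LOCATIONS = {"cloud", "object_storage"}


def badge(recipe):
    """Prefix the title with a globe when the recipe does not need the HPC system."""
    if recipe.data_location in WORLD_RUNNABLE_LOCATIONS:
        return f"🌐 {recipe.title}"
    return recipe.title


for recipe in [Recipe("SST map", "cloud"), Recipe("Deep overturning", "gadi")]:
    print(badge(recipe))
```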

dougiesquire commented 1 year ago

> I think it is always good to have at least some examples that can be run from anywhere

Good point. The ability for any user to run a recipe (not just Gadi users) is actually a requirement for other cookbooks out there, e.g. https://cookbooks.projectpythia.org/

aidanheerdegen commented 1 year ago

Comments @aekiss?

aekiss commented 1 year ago

For the ocean we have the ACCESS-OM2 control runs listed here https://forum.access-hive.org.au/t/access-om2-control-runs/258

Some of these are available outside NCI from here https://dx.doi.org/10.25914/60809748351a8

dougiesquire commented 1 year ago

> For the ocean we have the ACCESS-OM2 control runs listed here https://forum.access-hive.org.au/t/access-om2-control-runs/258

FYI, these are the ACCESS-OM2 experiments that are included in the ACCESS-NRI catalog

> Some of these are available outside NCI from here https://dx.doi.org/10.25914/60809748351a8

@aekiss, I'm curious if you know of anyone who has actually tried to do any "real analysis" of access-om2-01 using the THREDDS server?

aekiss commented 1 year ago

Yep, we've had international collaborators download data for analysis elsewhere. And who knows how many others have made use of it - e.g. I had a question yesterday from a student in Tokyo who was using it.

aidanheerdegen commented 1 year ago

Thanks @aekiss.

Can we go a bit deeper and suggest the best ocean/sea-ice variables to use in example recipes? Best would be variables that are pretty much always output so recipes can be used across all resolutions, and also prototyped on lower res data. Maybe also worth thinking of 2D and 3D use cases.

For MOM I'd think prognostic variables like temperature and salinity are obvious candidates, and they are 3D variables. The velocity components, u and v, are likewise always present, but being elements of a vector quantity maybe aren't the best options?
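
To make "pretty much always output" concrete, here is a small sketch that intersects the variable lists of several runs. The run labels and the variable sets are invented for illustration; real listings would have to come from the catalog or from the output files themselves.

```python
# Hypothetical variable listings per run; not taken from any real experiment.
outputs_by_run = {
    "1deg": {"temp", "salt", "u", "v", "sea_level"},
    "025deg": {"temp", "salt", "u", "v"},
    "01deg": {"temp", "salt", "u", "v", "sea_level"},
}


def common_variables(outputs):
    """Return, sorted, the variables that every run provides."""
    variable_sets = [set(variables) for variables in outputs.values()]
    if not variable_sets:
        return []
    return sorted(set.intersection(*variable_sets))


print(common_variables(outputs_by_run))  # ['salt', 'temp', 'u', 'v']
```

A recipe built only on the variables in that intersection can be prototyped on the low-resolution run and then pointed at the others unchanged.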

What are the best sea-ice variables to use?

(There is clearly some overlap with cosima-recipes and we'd want to avoid toe-stepping, but it can save a lot of time agonising over what to use, particularly for those who aren't necessarily subject matter experts).

aekiss commented 1 year ago

Kiss et al 2020 has many examples of ocean and sea ice fields that can be compared to obs. The plot scripts are here but many are broken now; some have been updated and put here.

sea_level and surface_pot_temp can be compared to obs in high spatiotemporal detail.
3D pot_temp and salt can be compared to World Ocean Atlas or WOCE transects or ARGO data in less spatiotemporal detail
Mass (tx_trans, tx_trans_int_z, ty_trans, ty_trans_int_z) and heat transports (temp_xflux_adv, temp_xflux_adv_int_z, temp_yflux_adv, temp_yflux_adv_int_z) through particular transects can also be compared to obs - see paper for examples.
Overturning transports have a few observations too
Global integrals of temp (temp_global_ave), salt (total_ocean_salt) and sea level (eta_global) are useful checks of model drift
Sea ice concentration is well-observed and can be compared to maps of aice and integrated measures such as sea ice extent
Sea ice thickness (and hence total mass) is more important to know and can be compared to hi, but there are only a few obs products available, with more uncertainty and less coverage of the Antarctic
Ocean BGC fields such as adic, fe, no3, o2, phy can also be compared to obs - see https://github.com/hrsdawson/ACCESS-WOMBAT_01deg_BGC_validation
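
Two of the checks in the list above can be sketched in a few lines: a drift estimate from a global-average time series, and sea ice extent from concentration. This is only an illustration on synthetic numbers. The 15% concentration threshold is the commonly used definition of extent and is an assumption here (it is not stated in this thread), and the fields are flattened to plain lists, whereas real aice and cell-area fields would be 2D arrays.

```python
def linear_trend(times, values):
    """Least-squares slope of values against times (units of values per unit time)."""
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need two or more points and equal-length inputs")
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    cov = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    var = sum((t - t_mean) ** 2 for t in times)
    return cov / var


def sea_ice_extent(aice, cell_area, threshold=0.15):
    """Total area of the cells whose ice concentration is at least the threshold."""
    return sum(area for conc, area in zip(aice, cell_area) if conc >= threshold)


# Synthetic stand-in for a temp_global_ave series, one value per model year.
years = [0, 1, 2, 3, 4]
temp_global_ave = [3.50, 3.52, 3.54, 3.56, 3.58]
drift = linear_trend(years, temp_global_ave)  # about 0.02 per year

# Synthetic stand-ins for aice (fraction) and cell area (m^2).
aice = [0.0, 0.10, 0.15, 0.80, 1.0]
cell_area = [1.0e8, 1.0e8, 2.0e8, 2.0e8, 1.0e8]
extent = sea_ice_extent(aice, cell_area)  # last three cells count: 5.0e8 m^2

print(drift, extent)
```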

Be careful with model prognostic temp - if it's conservative temperature you'll need to use pot_temp to compare with potential temperature obs.

It would also be worth talking to people who are building the new OceanMaps data assimilation in Bluelink, e.g. @PaulSandery and Pavel Sakov @sakov about their model assessment methods and the ocean and sea ice obs datasets they are using.

aidanheerdegen commented 1 year ago

Thanks @aekiss, that is some great insight, with a level of detail (and warnings) that is really useful.

aekiss commented 1 year ago

No worries. I've updated with a few more details.

When ACCESS-OM3 is running it will also be good to have evaluation of the WW3 surface wave outputs. This is outside my experience but there are others in COSIMA (Alex Babanin, Luke Bennetts, Alessandro Toffoli) who may be able to help.

rbeucher commented 1 year ago

Thanks @aekiss. It does help a lot!

ACCESS-NRI / dev-docs

Standard model / model data sets for MED recipes #9