ACCESS-Community-Hub / Land-ancillary-creation

A repository containing scripts to create ancillary files for land surface models for ACCESS

Create preprocessor for CABLE inputs. #1

Open bschroeter opened 2 weeks ago

bschroeter commented 2 weeks ago

Something that has come out of discussions is the need for a consistent, standard set of input variables going into CABLE, so that the model does not need to translate or derive variables internally and so that it is clear exactly what goes into the model.

The variables currently required are listed here (possibly outdated): https://cable.readthedocs.io/en/latest/user_guide/inputs/meteorological_forcing/

The ALMA standards are the community's agreed set of forcing variables, so it seems logical that we follow them: https://ncitest.modelevaluation.org/variableStandards

At a minimum the core variables should be required; users may also wish to provide the optional variables.

I suggest that the user be required to provide the core variables, and optionally any of the optional variables, to a preprocessor. The preprocessor then derives whichever missing optional variables it can from the core set and provides the full set (core + optional) to CABLE.
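
A minimal sketch of that flow, assuming a plain dict of variable name to values. The core/optional split, the `DERIVATIONS` registry and the `preprocess` function are hypothetical names made up for illustration; the real lists would come from the ALMA table.

```python
import math

# Hypothetical core set, for illustration only (not taken from the ALMA table).
CORE = {"Tair", "PSurf", "Wind_N", "Wind_E"}


def wind_speed(v):
    """Wind speed as the magnitude of the northward and eastward components."""
    return [math.hypot(n, e) for n, e in zip(v["Wind_N"], v["Wind_E"])]


# Optional variable -> (inputs it needs, function that derives it).
DERIVATIONS = {"Wind": (("Wind_N", "Wind_E"), wind_speed)}


def preprocess(variables):
    """Check the core set, then fill in any optional variable that is missing."""
    missing = CORE - variables.keys()
    if missing:
        raise ValueError(f"missing core variables: {sorted(missing)}")
    out = dict(variables)
    for name, (inputs, func) in DERIVATIONS.items():
        # A user-supplied optional variable always wins over a derived one.
        if name not in out and all(i in out for i in inputs):
            out[name] = func(out)
    return out
```

Calling `preprocess` with only the core variables returns them plus a derived `Wind`; if the user already supplies `Wind`, it is passed through untouched.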

We will probably need to require that input data be properly named and attributed (CF-compliant?), and we will need to compute the derived quantities following the appropriate methodology (likely translated from Fortran to Python).

Given the large data volume and number of files, substantial effort will be needed to parallelise these calculations where possible.
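
One possible shape for this, assuming each file can be processed independently of the others. `process_file` is a hypothetical placeholder for the real read, derive and write step, and the `_alma.nc` output naming is invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def process_file(path):
    """Placeholder for the per-file work; returns the (hypothetical) output path."""
    return path.replace(".nc", "_alma.nc")


def process_all(paths, max_workers=4, executor_cls=ThreadPoolExecutor):
    """Run process_file over all paths concurrently, preserving input order."""
    with executor_cls(max_workers=max_workers) as pool:
        return list(pool.map(process_file, paths))
```

Threads suit I/O-bound steps; for CPU-bound derivations, `executor_cls` could be swapped for `concurrent.futures.ProcessPoolExecutor`, or the whole approach replaced by something like dask.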

bschroeter commented 2 weeks ago

If we require input data for the preprocessor to be CF-compliant (recommended), then we can use something like https://cf-xarray.readthedocs.io/en/latest/quickstart.html to translate generically to the ALMA standard.
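
The idea can be sketched without any dependencies: look up each variable's CF `standard_name` attribute and map it to an ALMA name. The pairings in `CF_TO_ALMA` are my assumption and would need checking against the ALMA table; `alma_names` is a hypothetical helper. On real datasets, cf-xarray could do the `standard_name` lookup instead of the plain dicts used here.

```python
# Assumed CF standard_name -> ALMA name pairings (to be verified against the ALMA table).
CF_TO_ALMA = {
    "air_temperature": "Tair",
    "specific_humidity": "Qair",
    "surface_air_pressure": "PSurf",
    "surface_downwelling_shortwave_flux_in_air": "SWdown",
    "surface_downwelling_longwave_flux_in_air": "LWdown",
    "rainfall_flux": "Rainf",
    "snowfall_flux": "Snowf",
    "wind_speed": "Wind",
}


def alma_names(var_attrs):
    """Map {variable name: attribute dict} to {variable name: ALMA name}.

    Variables without a recognised standard_name are left out of the result.
    """
    renames = {}
    for name, attrs in var_attrs.items():
        alma = CF_TO_ALMA.get(attrs.get("standard_name"))
        if alma is not None:
            renames[name] = alma
    return renames
```

The returned dict has the shape a rename step would need, for example `{"tas": "Tair"}` for a file that stores air temperature as `tas`.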

bschroeter commented 2 weeks ago

We will also need to convert units etc.
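
A minimal sketch of a unit conversion step, assuming the target units are K for `Tair`, Pa for `PSurf` and kg m-2 s-1 for `Rainf`. The `CONVERSIONS` table and `to_alma_units` are hypothetical names, and only a handful of conversions are shown.

```python
# (from_unit, to_unit) -> conversion function
CONVERSIONS = {
    ("degC", "K"): lambda x: x + 273.15,
    ("hPa", "Pa"): lambda x: x * 100.0,
    # 1 mm of liquid water over 1 m2 is 1 kg, so mm/day -> kg m-2 s-1 divides by 86400 s.
    ("mm/day", "kg m-2 s-1"): lambda x: x / 86400.0,
}

# Assumed target units per variable.
TARGET_UNITS = {"Tair": "K", "PSurf": "Pa", "Rainf": "kg m-2 s-1"}


def to_alma_units(name, values, units):
    """Convert values from `units` to the target units for variable `name`."""
    target = TARGET_UNITS[name]
    if units == target:
        return list(values)
    try:
        convert = CONVERSIONS[(units, target)]
    except KeyError:
        raise ValueError(
            f"no conversion from {units!r} to {target!r} for {name}"
        ) from None
    return [convert(v) for v in values]
```

Failing loudly on an unknown unit pair seems safer than passing data through unchanged, since a silent unit mismatch in forcing would be hard to spot downstream.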

Whyborn commented 2 weeks ago

As part of this, we need to decide precisely which variables CABLE requires. I think we should specify exactly what forcing CABLE requires e.g.

CABLE cannot run without this (or some other prescribed) set of variables, and doesn't take any other optional forcings. Mappings from optional forcings to required forcings occur within the preprocessor.

bschroeter commented 2 weeks ago

Following discussions with @ccarouge and @Whyborn, a 75-year run requires around 35 TB of met forcing.

This suggests we should preprocess up front, publish the preprocessed data, and point users to it.

ccarouge commented 2 weeks ago

For the derivation of variables, we could use MetPy: https://unidata.github.io/MetPy/latest/userguide/index.html

It contains many of the derivations we will need, and these are documented with provenance.