AusClimateService / data-code-group

Does proposed DRS work for downstream model products (e.g. coastal) #16

Open hot007 opened 1 year ago

hot007 commented 1 year ago

We need to consider if the current proposed DRS is sufficient for downstream models (e.g. SCHISM coastal model) or if we need to propose extensions.

Current DRS proposal:

Directory structure: /g/data/ia39/australian-climate-service/<status>/<activity>/<product>/<domain>/<RCM-institution>/<GCM-model-name>/<CMIP6-experiment-name>/<CMIP6-ensemble-member>/<RCM-model-name>/<RCM-version-ID>/<frequency-or-category>/<variable-name>

File naming: <variable-name>_<domain>_<GCM-model-name>_<CMIP6-experiment-name>_<CMIP6-ensemble-member>_<RCM-model-name>_<RCM-version-id>_<frequency-or-category>[_<StartTime-EndTime>].nc

Example from existing wave data directory:

COWCLIP/global/CSIRO/MRI-CGCM3/historical/r1i1p1/glob/v201908/ann/Hs
which maps to
<product>/<domain>/<RCM-institution>/<GCM-model-name>/<CMIP6-experiment-name>/<CMIP6-ensemble-member>/<RCM-model-name>/<RCM-version-ID>/<frequency-or-category>/<variable-name>

File: Hs_glob_CSIRO_MRI-CGCM3_historical_r1i1p1_ann_1979-2004.nc
which maps to
<variable-name>_<domain>_<RCM-institution>_<GCM-model-name>_<CMIP6-experiment-name>_<CMIP6-ensemble-member>_<frequency-or-category>_<StartTime-EndTime>.nc
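As a rough cross-check of the mapping above, the sketch below labels the example directory and filename with the facet names given in this comment and reports any facet whose value differs between the two. The function names and the snake_case facet keys are illustrative only; they are not part of the DRS proposal.

```python
# Facet order as stated in the comment above (keys are hypothetical snake_case names).
DIR_FACETS = [
    "product", "domain", "rcm_institution", "gcm_model_name", "experiment",
    "ensemble_member", "rcm_model_name", "rcm_version_id", "frequency",
    "variable_name",
]
FILE_FACETS = [
    "variable_name", "domain", "rcm_institution", "gcm_model_name",
    "experiment", "ensemble_member", "frequency", "time_range",
]


def label(value, sep, names):
    """Split value on sep and pair each piece with a facet name."""
    parts = value.split(sep)
    if len(parts) != len(names):
        raise ValueError(f"expected {len(names)} facets, got {len(parts)}: {value!r}")
    return dict(zip(names, parts))


def facet_mismatches(directory, filename):
    """Return {facet: (directory value, filename value)} for facets that disagree."""
    stem = filename[:-3] if filename.endswith(".nc") else filename
    d = label(directory.strip("/"), "/", DIR_FACETS)
    f = label(stem, "_", FILE_FACETS)
    return {k: (d[k], f[k]) for k in d.keys() & f.keys() if d[k] != f[k]}


mismatches = facet_mismatches(
    "COWCLIP/global/CSIRO/MRI-CGCM3/historical/r1i1p1/glob/v201908/ann/Hs",
    "Hs_glob_CSIRO_MRI-CGCM3_historical_r1i1p1_ann_1979-2004.nc",
)
```

Under the mapping as written, the only facet that differs is the domain ("global" in the directory, "glob" in the filename), so one of the two would presumably need to change if the domain facet is meant to be identical in both places.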

Note the directory structure maps well, but the file naming will need to change from the example above. That isn't a problem at all: I will work with the modellers to ensure output is structured this way for publication/sharing. The example above was based on existing CORDEX data, where files include the RCM institution in the filename.

So an example SCHISM output might look like /g/data/ia39/australian-climate-service/release/WP3/output/national_mesh/CSIRO/ACCESS-CM2/historical/r1i1p1f1/SCHISM/5.10_v1/1hr/Hs. Files will need substantial post-processing to split out per variable, but that's okay. Note we'll be using the CF and UGRID conventions. I'm not sure, but this data may become very large if we split it per variable, given the metadata overheads.
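A minimal sketch of how the proposed directory and filename templates could be assembled from a set of facets, using the SCHISM values from the example path above. The function names and snake_case facet keys are hypothetical, and no time range is taken from the thread.

```python
from pathlib import PurePosixPath

ROOT = PurePosixPath("/g/data/ia39/australian-climate-service")

# Facet order from the proposed DRS (keys are hypothetical snake_case names).
DIRECTORY_FACETS = (
    "status", "activity", "product", "domain", "rcm_institution",
    "gcm_model_name", "experiment", "ensemble_member", "rcm_model_name",
    "rcm_version_id", "frequency", "variable_name",
)
FILENAME_FACETS = (
    "variable_name", "domain", "gcm_model_name", "experiment",
    "ensemble_member", "rcm_model_name", "rcm_version_id", "frequency",
)


def drs_directory(facets):
    """Join the directory facets, in DRS order, under the project root."""
    return ROOT.joinpath(*(facets[name] for name in DIRECTORY_FACETS))


def drs_filename(facets, time_range=None):
    """Join the filename facets with underscores; the time range is optional."""
    parts = [facets[name] for name in FILENAME_FACETS]
    if time_range is not None:
        parts.append(time_range)
    return "_".join(parts) + ".nc"


# Values copied from the example SCHISM path above.
schism = {
    "status": "release",
    "activity": "WP3",
    "product": "output",
    "domain": "national_mesh",
    "rcm_institution": "CSIRO",
    "gcm_model_name": "ACCESS-CM2",
    "experiment": "historical",
    "ensemble_member": "r1i1p1f1",
    "rcm_model_name": "SCHISM",
    "rcm_version_id": "5.10_v1",
    "frequency": "1hr",
    "variable_name": "Hs",
}

directory = drs_directory(schism)
filename = drs_filename(schism)
```

One thing this makes visible: `national_mesh` and `5.10_v1` both contain underscores, so the resulting filename can no longer be split back into facets on `_` alone. Hyphenated values might avoid that, if parsing filenames is something downstream tools are expected to do.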

Proposed additions are:
product = coastal_hazards OR WP3 OR ???
domain = national_mesh ??? It's a national unstructured mesh. WW3 will use an SMC grid. Neither will be CORDEX "domains".
RCM-model-name = SCHISM, WWIII (others may also be required?)

hot007 commented 1 year ago

If this seems agreeable to the group, let me know and I'll socialise it with the WP3 folk (specifically the SCHISM modelling bits; I think we should ignore the new WWIII hindcast for these purposes, as I suspect it won't be going to ia39 at any point).

hot007 commented 1 year ago

Question: is the exclusion of RCM-institution from the file names intentional, given it appears to depart from the CORDEX convention?

hot007 commented 1 year ago

Oh, also we'll definitely need to add a bunch of new variable_names, TBD. Some will match existing CMIP variables so we can use them where possible (e.g. currents) but I suspect any variables associated with waves or sediment will be new to the vocab.

hot007 commented 2 months ago

Our datasets have been produced and are now in validation, ready to be prepared for publication. It'd be great to talk to @DamienIrving and others in the code & data group to confirm the above and whether we want the coastal data to adhere to the same standards. We're planning to post-process and publish by August...

DamienIrving commented 2 months ago

It's worth noting that for the CORDEX-CMIP6 activity (i.e. the bias correction) we stuck very closely to the CORDEX-style DRS defined by data_standards.md. e.g.

/g/data/ia39/australian-climate-service/test-data/CORDEX-CMIP6/bias-adjustment-output/AGCD-05i/BOM/CMCC-ESM2/ssp126/r1i1p1f1/BARPA-R/v1-r1-ACS-QME-AGCD-1960-2022/day/tasmaxAdjust/tasmaxAdjust_AGCD-05i_CMCC-ESM2_ssp126_r1i1p1f1_BOM_BARPA-R_v1-r1-ACS-QME-AGCD-1960-2022_day_20380101-20381231.nc

That CORDEX-style DRS probably also makes sense for your coastal data since you're also taking CMIP6 data and pushing it through a different model (at least from your examples above it looks like that DRS works out well).
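For what it's worth, the bias-adjustment filename above splits cleanly on underscores because none of its facet values contain one. The sketch below assumes the facet order read off that single example (it has not been checked against data_standards.md), and the function name and snake_case keys are illustrative only.

```python
from pathlib import PurePosixPath

# Facet order inferred from the one example filename above (an assumption).
FILENAME_FACETS = (
    "variable_name", "domain", "gcm_model_name", "experiment",
    "ensemble_member", "rcm_institution", "rcm_model_name",
    "rcm_version_id", "frequency", "time_range",
)


def parse_filename(path):
    """Split a CORDEX-style filename into a {facet: value} dict."""
    name = PurePosixPath(path).name
    if not name.endswith(".nc"):
        raise ValueError(f"not a NetCDF filename: {name!r}")
    parts = name[:-3].split("_")
    if len(parts) != len(FILENAME_FACETS):
        raise ValueError(
            f"expected {len(FILENAME_FACETS)} facets, got {len(parts)}: {name!r}"
        )
    return dict(zip(FILENAME_FACETS, parts))


facets = parse_filename(
    "tasmaxAdjust_AGCD-05i_CMCC-ESM2_ssp126_r1i1p1f1_BOM_BARPA-R_"
    "v1-r1-ACS-QME-AGCD-1960-2022_day_20380101-20381231.nc"
)
```

Note that this example carries the institution (`BOM`) between the ensemble member and the RCM name, which is relevant to the earlier question about RCM-institution in file names.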

We had to completely abandon that CORDEX-style DRS for the QDC-CMIP6 activity since there's no regional model in that case, so if necessary I think we can also live with different DRSs for different activities. e.g.

/g/data/ia39/australian-climate-service/test-data/QDC-CMIP6/BARRA-R2/CESM2/ssp245/r11i1p1f1/day/pr/2035-2064/pr_day_CESM2_ssp245_r11i1p1f1_AUS-11_2059_qdc-multiplicative-q1000-linear_BARRA-R2-baseline-1985-2014_model-baseline-1985-2014.nc

hot007 commented 2 months ago

Thanks @DamienIrving, those are useful examples. @janomecopter, this issue thread and the related page (https://github.com/AusClimateService/data-code-group/blob/main/data_standards.md) describe the structure we need to post-process the CCHaPS data into. I can work with you on this :) We'll close this issue once we've got a working prototype.