COSIMA / access-om3

ACCESS-OM3 global ocean-sea ice-wave coupled model

Include frequency in history/diagnostics output filenames #191

Open anton-seaice opened 1 month ago

anton-seaice commented 1 month ago

@dougiesquire has suggested we include the frequency of output in the filenames of history/diagnostic output. For example, instead of

access-om3.cice.h.1900-01-01.nc

it could be

access-om3.cice.h.1900-01-01.day.nc

CMIP7 uses these names: fx, subhr, hr, day, mon, yr, dec

This has a few advantages:

At the same time, could we harmonise how the dates are written?

e.g. MOM uses "_" whereas CICE uses "-" (access-om3.mom6.h.native_1945_05.nc)
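
A minimal sketch of the proposal above, assuming the frequency token sits just before the ".nc" suffix (as in the example filename) and that dates are harmonised on "-". The function name `history_filename`, its parameters, and the `prefix` default are hypothetical, chosen only for illustration; the sketch does not handle the "native_" label seen in the MOM example, and it is not the format that was eventually agreed.

```python
# Frequency names listed in this issue as the ones CMIP7 uses.
CMIP7_FREQUENCIES = ("fx", "subhr", "hr", "day", "mon", "yr", "dec")


def history_filename(component, date, freq, prefix="access-om3"):
    """Build a history filename that carries the output frequency.

    Hypothetical helper: the layout mirrors the example in this issue,
    access-om3.cice.h.1900-01-01.day.nc
    """
    if freq not in CMIP7_FREQUENCIES:
        raise ValueError(f"unknown frequency: {freq!r}")
    # Harmonise date separators: MOM-style "1945_05" becomes "1945-05".
    date = date.replace("_", "-")
    return f"{prefix}.{component}.h.{date}.{freq}.nc"
```

For example, `history_filename("cice", "1900-01-01", "day")` gives the filename suggested above, and a MOM-style date such as "1945_05" is rewritten with hyphens.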

anton-seaice commented 1 month ago

Is there a CMIP7 format for these filenames we should go straight to?

dougiesquire commented 1 month ago

In MOM, the diag_table provides the user total control over what the output filenames are. So to "impose" a format we could write it into @aekiss's make_diag_table tool (and possibly add this to om3-scripts?). It's far more convenient to use this tool than to manually edit the diag_table anyway.

dougiesquire commented 1 month ago

Is there a CMIP7 format for these filenames we should go straight to?

I think this would require always having a single file per variable, which I don't think suits our use case (e.g. the ice data)?

aekiss commented 1 month ago

The standard ACCESS-OM2 configs use one file per MOM5 variable, with a standard filename format that was carefully designed to be self-explanatory. As @dougie mentioned, the required diag_table is generated by make_diag_table.py to avoid all the tedious and error-prone boilerplate of doing it manually. Note that make_diag_table.py can also put multiple variables in each file if preferred (this is done with scalars in ACCESS-OM2).

It would be nice to have something like make_diag_table.py for CICE output, but I'm not sure how it could be done, since CICE doesn't use diag_table. But if we're going to define a new output filename format we should at least make it similar to what we do for MOM6.

ofa001 commented 1 month ago

I was trying to tag Arnold Sullivan into this, but he isn't linked to this GitHub so I will have to send him an email. Yes @aekiss, you did switch to single files for access-om2, but we would prefer it back as a single file for the existing CMIP6 post-processing software package. I guess the software is still under discussion in the evaluation group, but it will be at that stage that the standard CMIP7 filenames, which like CMIP6 will be very long to use, will be set up.

CICE has an entirely different way of setting up its output fields. We are talking about updating ACCESS ESM1.6 to the CICE6.5 diagnostics (using some of the software from those routines) so we can use the same post-processing, but that's still to be actioned.

I will contact Arnold about the discussion.

aekiss commented 1 month ago

OK thanks for the heads-up re. the possibility of single files. At this stage we are working out defaults for standard configs for development and eventually users of ACCESS-OM3. Things may need to be set up differently in the CM3 and ESM3 CMIP7 production configs.

anton-seaice commented 1 month ago

@kieranricardo @martindix - is it worth trying to harmonize naming conventions with the UM / CABLE?

ars599v2 commented 1 month ago

We try to push very hard for the community to use the CMORised data, not raw data.

CMIP table_id already has "frequency" tabs.

It is a good approach if the model output can be directly converted to CMIP format (cmorise). If we had APP5 to handle it, then that would be great. The primary purpose is to submit the data, right? It is not just for our local community to use.

Test: if we can use the new format to calculate the CMIP standard variable by using the new APP5 or other cmorise tool, then that is a great approach:

msftyrho,yes,ty_trans_rho ty_trans_rho_gm,"meridionalOverturning(var,'full')",kg s-1,dropX basin gridlat,,both,ocean,
msftmrho,yes,ty_trans_rho ty_trans_rho_gm,"meridionalOverturning(var,'full')",kg s-1,dropX basin,,both,ocean,
msftyz,yes,ty_trans ty_trans_gm ty_trans_submeso,"meridionalOverturning(var,'full')",kg s-1,dropX basin gridlat,,both,ocean,
msftmz,yes,ty_trans ty_trans_gm ty_trans_submeso,"meridionalOverturning(var,'full')",kg s-1,dropX basin,,both,ocean,

sisnthick,yes,sisnthick,,m,,,CM2,seaIce,
sispeed,yes,sispeed,,m/s,,,CM2,seaIce,
sistrxdtop,yes,sistrxdtop,,N m-2,,down,CM2,seaIce,
sistrydtop,yes,sistrydtop,,N m-2,,down,CM2,seaIce,

ofa001 commented 1 month ago

I think @aekiss is correct: it's probably going to need different approaches for different communities. The COSIMA community is still using 'the cookbook', though perhaps @dougiesquire now prefers going through the intake catalogue if the data is set up in that format, whilst the CMIP7 data needs to be cmorized. The intake catalogue can handle cmorized data. I guess this will all be discussed more in the evaluation working group and in the wider community at the ACCESS-NRI workshop.

anton-seaice commented 1 month ago

CMIP table_id already has "frequency" tabs.

In theory, we can hope that data is accessed through the intake catalogue or through other tools (we know this isn't true, of course). Those tools will provide the frequency to the user through the metadata in the source data, so to the end user the filename shouldn't matter. I think we are setting the filename as a convenience to developers (and maybe it's used by intake).

sisnthick,yes,sisnthick,,m,,,CM2,seaIce,
sispeed,yes,sispeed,,m/s,,,CM2,seaIce,
sistrxdtop,yes,sistrxdtop,,N m-2,,down,CM2,seaIce,
sistrydtop,yes,sistrydtop,,N m-2,,down,CM2,seaIce,

CICE6 has an option to enable output using the CMIP variable names, so I think this will work for sea ice output in OM3, but I haven't tested it.
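
On the point above that the filename is mostly a convenience for developers: a rough sketch of how a script might read the frequency back out of a filename, assuming the token sits just before ".nc" as in the example at the top of this issue. The helper name `frequency_from_filename` is hypothetical, and this is not how intake or any existing tool does it.

```python
# Frequency names listed in this issue as the ones CMIP7 uses.
CMIP7_FREQUENCIES = ("fx", "subhr", "hr", "day", "mon", "yr", "dec")


def frequency_from_filename(filename):
    """Return the frequency token in a filename, or None if there is none.

    Assumes the proposed layout <prefix>.<component>.h.<date>.<freq>.nc;
    filenames in the current layout (no frequency token) return None.
    """
    parts = filename.split(".")
    if len(parts) >= 3 and parts[-1] == "nc" and parts[-2] in CMIP7_FREQUENCIES:
        return parts[-2]
    return None
```

A filename in the current layout, such as access-om3.cice.h.1900-01-01.nc, has the date as its second-last token, so the helper returns None rather than guessing.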

aekiss commented 1 month ago

MOM6 can also output using CMOR names https://mom6.readthedocs.io/en/main/api/generated/pages/Diagnostics.html#apis-for-diagnostics

ars599v2 commented 1 month ago

Many thanks @anton-seaice @aekiss,

I have no further questions. I believe that the MOM output with extra frequency can still be handled using the APP (CMORise package).

@aekiss, I think MOM6 CMOR-format outputs should just be for one-to-one variables, e.g., temp -> thetao, temp[:, :, :, 0] -> tos. For specific cases like hdfs, heat budget analysis, or different basins, we would need APP4 to handle it.

It would be great to ask the MED team (Romain) to double-check this. But at this stage, the ocean output in frequency format is fine.

aekiss commented 1 month ago

ping @rbeucher

rbeucher commented 1 month ago

That would be great. Anything that can alleviate the need for loading multiple files would be good

ars599v2 commented 1 month ago

@rbeucher once they introduce the new frequency for the ocean output, we then need to change APP4 and maybe MOPPER

            elif realm == 'ocean':
                if 'scalar' in axes_modifier:
                    file_structure='/ocn/ocean_scalar.nc-*'
                elif freq == 'mon':
                    file_structure='/ocn/ocean_month.nc-*'
                elif freq == 'yr':
                    if access_version.find('OM2') != -1:
                        if axes_modifier.find('mon2yr') != -1:
                            file_structure='/ocn/ocean_month.nc-*'
                        else:
                            file_structure='/ocn/ocean_budget.nc-*'
                            if exptoprocess == '025deg_jra55_iaf_omip2_cycle6':
                                axes_modifier='{} mon2yr'.format(axes_modifier)
                    else:
                        file_structure='/ocn/ocean_month.nc-*'
                elif freq == 'fx':
                    #if access_version.find('OM2') != -1:
                    #    file_structure='/ocn/ocean_grid.nc-*'
                    #else:
                    file_structure='/ocn/ocean_month.nc-*'
                elif freq == 'day':
                    file_structure='/ocn/ocean_daily.nc-*'
                else:
                    #Unknown ocean frequency
                    file_structure=None
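
If the frequency ends up in the filename, the branch above could in principle shrink to a pattern match. The following is only a sketch under that assumption, not APP4 or MOPPER code: the names `pattern_for` and `select_files` are hypothetical, the pattern assumes the frequency token sits just before ".nc", and the scalar and OM2-specific cases handled above are left out.

```python
from fnmatch import fnmatchcase

# Frequency names listed in this issue as the ones CMIP7 uses.
CMIP7_FREQUENCIES = ("fx", "subhr", "hr", "day", "mon", "yr", "dec")


def pattern_for(freq):
    """Return a glob-style pattern for a frequency, or None if unknown."""
    if freq not in CMIP7_FREQUENCIES:
        # Unknown frequency, mirroring file_structure=None above.
        return None
    return f"*.{freq}.nc"


def select_files(filenames, freq):
    """Keep only the filenames whose frequency token matches freq."""
    pattern = pattern_for(freq)
    if pattern is None:
        return []
    return [name for name in filenames if fnmatchcase(name, pattern)]
```

The design choice here is that the frequency is looked up from the name itself instead of from a hard-coded list of per-frequency file stems such as ocean_month or ocean_daily.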

aekiss commented 1 month ago

Linking to related discussion: https://github.com/COSIMA/access-om3/issues/190#issuecomment-2251936713

rbeucher commented 1 month ago

@aidanheerdegen following our discussion this morning. I think we can change APP4 and MOPPER.

dougiesquire commented 1 month ago

I think we've finalised a format - see here

aekiss commented 1 month ago

almost there - see here