Open anton-seaice opened 3 months ago
Is there a CMIP7 format for these filenames we should go straight to?
In MOM, the diag_table provides the user total control over what the output filenames are. So to "impose" a format we could write it into @aekiss's make_diag_table tool (and possibly add this to om3-scripts?). It's far more convenient to use this tool than to manually edit the diag_table anyway.
Is there a CMIP7 format for these filenames we should go straight to?
I think this would require always having a single file per variable, which I don't think suits our use case (e.g. the ice data)?
The standard ACCESS-OM2 configs use one file per MOM5 variable, with a standard filename format that was carefully designed to be self-explanatory. As @dougie mentioned, the required diag_table is generated by make_diag_table.py to avoid all the tedious and error-prone boilerplate of doing it manually. Note that make_diag_table.py can also put multiple variables in each file if preferred (this is done with scalars in ACCESS-OM2).
It would be nice to have something like make_diag_table.py for CICE output, but I'm not sure how it could be done, since CICE doesn't use diag_table. But if we're going to define a new output filename format we should at least make it similar to what we do for MOM6.
I was trying to tag Arnold Sullivan into this, but he isn't linked to this GitHub, so I will have to send him an email. Yes @aekiss, you did switch to single files for ACCESS-OM2, but we would prefer it back as a single file for the existing CMIP6 post-processing software package. I guess the software is still under discussion in the evaluation group, but it will be at that stage that the standard CMIP7 filenames, which like the CMIP6 ones will be very long to use, will be set up.
CICE has an entirely different way of setting up its output fields. We are talking about updating the ACCESS ESM1.6 to the CICE6.5 diagnostics (using some of the software from those routines) so we can use the same post-processing, but that's still to be actioned.
I will contact Arnold about the discussion.
OK thanks for the heads-up re. the possibility of single files. At this stage we are working out defaults for standard configs for development and eventually users of ACCESS-OM3. Things may need to be set up differently in the CM3 and ESM3 CMIP7 production configs.
@kieranricardo @martindix - is it worth trying to harmonize naming conventions with the UM / CABLE?
We try to push very hard for the community to use the CMORised data, not raw data.
CMIP table_id already has "frequency" tabs.
It is a good approach if the model output can be directly converted to CMIP format (cmorise). If we had APP5 to handle it, then that would be great. The primary purpose is to submit the data, right? It is not just for our local community to use.
Test: if we can use the new format to calculate the CMIP standard variable by using the new APP5 or other cmorise tool, then that is a great approach:
msftyrho,yes,ty_trans_rho ty_trans_rho_gm,"meridionalOverturning(var,'full')",kg s-1,dropX basin gridlat,,both,ocean,
msftmrho,yes,ty_trans_rho ty_trans_rho_gm,"meridionalOverturning(var,'full')",kg s-1,dropX basin,,both,ocean,
msftyz,yes,ty_trans ty_trans_gm ty_trans_submeso,"meridionalOverturning(var,'full')",kg s-1,dropX basin gridlat,,both,ocean,
msftmz,yes,ty_trans ty_trans_gm ty_trans_submeso,"meridionalOverturning(var,'full')",kg s-1,dropX basin,,both,ocean,
sisnthick,yes,sisnthick,,m,,,CM2,seaIce,
sispeed,yes,sispeed,,m/s,,,CM2,seaIce,
sistrxdtop,yes,sistrxdtop,,N m-2,,down,CM2,seaIce,
sistrydtop,yes,sistrydtop,,N m-2,,down,CM2,seaIce,
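For anyone wanting to inspect rows like these, here is a minimal sketch using Python's standard csv module. The column meanings (first field is the CMIP name, third is the space-separated model variables, fourth is the calculation) are my reading of the rows above, not a documented schema, and `read_mapping` is a hypothetical helper, not part of APP4.

```python
import csv
import io

# Two rows copied from the mapping excerpt above.
rows = '''msftyrho,yes,ty_trans_rho ty_trans_rho_gm,"meridionalOverturning(var,'full')",kg s-1,dropX basin gridlat,,both,ocean,
sisnthick,yes,sisnthick,,m,,,CM2,seaIce,
'''

def read_mapping(text):
    """Yield (cmip_name, model_variables, calculation) for each row.

    Column positions are an assumption based on the rows shown above.
    """
    for fields in csv.reader(io.StringIO(text)):
        if not fields:
            continue
        yield fields[0], fields[2].split(), fields[3]

mappings = list(read_mapping(rows))
```

The csv reader keeps the quoted calculation as a single field even though it contains a comma, which a plain `split(',')` would not.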
I think @aekiss is correct: it's probably going to need different approaches for different communities. The COSIMA community is still using 'the cookbook', though perhaps now @dougiesquire prefers going through the intake catalogue if the data is set up in that format, whilst the CMIP7 data needs to be cmorized. The intake catalogue can handle cmorized data. I guess this will all be discussed more in the evaluation working group and in the wider community at the ACCESS-NRI workshop.
CMIP table_id already has "frequency" tabs.
In theory, we can hope that data is accessed through the intake catalogue, or through other tools (we know this isn't true, of course). Through the metadata in the source data, those tools will provide the frequency to the user. So to the end user the filename shouldn't matter. I think we are setting the filename as a convenience to developers (and maybe it's used by intake).
sisnthick,yes,sisnthick,,m,,,CM2,seaIce,
sispeed,yes,sispeed,,m/s,,,CM2,seaIce,
sistrxdtop,yes,sistrxdtop,,N m-2,,down,CM2,seaIce,
sistrydtop,yes,sistrydtop,,N m-2,,down,CM2,seaIce,
CICE6 has an option to enable output using the CMIP variable names, so I think this will work for sea ice output in OM3. But I haven't tested it.
MOM6 can also output using CMOR names https://mom6.readthedocs.io/en/main/api/generated/pages/Diagnostics.html#apis-for-diagnostics
Many thanks @anton-seaice @aekiss,
I have no further questions. I believe that the MOM output with extra frequency can still be handled using the APP (CMORise package).
@aekiss, I think MOM6 CMOR format outputs should just be for one-to-one variables, e.g., temp -> thetao, temp[:, :, :, 0] -> tos. For specific cases like hdfs, heat budget analysis, or different basins, we would need APP4 to handle it.
It would be great to ask the MED team (Romain) to double-check this. But at this stage, the ocean output in frequency format is fine.
ping @rbeucher
That would be great. Anything that can alleviate the need for loading multiple files would be good
@rbeucher once they introduce the new frequency for the ocean output, we then need to change APP4 and maybe MOPPER
elif realm == 'ocean':
    if 'scalar' in axes_modifier:
        file_structure='/ocn/ocean_scalar.nc-*'
    elif freq == 'mon':
        file_structure='/ocn/ocean_month.nc-*'
    elif freq == 'yr':
        if access_version.find('OM2') != -1:
            if axes_modifier.find('mon2yr') != -1:
                file_structure='/ocn/ocean_month.nc-*'
            else:
                file_structure='/ocn/ocean_budget.nc-*'
                if exptoprocess == '025deg_jra55_iaf_omip2_cycle6':
                    axes_modifier='{} mon2yr'.format(axes_modifier)
        else:
            file_structure='/ocn/ocean_month.nc-*'
    elif freq == 'fx':
        #if access_version.find('OM2') != -1:
        #    file_structure='/ocn/ocean_grid.nc-*'
        #else:
        file_structure='/ocn/ocean_month.nc-*'
    elif freq == 'day':
        file_structure='/ocn/ocean_daily.nc-*'
    else:
        #Unknown ocean frequency
        file_structure=None
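If the frequency ends up in the filename, a lookup like the one above could in principle collapse to a single pattern. The following is only a sketch under assumptions: the `/ocn/` prefix is borrowed from the excerpt, while the `access-om3.mom6.h.*.<freq>.nc` layout and the `ocean_file_structure` helper are hypothetical and not the actual APP4 or MOPPER code.

```python
import fnmatch

# Frequency labels quoted in this thread as the CMIP7 names.
OCEAN_FREQUENCIES = {"fx", "subhr", "hr", "day", "mon", "yr", "dec"}

def ocean_file_structure(freq):
    """Return a glob pattern for ocean output at `freq`, or None if unknown.

    The filename layout is an assumption made for illustration only.
    """
    if freq not in OCEAN_FREQUENCIES:
        return None
    return f"/ocn/access-om3.mom6.h.*.{freq}.nc"

# Example: check a made-up path against the monthly pattern.
pattern = ocean_file_structure("mon")
is_monthly = fnmatch.fnmatch("/ocn/access-om3.mom6.h.1900-01.mon.nc", pattern)
```

The scalar and budget special cases in the existing code would still need their own handling; this only covers the frequency branch.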
Linking to related discussion: https://github.com/COSIMA/access-om3/issues/190#issuecomment-2251936713
@aidanheerdegen following our discussion this morning. I think we can change APP4 and MOPPER.
I think we've finalised a format - see here
@dougiesquire has suggested we include the frequency of output in the filenames of history/diagnostic output. For example, instead of
access-om3.cice.h.1900-01-01.nc
it could be
access-om3.cice.h.1900-01-01.day.nc
CMIP7 uses these names:
fx, subhr, hr, day, mon, yr, dec
This has a few advantages:
access-om3.cice.h.1900-01.day.nc, which is clearly different to access-om3.cice.h.1900-01.month.nc
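A minimal sketch of how a tool might read the frequency back out of such a filename and check it against the CMIP7 labels listed above. The regular expression encodes my assumption about the layout (`<model>.<component>.h.<date>.<frequency>.nc`), and `parse_history_filename` is a hypothetical helper, not an existing utility.

```python
import re

# Frequency labels quoted in this thread as the CMIP7 names.
CMIP_FREQUENCIES = ("fx", "subhr", "hr", "day", "mon", "yr", "dec")

# Assumed layout: <model>.<component>.h.<date>.<frequency>.nc
FILENAME_RE = re.compile(
    r"^(?P<model>[^.]+)\.(?P<component>[^.]+)\.h\."
    r"(?P<date>\d{4}(?:-\d{2}){0,2})\.(?P<freq>[^.]+)\.nc$"
)

def parse_history_filename(name):
    """Split a history filename into its parts, or raise ValueError."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognised filename: {name}")
    parts = match.groupdict()
    if parts["freq"] not in CMIP_FREQUENCIES:
        raise ValueError(f"unknown frequency label: {parts['freq']}")
    return parts

parts = parse_history_filename("access-om3.cice.h.1900-01-01.day.nc")
```

With this check, a label such as `month` would be rejected in favour of `mon`, and a filename with no frequency token would not match at all.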
At the same time we could harmonise how the dates are written?
e.g. MOM uses "_" when cice uses "-" (access-om3.mom6.h.native_1945_05.nc)
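To illustrate what harmonising the date separators could look like, here is a small sketch that rewrites `YYYY_MM[_DD]` and `YYYY-MM[-DD]` stamps with one separator. The choice of "-" as the default and the `harmonise_date` helper are assumptions for illustration; in practice the format would be set where the models write their output rather than by renaming afterwards.

```python
import re

# Matches a 4-digit year, 2-digit month and optional 2-digit day,
# joined by "-" or "_", and not embedded in a longer run of digits.
DATE_RE = re.compile(r"(?<!\d)(\d{4})[-_](\d{2})(?:[-_](\d{2}))?(?!\d)")

def harmonise_date(name, sep="-"):
    """Rewrite the date stamp in `name` so its parts are joined by `sep`."""
    def _join(match):
        return sep.join(part for part in match.groups() if part)
    return DATE_RE.sub(_join, name)

mom_name = harmonise_date("access-om3.mom6.h.native_1945_05.nc")
cice_name = harmonise_date("access-om3.cice.h.1900-01-01.nc")
```

Only the date stamp is touched, so the underscore in `native_` is left alone.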