Closed aekiss closed 3 months ago
It would be nice to modify make_diag_table.py
so that this behaviour can be set as an option via diag_table_source.yaml
.
Also, perhaps we can generalise by allowing file_name_separator
to be an array of alternative separators (in this case ['.', '']
) and specify when to apply which separator?
Even neater, how about supporting this yaml format, where file_name
can contain arrays whose names specify the separator to be applied before each element?
file_name: # file name (without trailing ".nc") is constructed from these components,
# separated by file_name_separator, as per https://github.com/COSIMA/access-om2/issues/185
# All these components (other than field_name) must be names that exist in global_defaults.
# If file_name_date or file_name_date_section are used, it must be the last item.
- "_":
- file_name_prefix
- file_name_dimension
- field_name # substituted by field name in diag_table section below
- output_freq
- "":
- output_freq_units
- "_":
- reduction_method
- file_name_date
This actually makes file_name_separator
redundant, but we should retain it for backward compatability. Perhaps the rule should be: string items in file_name
are separated by file_name_separator
, but keys whose values are arrays have the array elements separated by the array key as above. With this approach we could write
file_name_separator: "-" # used to separate filename components; best not to use "_" to avoid confusion with fields and dates
file_name: # file name (without trailing ".nc") is constructed from these components,
# separated by file_name_separator, as per https://github.com/COSIMA/access-om2/issues/185
# All these components (other than field_name) must be names that exist in global_defaults.
# If file_name_date or file_name_date_section are used, it must be the last item.
- file_name_prefix
- file_name_dimension
- field_name # substituted by field name in diag_table section below
- output_freq
- "":
- output_freq_units
- reduction_method
- file_name_date
meaning all elements except output_freq_units
are separated by file_name_separator
.
Thanks for your PR @minghang-li, sorry I didn't explain but I'm working on an implementation of the proposal here, which I think is easier for a user to understand.
Hi @aekiss, I think this PR also supports a single separator to maintain the backward compatability and a list of separators to ensure flexibility. It applies not just to output_freq_units
and output_freq
, but also addresses the aesthetic issue proposed in https://github.com/COSIMA/access-om2/issues/185#issuecomment-632428116
To clarify, what I meant to address the aesthetic issue, it is something like this, since we intend to remove ym
before the date.
ocean-3d-temp-1-monthly-mean-ymd_1958_10_16.nc
--> ocean-3d-temp-1monthly-mean_1958_10_16.nc
Thanks, but I think it's better interface design to specify optional custom separators via dict elements within the file_name
list as shown above, rather than as a separate file_name_separator
array of separators whose length needs to agree with the length of file_name
(which can be tricky if file_name_omit_empty
is true). I'll put together a PR to show you what I mean.
have a look at this PR https://github.com/COSIMA/make_diag_table/issues/7
Sometimes we want an alternative separator (e.g. no separator) between particular components - e.g. see https://github.com/COSIMA/access-om3/issues/190#issuecomment-2252030909 - how to implement this?