Closed forman closed 3 years ago
This wouldn't work for the CMEMS CHL data: its metadata looks like this:
start_date: 2020-04-04
stop_date: 2020-04-04
start_time: 08:00:00 UTC
stop_time: 14:00:00 UTC
I haven't tested the PR with these datasets yet, but as far as I can see from the source it would ignore the _date
keys and overwrite the _time
keys with %Y-%m-%d %H:%M:%S
-formatted strings.
Also, what about spatial metadata? Mentioned in #20 but not addressed in this PR as far as I can tell.
This wouldn't work for the CMEMS CHL data: its metadata looks like this:
start_date: 2020-04-04 stop_date: 2020-04-04 start_time: 08:00:00 UTC stop_time: 14:00:00 UTC
I haven't tested the PR with these datasets yet, but as far as I can see from the source it would ignore the
_date
keys and overwrite the_time
keys with%Y-%m-%d %H:%M:%S
-formatted strings.Also, what about spatial metadata? Mentioned in #20 but not addressed in this PR as far as I can tell.
Ok, this is very special, I hate that :) If seen start_time, stop_time pairs often (but then containing datetime value) and your case only once (CMEMS CHL). I then suggest to not touch start_time, stop_time at all and update those with a postprocessor, or maybe a new custom finalizer.
If seen start_time, stop_time pairs often (but then containing datetime value) and your case only once (CMEMS CHL).
This is the case for all 10 CHL datasets on Creodias -- possible for all 60 CMEMS datasets, but I haven't checked. Then again, that in itself isn't a strong argument against using a custom processor, since the processor only needs to be written once. So I have nothing against externalizing the CMEMS fix-up into a processor, especially since there are probably other unusual cases out there which will have to be handled specially too.
We'll need to implement a new processor hook (separate issue/PR of course). The processor function needs simultaneous access to both the existing dataset and the to-be-appended dataset, which none of the current ones get as far as I know.
We'll need to implement a new processor hook (separate issue/PR of course).
Yes.
The processor function needs simultaneous access to both the existing dataset and the to-be-appended dataset,
Why do you need both?
Why do you need both?
Hmm, actually perhaps not necessary if we're taking the times from the data itself, but I (and the CloudFerros) would also like to be able to calculate from the existing metadata, e.g. updated_start_date = min(old_dataset.attrs.start_date, new_dataset.attrs.start_date)
.
Closes #20 Closes #34
Whether to adjust output metadata (global attributes) after the last write/append can now be forced by the new
output.adjust_metadata
setting whose default isfalse
. If set totrue
, this will adjust the following metadata attributesIn addition, extra metadata (global attributes) can now be added after the last write/append using the new setting
output.metadata
whose value is a mapping from attribute names to values.The above functionality is also reflected in two new CLI options
--adjust-metadata
and--finalize-only
. In combination, they can be used to later adjust metadata in already existing Zarr datasets.