Closed Kirill888 closed 1 year ago
Thanks for the awesome write up Kirill!
If we're going to store time as a range, we need to fix this. Ingestion shouldn't be throwing away data.
Can @jeremyh or anyone else remind me what the benefits are of storing time as a range? It increases complexity over storing time as a single value, so it would be good to have a documented justification.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Ingestion will be deprecated in Datacube v1.9 and removed in v2, so this will not be fixed.
Expected behaviour
dc.Dataset has a property .time, which is a time range covering the capture period from the earliest pixel to the latest. The ingestion process generates one or more datasets containing part or all of the original dataset's data, reprojected according to the GridSpec. I expect the .time property of the ingested datasets to be the same as that of the input dataset.
Actual behaviour
The time range of the ingested dataset is a single point, as in ds.time[0] == ds.time[1], and is set to the mid-point of the original dataset's time interval.
Steps to reproduce the behaviour
On NCI you can check that ingested datasets have a single-point time range, even though they were ingested from data with a non-point time interval.
Running this on NCI:
produces:
Where it's broken
The ingestor uses this function to create a new dataset object:
https://github.com/opendatacube/datacube-core/blob/b6ca35143778aa5157d10247fe1645c0f9532961/datacube/model/utils.py#L176-L190
Notice how the only way to supply time information is via the center_time parameter; internally it's copied into the from_dt, to_dt and center_dt properties of the extent subtree of the metadata document.
Instead this should take time_range, copied from the parent datasource, maybe with an optional convenience parameter center_time for when the time range is a single point in time.