opendatacube / datacube-core

Open Data Cube analyses continental scale Earth Observation data through time
http://www.opendatacube.org
Apache License 2.0

Ingest collapses time range to a single point #420

Closed Kirill888 closed 1 year ago

Kirill888 commented 6 years ago

Expected behaviour

dc.Dataset has a property .time, which is a time range covering the capture period from the earliest pixel to the latest. The ingestion process generates one or more datasets containing part or all of the original dataset's data, reprojected according to the GridSpec.

I expect the .time property of the ingested datasets to be the same as that of the input dataset.

Actual behaviour

The time range of the ingested dataset is a single point (ds.time[0] == ds.time[1]), set to the mid-point of the original dataset's time interval.

Steps to reproduce the behaviour

On NCI you can check that ingested datasets have a single-point time range, even though they were ingested from data with a non-point time interval.

Running this on NCI:

import datacube
dc = datacube.Datacube()
ds = dc.index.datasets.get('e999002e-71c6-46ee-9032-ad94478926e9', include_sources=True)
print('ingested:', ds.time)
print('original:', ds.sources['0'].time)

produces:

ingested: Range(begin=datetime.datetime(2018, 2, 1, 0, 7, 7), end=datetime.datetime(2018, 2, 1, 0, 7, 7))
original: Range(begin=datetime.datetime(2018, 2, 1, 0, 6, 51), end=datetime.datetime(2018, 2, 1, 0, 7, 23))
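
As a sanity check on the mid-point claim, here is a minimal sketch that needs no datacube install or NCI access. The `Range` namedtuple is a stand-in for the type printed above (an assumption: only the `begin` and `end` fields are modelled), and `midpoint` is a hypothetical helper, not a datacube function.

```python
from collections import namedtuple
from datetime import datetime

# Stand-in for the Range type shown in the output above (assumption:
# only the begin/end fields matter for this check).
Range = namedtuple('Range', ('begin', 'end'))


def midpoint(time_range):
    """Return the instant halfway between begin and end (hypothetical helper)."""
    return time_range.begin + (time_range.end - time_range.begin) / 2


# Values copied from the output above.
original = Range(datetime(2018, 2, 1, 0, 6, 51), datetime(2018, 2, 1, 0, 7, 23))
ingested = Range(datetime(2018, 2, 1, 0, 7, 7), datetime(2018, 2, 1, 0, 7, 7))

center = midpoint(original)
print(center)                              # 2018-02-01 00:07:07
print(ingested == Range(center, center))   # True
```

The 32-second capture interval is reduced to its centre, 16 seconds after the start, which matches the ingested value.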

Where it's broken

The ingestor uses this function to create a new dataset object:

https://github.com/opendatacube/datacube-core/blob/b6ca35143778aa5157d10247fe1645c0f9532961/datacube/model/utils.py#L176-L190

Notice that the only way to supply time information is via the center_time parameter; internally it is copied into the from_dt, to_dt and center_dt properties of the extent subtree of the metadata document.

Instead, this should take a time_range, copied from the parent datasource, perhaps with an optional convenience parameter center_time for when the time range is a single point in time.
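
A rough sketch of what such a signature could look like, for discussion only. The helper name `make_extent_time` is hypothetical and is not part of datacube; storing the values as ISO 8601 strings and deriving center_dt as the mid-point when it is not supplied are assumptions rather than existing behaviour.

```python
from datetime import datetime


def make_extent_time(time_range=None, center_time=None):
    """Build the time fields of the extent subtree (hypothetical helper).

    time_range:  (begin, end) tuple of datetimes, copied from the source dataset.
    center_time: optional convenience for single-point ranges; if time_range
                 is omitted, the range collapses to this one instant.
    """
    if time_range is None:
        if center_time is None:
            raise ValueError('supply time_range or center_time')
        time_range = (center_time, center_time)

    begin, end = time_range
    if begin > end:
        raise ValueError('time_range begin must not be after end')

    if center_time is None:
        # Assumption: default the centre to the mid-point of the range.
        center_time = begin + (end - begin) / 2

    # Assumption: values are stored as ISO 8601 strings.
    return {
        'from_dt': begin.isoformat(),
        'to_dt': end.isoformat(),
        'center_dt': center_time.isoformat(),
    }


# Range preserved from the source dataset.
source_range = (datetime(2018, 2, 1, 0, 6, 51), datetime(2018, 2, 1, 0, 7, 23))
print(make_extent_time(time_range=source_range))

# Single-point convenience form, equivalent to the current behaviour.
print(make_extent_time(center_time=datetime(2018, 2, 1, 0, 7, 7)))
```

With this shape, callers that only have a single timestamp keep working, while the ingestor can pass the source dataset's range through unchanged.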

omad commented 6 years ago

Thanks for the awesome write up Kirill!

If we're going to store time as a range, we need to fix this. Ingestion shouldn't be throwing away data.

Can @jeremyh or anyone else remind me what the benefits are of storing time as a range? It increases complexity over storing time as a single value, so it would be good to have a documented justification.

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

omad commented 1 year ago

Ingestion will be deprecated in Datacube v1.9 and removed in v2, so this will not be fixed.