Off by one day temporal binning error

Datasets captured over the east coast of Australia on 1-Jan-2017 (local time) are included in the annual statistics for 2016.

Example dataset: https://explorer.dea.ga.gov.au/dataset/6b0472ff-6e1f-434f-8313-37a2ee183924

Effect of UTC dates on Statistical Product Time Boundaries

Datacube uses UTC timestamps
Landsat acquisitions over Australia sometimes happen around UTC midnight (paths 88-92)
For these datasets the "date component" of the timestamp differs between UTC and local time
- date(utc) != solar_day(utc)
Datacube does not adjust the time query to "local geographic time"
As a result, when you query datacube for a given date on the east coast:
- You get some of the observations that happened on this day
- You also get some observations that happened on the next day (local time)
- Some observations that DID happen on this day (Australia time) are not returned
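
The three cases above can be reproduced without a datacube instance. The sketch below is only an illustration: the acquisition times are made up, `query_by_utc_date` is a hypothetical stand-in for a date query that matches on the UTC date, and local time is assumed to be a fixed UTC+10 with no daylight saving.

```python
from datetime import date, datetime, timedelta, timezone

# Assumption: fixed UTC+10 offset for the east coast, daylight saving ignored.
LOCAL = timezone(timedelta(hours=10))

# Made-up acquisition times (UTC) for passes close to UTC midnight.
acquisitions_utc = [
    datetime(2016, 12, 30, 23, 50, tzinfo=timezone.utc),  # 31-Dec-2016 local
    datetime(2016, 12, 31, 0, 10, tzinfo=timezone.utc),   # 31-Dec-2016 local
    datetime(2016, 12, 31, 23, 50, tzinfo=timezone.utc),  # 1-Jan-2017 local
]


def query_by_utc_date(times, day):
    """Hypothetical stand-in for a date query: match on the UTC date only."""
    return [t for t in times if t.date() == day]


day = date(2016, 12, 31)
hits = query_by_utc_date(acquisitions_utc, day)

# Local dates of what the query returned: one on this day, one on the next day.
local_dates = [t.astimezone(LOCAL).date() for t in hits]

# Observations that happened on this day in local time but were not returned.
missed = [
    t for t in acquisitions_utc
    if t.astimezone(LOCAL).date() == day and t not in hits
]

print("returned, local dates:", local_dates)
print("missed:", missed)
```

With these made-up timestamps the query returns two observations, one of which falls on 1-Jan-2017 local time, and it misses the observation taken at 23:50 UTC on 30-Dec-2016.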

What this means for annual statistical products computed over Australia: some datasets that were captured on 1-Jan-Year (local time) will be counted towards Year-1.
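
A minimal sketch of the same effect at the annual level, again with made-up timestamps and an assumed fixed UTC+10 offset: both observations below were taken on 1-Jan-2017 local time, yet binning by the UTC year puts one of them in 2016.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Assumption: fixed UTC+10 offset, daylight saving ignored.
LOCAL = timezone(timedelta(hours=10))

# Made-up acquisition times (UTC); both are 1-Jan-2017 in local time.
times_utc = [
    datetime(2016, 12, 31, 23, 50, tzinfo=timezone.utc),
    datetime(2017, 1, 1, 0, 10, tzinfo=timezone.utc),
]

# Annual bins keyed by the UTC year versus the local year.
by_utc_year = Counter(t.year for t in times_utc)
by_local_year = Counter(t.astimezone(LOCAL).year for t in times_utc)

print("UTC bins:  ", dict(by_utc_year))
print("Local bins:", dict(by_local_year))
```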

Example Python to run in the sandbox:

from datacube import Datacube
from datetime import timedelta

dc = Datacube()
dss = dc.find_datasets(product='ga_ls8c_ard_3', time='2016-12-31')
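# Date of each acquisition as stored by datacube (UTC)
dates_utc = {ds.center_time.date() for ds in dss}
# Same acquisitions shifted by +10 h, a fixed approximation of east-coast local time (ignores daylight saving)
dates_actual = {(ds.center_time + timedelta(hours=10)).date() for ds in dss}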
print(f"UTC: {dates_utc}")
print(f"Local: {dates_actual}")

Output:

UTC: {datetime.date(2016, 12, 31)}
Local: {datetime.date(2016, 12, 31), datetime.date(2017, 1, 1)}

This means that some datasets captured on 1-Jan-2017 (local time) are seen by datacube as captured on 31-Dec-2016.

I feel that for stats products, when a user defines the time period as 2016, what is meant is "local time 2016" and not "UTC 2016". For every output tile we should compute the timezone offset and query the database accordingly. To the best of my knowledge, the current stats code does not perform such a correction for the time component of the query.
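
One way the proposed correction could look is sketched below. It is not the datacube-stats implementation: `solar_offset` and `utc_range_for_local_year` are hypothetical helpers, and the offset is assumed to be the local solar time at the tile's centre longitude (15 degrees per hour) rather than a civil timezone.

```python
from datetime import datetime, timedelta


def solar_offset(longitude_deg):
    """Approximate local solar time offset from UTC: 15 degrees per hour."""
    return timedelta(hours=longitude_deg / 15.0)


def utc_range_for_local_year(year, longitude_deg):
    """Return (start, end) in UTC covering `year` in local solar time.

    Both values are naive datetimes meant to be read as UTC; the end is
    intended as an exclusive bound.
    """
    offset = solar_offset(longitude_deg)
    start_local = datetime(year, 1, 1)
    end_local = datetime(year + 1, 1, 1)
    # Local time = UTC + offset, so UTC = local time - offset.
    return start_local - offset, end_local - offset


# Hypothetical tile centred on longitude 150 E (offset of 10 hours).
start, end = utc_range_for_local_year(2016, 150.0)
print(start, end)
```

For a tile centred at 150 E this gives a UTC range starting at 14:00 on 31-Dec-2015 and ending at 14:00 on 31-Dec-2016. The pair could then be used as the time range of the per-tile query. Whether the query API treats the end point as inclusive is not assumed here, so results may still need filtering against the exclusive end.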

opendatacube / datacube-stats

Off by one day temporal binning error #193