Effect of UTC dates on Statistical Product Time Boundaries
Datacube uses UTC timestamps
For paths 88-92, Landsat acquisitions over Australia sometimes happen around UTC midnight
For these datasets, the date component of the timestamp differs between UTC and local time
date(utc) != solar_day(utc)
Datacube does not adjust the time query to "local geographic time"
As a result, when you query Datacube for a given date on the east coast, you get:
Some of the observations that happened on that day
Some observations that happened on the next day
Also, some observations that DID happen on that day (Australian time) are not returned
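The mismatch between the UTC date and the local date can be reproduced without a database. The snippet below is a minimal sketch: the helper names `solar_offset` and `solar_date` are hypothetical, the acquisition time is made up for illustration, and the offset uses the usual approximation of one hour per 15 degrees of longitude rather than whatever Datacube does internally.

```python
from datetime import datetime, timedelta, timezone


def solar_offset(longitude_deg: float) -> timedelta:
    """Approximate offset of local solar time from UTC (1 hour per 15 degrees)."""
    return timedelta(hours=longitude_deg / 15.0)


def solar_date(timestamp_utc: datetime, longitude_deg: float):
    """Calendar date of a UTC timestamp as seen at the given longitude."""
    return (timestamp_utc + solar_offset(longitude_deg)).date()


# Hypothetical acquisition: 23:50 UTC on 31 Dec 2016, at 150 degrees east
acquired = datetime(2016, 12, 31, 23, 50, tzinfo=timezone.utc)

utc_date = acquired.date()
local_date = solar_date(acquired, 150.0)

print(utc_date, local_date)  # 2016-12-31 2017-01-01
```

A query for the UTC day 2016-12-31 would match this timestamp, even though the scene was observed on the morning of 1 January 2017 in local terms.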
What this means for annual statistical products computed over Australia: some datasets captured on 1-Jan-Year (local time) will be counted towards Year-1, the previous year.
Example Python to run in the sandbox:
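from datetime import timedelta
from datacube import Datacube
dc = Datacube()
# Query a single day; Datacube interprets the date as UTC
dss = dc.find_datasets(product='ga_ls8c_ard_3', time='2016-12-31')
# Acquisition dates as recorded, in UTC
dates_utc = {ds.center_time.date() for ds in dss}
# Approximate local dates, using a fixed UTC+10 offset for eastern Australia
dates_actual = {(ds.center_time + timedelta(hours=10)).date() for ds in dss}
print(f"UTC: {dates_utc}")
print(f"Local: {dates_actual}")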
This means that some datasets captured on 1-Jan-2017 are seen by Datacube as captured on 31-Dec-2016 instead.
I feel that for stats products, when a user defines the time period as 2016, what is meant is "local time 2016" and not "UTC 2016". For every output tile we should compute the timezone offset and query the database accordingly. To the best of my knowledge, the current stats code does not perform such a correction for the time component of the query.
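One possible shape for such a correction is sketched below. It is not taken from the stats code: `local_year_utc_range` is a hypothetical helper, the offset is approximated from a tile's central longitude at one hour per 15 degrees, and the returned range is meant to be read as half-open (start inclusive, end exclusive). How the bounds would be passed to the query, and how Datacube treats the endpoints, is left open here.

```python
from datetime import datetime, timedelta, timezone


def local_year_utc_range(year: int, longitude_deg: float):
    """Return the UTC (start, end) instants that bound a local calendar year.

    The local offset is approximated from longitude (1 hour per 15 degrees);
    a real implementation might use a proper timezone per tile instead.
    """
    offset = timedelta(hours=longitude_deg / 15.0)
    # Local midnight on 1 January, shifted back to the equivalent UTC instant
    start_utc = (datetime(year, 1, 1) - offset).replace(tzinfo=timezone.utc)
    end_utc = (datetime(year + 1, 1, 1) - offset).replace(tzinfo=timezone.utc)
    return start_utc, end_utc


# Hypothetical tile centred at 150 degrees east, statistics for "local 2016"
start_utc, end_utc = local_year_utc_range(2016, 150.0)
print(start_utc.isoformat())  # 2015-12-31T14:00:00+00:00
print(end_utc.isoformat())    # 2016-12-31T14:00:00+00:00
```

With these bounds, an acquisition at 23:50 UTC on 31 December 2016 falls outside the 2016 range and would instead belong to 2017, which matches its local date.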
Datasets captured over the east coast of Australia on 1-Jan-2017 are included in the annual statistics for the year 2016.
Example dataset: https://explorer.dea.ga.gov.au/dataset/6b0472ff-6e1f-434f-8313-37a2ee183924