GeoscienceAustralia / digitalearthau

Code and tools for Digital Earth Australia (a deployment of Open Data Cube)
https://geoscienceaustralia.github.io/digitalearthau/

Collection 3 WO data needs reindexing from AWS #326

Open omad opened 1 year ago

omad commented 1 year ago

The Problem

While looking for sample data for https://github.com/opendatacube/datacube-explorer/pull/487, I noticed that some ga_ls_wo_3 data indexed in the Production AWS database is missing thumbnail information.

This dataset was indexed from STAC, but the indexed record does not include the accessory file information.

image

Investigation

We've seen similar issues previously when metadata was regenerated to S3 but not reindexed. However, checking the object time on S3:

image

Against the indexed time in the database (via Explorer):

image

This shows that the data was indexed after the S3 metadata was written, so that is not the cause in this case.
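
The comparison above can be sketched as a small check. This is only an illustration: the function name and the timestamps are hypothetical, and in practice the two values would come from the S3 object's last-modified time and the indexed time shown in Explorer.

```python
from datetime import datetime, timezone


def metadata_is_stale(s3_last_modified: datetime, indexed_time: datetime) -> bool:
    """Return True if the S3 metadata was written after the dataset was indexed.

    A True result would mean the index holds an older copy of the metadata
    (the "regenerated but not reindexed" case).
    """
    return s3_last_modified > indexed_time


# Hypothetical timestamps, for illustration only.
s3_written = datetime(2021, 3, 1, 2, 0, tzinfo=timezone.utc)
indexed = datetime(2021, 3, 5, 9, 30, tzinfo=timezone.utc)

# Indexed after the S3 write, so stale metadata is not the explanation.
print(metadata_is_stale(s3_written, indexed))
```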

A WO dataset that was created and indexed in 2022 shows what things should look like:

image

Resolution

What I actually think is going on is that an old version of odc-tools was used to index the STAC document, and it didn't handle all the accessory file information stored in the metadata documents.

Fix

  1. We need to re-index all the C3 WO data from S3.
  2. We need to check for similar problems in our other datasets.
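
For step 2, one possible check is sketched below. It assumes the indexed metadata documents are available as plain dicts and that thumbnails are recorded under an `accessories` key with names such as `thumbnail` or `thumbnail:nbart`; that layout, the function names, and the sample IDs are assumptions for illustration, not something confirmed in this issue.

```python
from typing import Any, Iterable, List, Mapping


def has_thumbnail(doc: Mapping[str, Any]) -> bool:
    """Return True if the document lists at least one thumbnail accessory.

    Assumes accessories live under an "accessories" mapping keyed by name.
    """
    accessories = doc.get("accessories") or {}
    return any(
        name == "thumbnail" or name.startswith("thumbnail:")
        for name in accessories
    )


def find_missing_thumbnails(docs: Iterable[Mapping[str, Any]]) -> List[str]:
    """Return the IDs of documents that have no thumbnail accessory."""
    return [str(doc.get("id", "<no id>")) for doc in docs if not has_thumbnail(doc)]


# Hypothetical documents, for illustration only.
sample_docs = [
    {"id": "dataset-a", "accessories": {"thumbnail": {"path": "thumb.jpg"}}},
    {"id": "dataset-b", "accessories": {}},
    {"id": "dataset-c"},
]

print(find_missing_thumbnails(sample_docs))
```

Running something like this per product would give a list of dataset IDs to re-index, rather than relying on spotting missing thumbnails in Explorer.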

robbibt commented 1 year ago

@omad @jmettes Does this apply to our C3 Landsat data as well? For example, I noticed that products like these don't render thumbnails on the map the way other products do:

https://explorer.dev.dea.ga.gov.au/products/ga_ls7e_ard_3/datasets/5632539a-31b0-4171-b540-60d5b9625056
https://explorer.dev.dea.ga.gov.au/products/ga_ls5t_ard_3/datasets/a4dcf08f-19bc-4b0a-8992-0e853582a9fa
https://explorer.dev.dea.ga.gov.au/products/ga_ls8c_ard_3/datasets/ca141a2d-487f-4a8d-8001-8c9ee59c437d

And they seem to be missing a "thumbnail" layer under the "Location" heading too:

image

robbibt commented 1 year ago

Landsat 9 seems OK: https://explorer.dev.dea.ga.gov.au/products/ga_ls9c_ard_3/datasets/e03aeee8-2c75-49f7-9fbe-edc8d40652ee

image

omad commented 1 year ago

Yep, those ARD will require re-indexing too.

The issue is with an older version of odc-tools, which did a poor job of indexing from STAC. It was fixed sometime in mid-2021, so any datasets indexed from STAC before then will be missing thumbnail information.