GeoscienceAustralia / digitalearthau

Code and tools for Digital Earth Australia (a deployment of Open Data Cube)
https://geoscienceaustralia.github.io/digitalearthau/

Add cloud_cover to EO3 #274

alexgleith closed this 3 years ago

alexgleith commented 3 years ago

Running datacube metadata update applies the change instantly and makes this field queryable straight away.
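For readers unfamiliar with search fields, here is a minimal sketch of what one amounts to: a name mapped to an offset path inside the dataset document. It assumes the EO3 convention of storing the value under properties, then eo:cloud_cover. The dictionary mirrors the YAML layout of a metadata type, but the description text, the resolve_offset helper and the example document are illustrative assumptions, not datacube code.

```python
# Illustrative only: a search field is a name plus an offset into the
# dataset document. The values below are assumptions for this sketch.
CLOUD_COVER_FIELD = {
    "description": "Cloud cover percentage [0, 100]",
    "type": "double",
    "offset": ["properties", "eo:cloud_cover"],
}


def resolve_offset(document, offset):
    """Walk `offset` through nested dicts; return None if a key is missing."""
    value = document
    for key in offset:
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


# Hypothetical dataset document, trimmed to the part the field reads.
example_dataset = {
    "properties": {"eo:cloud_cover": 12.5, "eo:platform": "landsat-8"}
}
cloud_cover = resolve_offset(example_dataset, CLOUD_COVER_FIELD["offset"])
```

Because the field only describes where to look in documents that are already stored, adding it does not require re-indexing any datasets.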

jeremyh commented 3 years ago

I'm not sure whether to default to indexed: false, though

codecov[bot] commented 3 years ago

Codecov Report

Merging #274 into develop will not change coverage. The diff coverage is n/a.

@@           Coverage Diff            @@
##           develop     #274   +/-   ##
========================================
  Coverage    67.10%   67.10%           
========================================
  Files           42       42           
  Lines         3219     3219           
========================================
  Hits          2160     2160           
  Misses        1059     1059           

alexgleith commented 3 years ago

> I'm not sure whether to default to indexed: false, though

Since the rest of them are, I guess we should do that. I'll change it.
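A rough sketch of what that choice means, assuming a search field is indexed unless its definition sets indexed to false. The field names and offsets below are made up for illustration, and is_indexed is a hypothetical helper rather than anything in datacube-core.

```python
def is_indexed(field_definition):
    # Assumption: a missing "indexed" key means the field gets an index.
    return field_definition.get("indexed", True)


# Hypothetical search-field definitions, in the same shape as the YAML.
search_fields = {
    "platform": {
        "offset": ["properties", "eo:platform"],
        "indexed": False,
    },
    "cloud_cover": {
        "type": "double",
        "offset": ["properties", "eo:cloud_cover"],
        "indexed": False,
    },
    "region_code": {
        # No "indexed" key, so this one falls back to the default.
        "offset": ["properties", "odc:region_code"],
    },
}

indexed_names = sorted(
    name for name, definition in search_fields.items() if is_indexed(definition)
)
```

Under that assumption only region_code would get an index here, so a new field has to opt out explicitly to match the others.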

Kirill888 commented 3 years ago

My thinking on this:

  1. My preference is to change the code in datacube-core to make ad-hoc queries over the data possible. We pay all the costs of storing JSON in SQL but don't use the flexibility it offers.
  2. We do not understand the impact of creating indexes for every search field. We also don't understand how much of a speed-up these indexes offer at search time, as we almost always constrain by space and time first anyway. And remember that the default is to create an index for every search field unless explicitly disabled.
  3. The metadata type is completely undocumented, so users are not able to develop their own product-specific metadata types. Even if it were documented, it would still be inconvenient to use: there is a layer of indirection, a specific sequence of steps to follow, and very few options for correcting recent mistakes other than starting the whole db from scratch.
  4. Realistically EO3 or EO is what people will use (see point above), so it's good for it to be complete, but not so good if that slows down database write operations by a lot. It's also not great for UI/UX, where fields that are never populated are listed as "available for searching".
  5. A better solution is to fix those issues in code:
    • make it easy to search without editing metadata, e.g. dc.load(..., SearchField('eo:cloud_cover', float) > 90)
    • make it possible to define common search fields in the product spec itself
    • maybe get rid of this whole metadata indirection layer altogether
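To make the first suggestion concrete, here is a minimal sketch of how a SearchField object could turn a comparison into a reusable filter. SearchField does not exist in datacube-core; the class, the Expression helper, the assumption that fields live under properties, and the in-memory filtering are all hypothetical, standing in for what would really be a database query.

```python
import operator


class Expression:
    """A comparison between a field and a constant, evaluated per document."""

    def __init__(self, field, op, value):
        self.field = field
        self.op = op
        self.value = value

    def matches(self, document):
        actual = self.field.extract(document)
        # A document that lacks the field never matches.
        return actual is not None and self.op(actual, self.value)


class SearchField:
    """Hypothetical ad-hoc field: a property name plus the type to cast to."""

    def __init__(self, name, type_, prefix=("properties",)):
        self.path = (*prefix, name)
        self.type_ = type_

    def extract(self, document):
        value = document
        for key in self.path:
            if not isinstance(value, dict) or key not in value:
                return None
            value = value[key]
        return self.type_(value)

    def __gt__(self, other):
        return Expression(self, operator.gt, other)

    def __ge__(self, other):
        return Expression(self, operator.ge, other)

    def __lt__(self, other):
        return Expression(self, operator.lt, other)

    def __le__(self, other):
        return Expression(self, operator.le, other)


# Made-up dataset documents; "c" has no cloud cover recorded.
datasets = [
    {"id": "a", "properties": {"eo:cloud_cover": 95.0}},
    {"id": "b", "properties": {"eo:cloud_cover": 10}},
    {"id": "c", "properties": {}},
]

very_cloudy = SearchField("eo:cloud_cover", float) > 90
matching_ids = [d["id"] for d in datasets if very_cloudy.matches(d)]
```

The point of the design is that the caller names the property and its type at query time, so nothing has to be registered in a metadata type first. A real implementation would presumably translate the expression into a JSON lookup in SQL rather than filter in Python.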