Open omad opened 6 years ago
Version: 1.7+0.g98cf9ba3.dirty
Config files: <ODC config>
Host: <OWS-DEV-RDS>:5432
Database: ows
User: ows
Environment: None
Index Driver: default
Valid connection: YES
Still present in DEA-dev environment. Adding new products is slow. Killing the datacube add process and retrying makes the operation instant.
@whatnick what you describe simply skips index generation step making future queries really slow, particularly on larger databases.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Bump!!
Here is a benchmark
time datacube product add https://raw.githubusercontent.com/GeoscienceAustralia/digitalearthau/develop/digitalearthau/config/products/cemp_insar_radarsat2_displacement.yaml
Adding "cemp_insar_radarsat2_displacement" (this might take a while) DONE
real 13m50.068s
user 0m1.393s
sys 0m0.254s
In my case, adding a product to an already populated with $200\times10^6$ Datasets takes well over 20-30 minutes to complete. Exploring the running SQL statement in pgAdmin shows that Postgres is creating an INDEX for that product. The effects of this INDEX are impressive. If querying by dataset label takes about 1 min to return the unique result, querying by product and label takes only 2-3 sec!
This is due to design decisions made with the current PostgreSQL database driver, where it creates multiple partial indexes when adding a new product, which scales in line with the number of indexed datasets.
This will be greatly improved, hopefully to the point of being a non issue with the new PostGIS driver coming in v1.9 or v2.
Expected behaviour
I should be able to add a new product to a Data Cube index and not have to wait.
Actual behaviour
It's very slow (tens of minutes) to add each new product to the production DEA Database.
Steps to reproduce
As an admin on a large Data Cube Index, run:
datacube product add <product-defn.yaml>
Environment information
Which
datacube --version
are you using?Open Data Cube core, version 1.6rc2+108.g096a26d5
What datacube deployment/enviornment are you running against? Database Server:
agdc-db.nci.org.au
NCI Module:dea/20180815