geopython / GeoHealthCheck

Service Status and QoS Checker for OGC Web Services
https://geohealthcheck.org
MIT License
84 stars 71 forks source link

Showing resource lists in web app is very slow #391

Closed fsteggink closed 3 years ago

fsteggink commented 3 years ago

Describe the bug Pages like "Show all" and other resource types in the web app take a very long time to complete, especially when having a lot of resources and runs. Currently I'm testing GHC with a database containing nearly 1000 resources and 32,000 runs.

To Reproduce You'll need to have a lot of resources and lot of runs for this bug to take effect.

Expected Behavior Pages containing resources show up quickly.

Context (please complete the following information):

Additional context GHC does not create sufficient indexes when a database is initialized. This happens with both SQLite and PostgreSQL. When putting a couple of indexes manually, a page containing all resources is at least a factor 5 faster.

These are the queries to add indexes (works both in SQLite and PostgreSQL):

CREATE INDEX ix_run_resource on run(resource_identifier);
CREATE INDEX ix_probe_vars_resource on probe_vars(resource_identifier);
CREATE INDEX ix_check_vars_probe_vars on check_vars(probe_vars_identifier);
CREATE INDEX ix_resource_tags_tag on resource_tags(tag_id);
CREATE INDEX ix_resource_tags_resource on resource_tags(resource_identifier);
CREATE INDEX ix_resource_owner on resource(owner_identifier);

The real fix would be to let SQLAlchemy create the indexes automatically.

From what I've seen, the template to list resources (includes/resources_list.html) executes a lot of queries dynamically. This happens when the last resource is shown, or the reliability is calculated. The property resource.runs is created dynamically, which is a part of the problem. Especially when the list is retrieved multiple times when the page is being built. Further improvement to the page speed can possibly be made by caching the results. Not just "resource.runs" which still returns a Query object, but the actual results.

I've tried to get more accurate results by changing the views and templates, but because of the way how the page is being created (using Jinja2 template with a context with which dynamic queries are executed), it is substantially more work to check whether caching results in a better way is indeed the right way to get a better performance.

fsteggink commented 3 years ago

Note that having about 1000 resources and 32,000 runs still takes about 80 seconds locally for the "Show all" page to complete, with the indexes in place. I still think there's room for improvement.

Another thought: when the runs are queries multiple times when the page is being constructed, while runs are still being executed at a fast rate, the query sets which are returned are different. So, inconsistencies can occur in theory.

bart-v commented 3 years ago

Duplicate of https://github.com/geopython/GeoHealthCheck/issues/326#issuecomment-650059469 And many related issues reported there

Still very important to fix, indeed.

fsteggink commented 3 years ago

Yes, you're right. I'll dive into that issue and the related issues, and eventually close this bug.