BiologicalRecordsCentre / UKBMS-online

Issue tracking for UKBMS online recording site
2 stars 0 forks source link

Data Entry page very slow to load #215

Closed IanMiddlebrook closed 3 years ago

IanMiddlebrook commented 3 years ago

Hi @Gary-van-Breda

Several recorders have now commented that the second data entry page can be very slow to load up. I.E. after completing the Walk Data Entry details (time, weather etc) and clicking next - waiting for the species/section grid to load.

Sample feedback:

I've tried using the new site to enter week 1 results for two transects that I walk and have found that it was rather slow in moving on between pages especially between walk details and numbers of butterflies seen when I could have put the kettle on and had a cup of tea whilst waiting.

- Getting to P1 of Walk Data Entry: acceptable - Filling in that P1: no problem - Going from P1 to P2: very slow. Often it is timed out and you have to go to the My Walks page and restart. Interestingly, this happened mostly on the first time after logging in.

Thanks, Ian

DavidRoy commented 3 years ago

@Gary-van-Breda I agree that the transition fro P1 to P2 is slow

Gary-van-Breda commented 3 years ago

There are 2 reports which calculate the taxa previously entered by the user, and against the transect: as can be seen from the logs, these reports are very simple, and have a significant performance hit:

10:52:50 alert Report query took 71.432607889175 seconds.
10:52:50 alert D:/websites/warehouse/modules/brc_reports/reports/projects/ukbms/ukbms_taxon_meanings_for_user.xml
10:52:50 alert
select distinct o.taxon_meaning_id
from cache_occurrences_functional o

WHERE o.training=false AND o.website_id in (27)
AND o.created_by_id = 21161
AND o.survey_id = 64 
14:25:36 alert Report query took 23.628572940826 seconds.
14:25:36 alert D:/websites/warehouse/modules/brc_reports/reports/projects/ukbms/ukbms_taxon_meanings_at_transect.xml
14:25:36 alert
select distinct o.taxon_meaning_id
from cache_occurrences_functional o
JOIN samples sp ON sp.id=o.parent_sample_id AND sp.deleted=false

WHERE o.training=false AND o.website_id in (27)
AND sp.location_id = 2785
AND o.survey_id = 64 

These should be well served by the indexes: @johnvanbreda Do you know how often the table stats are updated? Is it possible that the indexes have got skewed and need rebuilding? Is the distinct messing it up? Am I missing something? I really can't replicate this locally, as when I run the same queries on my warehouse it takes milliseconds. (When I explain plan the first one, it goes in on ix_cache_occurrences_functional_created_by_id, and the second is similarly optimised).

johnvanbreda commented 3 years ago

I think the issue here is that, although there are indexes covering the filtering that is going on, each index is only able to reduce the number of records available so far, so the results need to be merged with the results of other indexes, or filtered. The first query does the following:

I've added the following indexes which I think will improve things a lot:

create index ix_cache_occurrences_functional_ukbms_opt_1 on cache_occurrences_functional(parent_sample_id) where survey_id=64;
create index ix_cache_occurrences_functional_ukbms_opt_2 on cache_occurrences_functional(created_by_id) where survey_id=64;

You could further simplify the filters by changing the report to not filter on website ID (since it has a survey ID filter).

Gary-van-Breda commented 3 years ago

I think it is still struggling.
I'm going to rewrite this to use the summary_occurrences table instead: this has got a factor of 10 fewer rows.

DavidRoy commented 3 years ago

@johnvanbreda @Gary-van-Breda could this use the ElasticSearch Index?

johnvanbreda commented 3 years ago

@DavidRoy yes, that's an option as well. Perhaps wait and see if the summary occurrences approach works, if not we can try Elasticsearch.

Gary-van-Breda commented 3 years ago

@DavidRoy @IanMiddlebrook I've changed the form to load the data from the summary occurrences table, rather than using a report.
Currently taking about 6 seconds to go from P1 to P2 for me, irrespective of whether the Indicia caches have been cleared or not. Each of those queries are now in the sub 1s time range. There are other parts of the form taking the most time now.

Deployed to Live.

@johnvanbreda : You can drop those additional indexes I think.

johnvanbreda commented 3 years ago

@Gary-van-Breda I've dropped the indexes.

DavidRoy commented 3 years ago

@IanMiddlebrook can you review performance again please

IanMiddlebrook commented 3 years ago

Hi @DavidRoy For me, currently ranging from 6-10 seconds to open up the second page of data entry, and a further 5-6 seconds before the 'out of range' or 'outside flight period' species are highlighted. Had an email from a co-ordinator this morning who says this page is just timing out for him. So I think it still needs some more work.

Thanks, Ian

Gary-van-Breda commented 3 years ago

@IanMiddlebrook : who had the problem this morning?

IanMiddlebrook commented 3 years ago

It was Bill Downey

Gary-van-Breda commented 3 years ago

There are various points where the performance of the form are logged: I've run it for Bill, and got the following:

Form Start 2021-04-12 11:32:45
(1) Start 2021-04-12 11:32:47
(5) Start 2021-04-12 11:32:47
(6) Start 2021-04-12 11:32:47
(10) Start 2021-04-12 11:32:48
(11) Start 2021-04-12 11:33:51
(9) Start 2021-04-12 11:33:54
(7) Start 2021-04-12 11:33:54
(12) Start 2021-04-12 11:33:54
Form End 2021-04-12 11:33:55

The killer here is (10) : the lookup of the taxa at the site, even using the summary_occurrences lookup.

@johnvanbreda : could you add the following index (the other 2 indexes on this table have other columns after the survey_id which compromise their usefullness for this use case):

CREATE INDEX ix_summary_occurrences_SLU ON summary_occurrences USING btree (survey_id, location_id, user_id);

johnvanbreda commented 3 years ago

I've added the index @Gary-van-Breda

Gary-van-Breda commented 3 years ago

@IanMiddlebrook : can you get Bill Downey to try again?

IanMiddlebrook commented 3 years ago

@Gary-van-Breda - have asked Bill, will let you know when he replies.

IanMiddlebrook commented 3 years ago

Thanks @Gary-van-Breda
Bill's response was generally positive.