BiologicalRecordsCentre / UKBMS-online

Issue tracking for UKBMS online recording site

Data Entry Species lists not loading #314

Closed IanMiddlebrook closed 1 year ago

IanMiddlebrook commented 1 year ago

Hi @DavidRoy @Gary-van-Breda

Following the latest updates, there is a problem with the data entry form - Butterfly lists for 'Species known at this site' and 'Species I have recorded' are not loading, leaving the form blank for most recorders. Likewise, the lists for Moths, Odonata and Others are all blank, where they should be pre-loaded with all species previously recorded on that site.

Thanks, Ian

DavidRoy commented 1 year ago

@JimBacon could you pick this up please

JimBacon commented 1 year ago

Hi @IanMiddlebrook

We have identified a problem and need to rebuild a summary table in the database. It is quite a time consuming process and we don't want to overload the server by trying to do it all at once. We are also waiting for another maintenance task to complete before we start. Users should see a gradual recovery over the next few days.

While this is an inconvenience, recording is still possible by adding species to the list manually.

IanMiddlebrook commented 1 year ago

Thanks @JimBacon

Can I assume this table also feeds the Year by Year Index Plot (with the Data filter set to 'Count')? That is also returning very little beyond data entered today.

Best wishes, Ian

JimBacon commented 1 year ago

Very likely affects any annual summary type of output. The problem is a table of summary data by taxon, year, and location. The table has today's data but seemed to forget almost everything else it knew during some ructions with the server yesterday.

IanMiddlebrook commented 1 year ago

Thanks @JimBacon - is there a similar table that services 'Species I have recorded', which would be based on user rather than location - or is it all in the same summary table?

Gary-van-Breda commented 1 year ago

The "species I have recorded", "species on a location", the year by year plot, and the annual summary all use the same table, so will all have this problem.
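To illustrate why one damaged table produces all of these symptoms at once, here is a minimal sketch using Python's built-in `sqlite3`. The table layout, the `total` column and the three helper functions are hypothetical stand-ins chosen for illustration, not the actual warehouse schema or report code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical summary table: one row per user, survey, location, year and taxon.
conn.execute("""
    CREATE TABLE summary_occurrences (
        user_id INTEGER, survey_id INTEGER, location_id INTEGER,
        year INTEGER, taxa_taxon_list_id INTEGER, total INTEGER
    )
""")
conn.executemany(
    "INSERT INTO summary_occurrences VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 10, 100, 2022, 501, 4),
        (1, 10, 100, 2023, 502, 2),
        (2, 10, 100, 2023, 503, 7),
        (1, 10, 200, 2023, 504, 1),
    ],
)

def species_at_site(conn, location_id):
    """Taxa with any summary row at the given location."""
    rows = conn.execute(
        "SELECT DISTINCT taxa_taxon_list_id FROM summary_occurrences "
        "WHERE location_id = ? ORDER BY taxa_taxon_list_id",
        (location_id,),
    )
    return [r[0] for r in rows]

def species_by_user(conn, user_id):
    """Taxa with any summary row for the given user."""
    rows = conn.execute(
        "SELECT DISTINCT taxa_taxon_list_id FROM summary_occurrences "
        "WHERE user_id = ? ORDER BY taxa_taxon_list_id",
        (user_id,),
    )
    return [r[0] for r in rows]

def yearly_totals(conn, location_id):
    """Total count per year at the given location."""
    return conn.execute(
        "SELECT year, SUM(total) FROM summary_occurrences "
        "WHERE location_id = ? GROUP BY year ORDER BY year",
        (location_id,),
    ).fetchall()

print(species_at_site(conn, 100))
print(species_by_user(conn, 1))
print(yearly_totals(conn, 100))
```

All three helpers read the same rows, so if the table loses its contents each of them returns an incomplete or empty result together.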

JimBacon commented 1 year ago

Gary has completed the rebuild of the faulty database table. @IanMiddlebrook, if everything looks good to you, please close this issue.

Note from email correspondence: the query used to repair the table was like

-- Queue one rebuild task per survey/location/user/year group,
-- using the lowest sample id in the group as its representative.
INSERT INTO work_queue (task, entity, record_id, priority, cost_estimate, created_on)
SELECT 'task_summary_builder_sample', 'sample', min(id), 2, 50, now()
FROM indicia.samples
WHERE deleted = false
    AND location_id IS NOT NULL
    AND parent_id IS NULL
    AND survey_id IN (SELECT survey_id FROM indicia.summariser_definitions WHERE deleted = false)
GROUP BY survey_id, location_id, created_by_id, EXTRACT(year FROM date_start)
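The effect of `min(id)` with that `GROUP BY` can be tried on toy data. The sketch below uses Python's `sqlite3` with a hypothetical, cut-down `samples` table and `work_queue` table; SQLite has no `EXTRACT`, so `strftime('%Y', ...)` stands in for it. It queues one sample id per survey/location/user/year group, presumably because the summary builder recalculates a whole group from any one of its samples (that last point is an assumption, not something confirmed in this thread).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified tables for illustration only.
    CREATE TABLE samples (
        id INTEGER PRIMARY KEY, survey_id INTEGER, location_id INTEGER,
        created_by_id INTEGER, date_start TEXT
    );
    CREATE TABLE work_queue (task TEXT, entity TEXT, record_id INTEGER);
    INSERT INTO samples VALUES
        (1, 10, 100, 1, '2022-05-01'),
        (2, 10, 100, 1, '2022-06-01'),
        (3, 10, 100, 1, '2023-05-01'),
        (4, 10, 200, 1, '2023-05-02'),
        (5, 10, 100, 2, '2023-05-03');
""")

# One queue entry per (survey, location, user, year) group.
conn.execute("""
    INSERT INTO work_queue (task, entity, record_id)
    SELECT 'task_summary_builder_sample', 'sample', min(id)
    FROM samples
    GROUP BY survey_id, location_id, created_by_id, strftime('%Y', date_start)
""")

queued = [r[0] for r in conn.execute(
    "SELECT record_id FROM work_queue ORDER BY record_id")]
print(queued)  # samples 1 and 2 share a group, so only sample 1 is queued for it
```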

IanMiddlebrook commented 1 year ago

Hi @JimBacon @Gary-van-Breda Just looking at a couple of sites I'm involved with, I can see that there is still a lot missing.

Wyke Regis (id=243559) - my 2023 Annual Summary table only shows species I've recorded in the last two visits - Brimstone, Comma, Green-veined White, Orange Tip, Peacock, Small Copper, Small Tortoiseshell and Wall Brown all missing - and that's just this year. 2022 data only shows counts for Comma and Speckled Wood - everything else missing. If I go to data entry page, the list for 'species known at this site' is certainly incomplete.

Looking at the Year by Year Index plot (Data = Count) for Lulworth Lake (id=2785) I can see several years with data completely missing. [image: Year by Year Index plot] The totals for each year should be more like (starting at 1999): 366 | 403 | 225 | 192 | 622 | 480 | 367 | 837 | 397 | 343 | 440 | 240 | 210 | 250 | 407 | 483 | 488 | 358 | 486 | 685 | 731 | 464 | 145 | 222

JimBacon commented 1 year ago

Thanks, Ian, I'll take a look. I'm holding out a small hope that you are getting some out of date information from a cache.

JimBacon commented 1 year ago

I can confirm the table is still not complete. We have a further update to run which will fill in the gaps over the next day or so.

The planned query now looks like

-- Queue only groups that have occurrences but no matching summary row
-- (LEFT JOIN ... IS NULL anti-join), one representative sample id per group.
INSERT INTO work_queue (task, entity, record_id, priority, cost_estimate, created_on)
SELECT DISTINCT 'task_summary_builder_sample', 'sample', id, 2, 50, now()
FROM (
    SELECT min(s.id) AS id
    FROM indicia.samples s
    JOIN indicia.samples c ON c.parent_id = s.id AND c.deleted = false
    JOIN indicia.occurrences o ON o.deleted = false AND o.sample_id = c.id
    LEFT JOIN indicia.summary_occurrences so
        ON so.user_id = s.created_by_id
        AND so.survey_id = s.survey_id
        AND so.location_id = s.location_id
        AND so.year = EXTRACT(year FROM s.date_start)
        AND so.taxa_taxon_list_id = o.taxa_taxon_list_id
    WHERE s.deleted = false
        AND EXTRACT(year FROM s.date_start) = 2023
        AND s.location_id IS NOT NULL
        AND s.parent_id IS NULL
        AND s.survey_id IN (SELECT survey_id FROM indicia.summariser_definitions WHERE deleted = false)
        AND so.survey_id IS NULL
    GROUP BY s.survey_id, s.location_id, s.created_by_id, EXTRACT(year FROM s.date_start), o.taxa_taxon_list_id
) AS s1
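The gap-finding part of this query is a `LEFT JOIN ... IS NULL` anti-join: keep only the recorded rows that have no matching row in the summary table. A minimal sketch of that pattern with Python's `sqlite3` follows; the `records` and `summary` tables are hypothetical, flattened stand-ins for the sample, occurrence and summary tables, not the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical toy tables for illustration only.
    CREATE TABLE records (sample_id INTEGER, location_id INTEGER,
                          year INTEGER, taxon_id INTEGER);
    CREATE TABLE summary (location_id INTEGER, year INTEGER, taxon_id INTEGER);
    INSERT INTO records VALUES
        (1, 100, 2023, 501),
        (2, 100, 2023, 502),
        (3, 100, 2023, 502),
        (4, 200, 2023, 501);
    -- Only one location/year/taxon combination has been summarised so far.
    INSERT INTO summary VALUES (100, 2023, 501);
""")

rows = conn.execute("""
    SELECT DISTINCT id FROM (
        SELECT min(r.sample_id) AS id
        FROM records r
        LEFT JOIN summary s
            ON s.location_id = r.location_id
            AND s.year = r.year
            AND s.taxon_id = r.taxon_id
        WHERE s.location_id IS NULL        -- no summary row matched
        GROUP BY r.location_id, r.year, r.taxon_id
    ) AS gaps
    ORDER BY id
""").fetchall()

# One representative sample id per unsummarised group.
sample_ids = [row[0] for row in rows]
print(sample_ids)
```

Sample 1 is skipped because its combination already has a summary row; samples 2 and 3 share a missing combination, so only the lower id is returned for it.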

JimBacon commented 1 year ago

@IanMiddlebrook, Gary has completed the further update of the database. Things are looking better. Could you repeat your checks to see if it looks okay to you now?

IanMiddlebrook commented 1 year ago

Thanks @JimBacon @Gary-van-Breda Looking at data for Lulworth Lake (id=2785), I think the summary plots and tables now match the data in the downloads.

There are a few discrepancies between those data and the master UKBMS database, but I've a feeling they might be the relict of a previous incomplete data upload? Specifically, in the 2009 data, there are no butterflies displayed for 10 walks from 29th June onwards. And for 2011, there are an additional 7 occurrences in the master UKBMS database over the 132 occurrences shown online. There is no obvious pattern to these - different dates/species/sections:

Date | Section No. | Species | Count
28/04/2011 | S5 | Small White | 2
16/05/2011 | S5 | Speckled Wood | 1
07/06/2011 | S6 | Small Tortoiseshell | 3
30/06/2011 | S6 | Meadow Brown | 6
13/07/2011 | S3 | Meadow Brown | 1
13/07/2011 | S5 | Large Skipper | 1
01/08/2011 | S7 | Meadow Brown | 1

If these are not a result of the summary table rebuild, then I'm happy to close this issue.

Gary-van-Breda commented 1 year ago

@IanMiddlebrook: the Lulworth Lake data you've highlighted is missing completely from the warehouse, not just the summary table - there are no occurrences for those 2011 species/sections/dates, and the 2009 data is also completely missing any occurrences from 29th June onwards. Given that these are all data loads, I think that is where the issue lies, not with the summary table rebuild.

IanMiddlebrook commented 1 year ago

Thanks @Gary-van-Breda, I thought that would be the case.

Happy to close this one.