Closed by IanMiddlebrook 6 months ago
@johnvanbreda are you able to pick this one up? Bounce to Andy or Gary if appropriate, and involve Jim/Biren as required
@DavidRoy I don't have an admin login for UKBMS anymore. Rather than increase memory limits, is switching to Elasticsearch for downloads going to be feasible?
@johnvanbreda I've given you admin rights. I can't remember the reason, but these reports were not converted to Elasticsearch. I'd be happy for the work to be done if it won't take too long. There is some urgency for Ian to have this download.
Thanks David. @IanMiddlebrook please can you confirm which page you are trying to download from as I'm a little unfamiliar with the facilities currently on the site.
Hi @johnvanbreda
It's the Annual Summary page (Reporting menu) - load filters to 'combine data for all recorders' and 'combine data for all sites'. Then the two downloads I would need are Samples and Occurrences. Prior to the recent server work, I've been able to download these OK.
Thanks, Ian
@johnvanbreda this page: https://ukbms.org/annual-summary
I think these are the reports, though they have long URLs:
projects/ukbms/ukbms_sample_download.xml
projects/ukbms/ukbms_occurrence_download_confidential.xml
I wonder if there is a better way of delivering these downloads for Ian who needs all the data for a given year (or even across years). The report files could be simpler by removing filter parameters? The download files could also be made available via a separate downloads page as we do for the eBMS - https://butterfly-monitoring.net/downloads
@DavidRoy The issue here is the quantity of data being loaded into server RAM whilst building the download. There are a few options: reviewing the code to page through the data more efficiently, switching to the REST API (which already does this), or switching completely to Elasticsearch. However, the quickest fix is going to be an alteration to the server configuration to allow bigger files. I've asked Damian to do this and he plans to look at it later today.
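For reference, the "page through the data" option can be illustrated with a minimal Python sketch. It is not the warehouse code (which is PHP): `fetch_page`, `fake_fetch_page` and the sample data are hypothetical stand-ins for a query that accepts an offset and a limit. The point is only that rows are written out page by page, so memory use is bounded by the page size rather than by the size of the whole download.

```python
import csv
import io
from typing import Callable, Iterator, List, Sequence

Row = Sequence[object]


def iter_pages(fetch_page: Callable[[int, int], List[Row]],
               page_size: int = 1000) -> Iterator[Row]:
    """Yield rows one page at a time; at most `page_size` rows are held at once."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            break
        yield from page
        if len(page) < page_size:
            break  # a short page means there is nothing left to fetch
        offset += len(page)


def write_csv(rows: Iterator[Row], header: Sequence[str], out) -> int:
    """Stream rows to `out` as CSV and return the number of data rows written."""
    writer = csv.writer(out)
    writer.writerow(header)
    count = 0
    for row in rows:
        writer.writerow(row)
        count += 1
    return count


# Hypothetical data source standing in for a LIMIT/OFFSET database query.
DATA = [(i, f"sample-{i}") for i in range(2500)]


def fake_fetch_page(offset: int, limit: int) -> List[Row]:
    return DATA[offset:offset + limit]


if __name__ == "__main__":
    buffer = io.StringIO()
    written = write_csv(iter_pages(fake_fetch_page), ["id", "name"], buffer)
    print(f"wrote {written} rows")
```

In a real download the output would be the HTTP response or a file rather than an in-memory buffer, and keyset paging (filtering on the last id seen) may scale better than offsets on large tables.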
@IanMiddlebrook the memory limit has been increased and the reports seem OK to me at the moment. Please can you try them again and let me know how you get on?
Hi @johnvanbreda I've managed to download all the WCBS-BBS and WCBS-BC data for 2023. But for Transect data, I'm still getting exactly the same fatal error message (same allowed memory size), for both Samples and Occurrences.
Hi @johnvanbreda
I've had a message from a branch co-ordinator over the weekend - they're getting the same error message trying to get the 'Section Level' download for their branch data.
Thanks, Ian
@IanMiddlebrook I've sent you an email containing the 2023 data as a temporary solution - let me know if not received.
thanks @johnvanbreda - all received
Hi @johnvanbreda
Thanks for the downloads as a temporary fix for myself, but there are still issues for Co-ordinators of some larger branches being unable to download their data at a critical time of year. Is the memory limit still being investigated?
Thanks, Ian
Hi @IanMiddlebrook, please can you find out which branch coordinator it was and exactly what the settings were when they tried a section level download?
Hi @johnvanbreda
It was Bob Annell - the account would be 'Hampshire UKMBS' (user 3599) and he would set up filter by recorder to 'Branch Data' and filter by site to 'Combine data for all my branch sites'.
Thanks @IanMiddlebrook. I have a new version of the code coming which will reduce the memory usage during a download. I'll let you know when ready.
@IanMiddlebrook I've deployed the memory fix and successfully downloaded the section level branch data for Hampshire. Let me know if you have any more problems but the change should significantly reduce the memory consumption so hopefully it will work for all the downloads again.
Thanks @johnvanbreda
I'll let the Hampshire Co-ordinator know - I also managed that download.
I've now been able to download the 'Samples' for all transect data, but the big one - the Occurrences - was not successful. I thought it was, as it produced a CSV file, but that file only contained the error message:
Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 65028096 bytes) in /srv/sites/warehouse1.indicia.org.uk/modules/indicia_svc_data/controllers/data_service_base.php on line 423
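To put the figures in that message into more readable units, here is a small Python sketch that parses the error text and converts the byte counts to MiB. The function name `parse_memory_error` is made up for illustration; the only input is the message quoted above.

```python
import re

MIB = 1024 * 1024

# Matches the two byte counts in a PHP "Allowed memory size ... exhausted" message.
PATTERN = re.compile(
    r"Allowed memory size of (\d+) bytes exhausted \(tried to allocate (\d+) bytes\)"
)


def parse_memory_error(message: str) -> tuple:
    """Return (limit_mib, requested_mib) taken from the error message."""
    match = PATTERN.search(message)
    if match is None:
        raise ValueError("not a memory-exhausted message")
    limit_bytes, requested_bytes = (int(group) for group in match.groups())
    return limit_bytes / MIB, requested_bytes / MIB


if __name__ == "__main__":
    message = (
        "Fatal error: Allowed memory size of 134217728 bytes exhausted "
        "(tried to allocate 65028096 bytes)"
    )
    limit_mib, requested_mib = parse_memory_error(message)
    print(f"limit: {limit_mib} MiB, single allocation requested: {requested_mib} MiB")
```

The limit reported is 128 MiB, and the failing step asked for roughly 62 MiB in one allocation, which is close to half of the whole allowance.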
Hi @IanMiddlebrook, I've found a further memory optimisation and I've now managed to download the occurrences file. Please can you let me know if that works now?
That's great thanks @johnvanbreda - I've managed to download all occurrences
For some reason there are three additional/superfluous columns in that download now, without headings - between the Date and the Section No.
Not a problem as I can delete them, but looks a bit odd.
Thanks, Ian
@IanMiddlebrook I did have to fiddle with the way that dates are added into the report when optimising the memory usage, and there was a mistake in the logic which replaces the constituent date fields (date start, end, etc.) with a single date string. That is now fixed, so the extra fields should be removed from the report.
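As an illustration only, the kind of step described here (collapsing several date fields into one string while keeping rows and headings aligned) might look like the Python sketch below. The column names `date_start`, `date_end` and `date_type`, the "X to Y" range format, and the function `collapse_date` are all assumptions for the example, not the actual report code.

```python
# Hypothetical names for the constituent date columns.
DATE_PARTS = ("date_start", "date_end", "date_type")


def collapse_date(row: dict) -> dict:
    """Return a copy of `row` with the date parts replaced by one 'date' value.

    The 'date' key takes the position of the first date part, and every
    constituent field is dropped so no heading-less columns are left behind.
    """
    out = {}
    for key, value in row.items():
        if key in DATE_PARTS:
            if "date" not in out:
                start, end = row["date_start"], row["date_end"]
                out["date"] = start if start == end else f"{start} to {end}"
            continue  # skip the constituent field itself
        out[key] = value
    return out


if __name__ == "__main__":
    row = {
        "site": "A",
        "date_start": "2023-05-01",
        "date_end": "2023-05-01",
        "date_type": "D",
        "section": "S1",
    }
    print(collapse_date(row))
```

If the constituent fields were left in each row while the heading row listed only the single date column, the result would be extra unnamed columns of the sort reported above.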
Please close if now OK.
That's great - thanks @johnvanbreda
Hi @DavidRoy,
We don't seem to be able to download the full data any more. Even for the Samples download which is not the biggest file (just the list of walks) I'm getting this message:
Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 20480 bytes) in /srv/sites/warehouse1.indicia.org.uk/system/libraries/drivers/Database/Pgsql.php on line 452
I think the memory limit needs to be increased?
This will certainly be a big issue now we're coming to validate the full set for 2023.
Thanks, Ian