Open turley85 opened 3 weeks ago
That URL is working for me now. Is it still showing 504 for you?
Sorry, that URL was for the list.
It's the alert associated with that list that is failing... I can't get a URL directly to the alert itself to work sorry.
Looking at logs, these errors are resulting:
2024-10-24 08:09:33.812 ERROR --- [.1-8080-exec-26] au.org.ala.alerts.BiosecurityService : Server returned HTTP response code: 504 for URL: https://biocache.ala.org.au/ws/occurrences/search?q=%28genus%3A%22Alternanthera+philoxeroides%22%29+OR+%28species%3A%22Alternanthera+philoxeroides%22%29+OR+%28subspecies%3A%22Alternanthera+philoxeroides%22%29+OR+%28scientificName%3A%22Alternanthera+philoxeroides%22%29+OR+%28raw_scientificName%3A%22Alternanthera+philoxeroides%22%29&fq=-data_resource_uid%3A%22dr27665%22+AND+spatialObject%3A9433219+OR+spatialObject%3A9433227&fq=eventDate%3A%5B2024-05-23T14%3A00%3A00Z+TO+2024-10-23T21%3A08%3A33Z+%5D&fq=firstLoadedDate%3A%5B2024-10-20T13%3A00%3A00Z+TO+2024-10-23T21%3A08%3A33Z+%5D&pageSize=10000
Testing that URL manually resulted in 504 Gateway Time-out, and not the usual SOLR error you see when the spatial_object is too long.
I'm guessing the spatial_object is still to blame (too complex) and resulting in SOLR timing out or running out of memory.
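For anyone reading along, the logged URL is easier to inspect once it is percent-decoded. Here is a minimal sketch using only the Python standard library. The URL below is a shortened copy of the one in the log (only the first q clause is kept; the fq and pageSize parameters are unchanged), and decode_query is just an illustrative helper name.

```python
from urllib.parse import urlsplit, parse_qs

# Shortened copy of the logged URL: only the first q clause is kept,
# the three fq parameters and pageSize are copied unchanged.
LOGGED_URL = (
    "https://biocache.ala.org.au/ws/occurrences/search"
    "?q=%28genus%3A%22Alternanthera+philoxeroides%22%29"
    "&fq=-data_resource_uid%3A%22dr27665%22+AND+spatialObject%3A9433219+OR+spatialObject%3A9433227"
    "&fq=eventDate%3A%5B2024-05-23T14%3A00%3A00Z+TO+2024-10-23T21%3A08%3A33Z+%5D"
    "&fq=firstLoadedDate%3A%5B2024-10-20T13%3A00%3A00Z+TO+2024-10-23T21%3A08%3A33Z+%5D"
    "&pageSize=10000"
)


def decode_query(url: str) -> dict[str, list[str]]:
    """Return the percent-decoded query parameters, keeping repeated keys as lists."""
    return parse_qs(urlsplit(url).query)


params = decode_query(LOGGED_URL)
for fq in params["fq"]:
    # Each fq is printed in the form it would have been typed in the list's fq column.
    print(fq)
```

Printing the decoded fq values makes it easier to see which spatialObject IDs and date ranges the alert is actually sending.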
UPDATE: I think the fq column might not be written correctly too. E.g. -data_resource_uid:"dr27665" AND spatialObject:9433219 OR spatialObject:9433227 - Boolean precedence means that the AND will take precedence over the OR, resulting in (effectively) (-data_resource_uid:"dr27665" AND spatialObject:9433219) OR spatialObject:9433227. So it will (effectively) return all the results that match spatialObject:9433227 due to the last OR.
I think the intended result should use: -data_resource_uid:"dr27665" AND (spatialObject:9433219 OR spatialObject:9433227).
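To make the grouping explicit, here is a rough sketch of how the fq value could be assembled so the OR'ed terms are always wrapped in parentheses. build_fq is a hypothetical helper, not part of the alerts code. It only illustrates the parenthesisation and does nothing about how complex the spatial objects themselves are.

```python
from urllib.parse import urlencode


def build_fq(excluded_data_resource: str, spatial_object_ids: list[int]) -> str:
    """Build an fq value with any OR'ed spatialObject terms grouped in parentheses."""
    if not spatial_object_ids:
        raise ValueError("at least one spatialObject id is required")
    spatial = " OR ".join(f"spatialObject:{pid}" for pid in spatial_object_ids)
    if len(spatial_object_ids) > 1:
        # Group the OR terms so the leading AND applies to all of them.
        spatial = f"({spatial})"
    return f'-data_resource_uid:"{excluded_data_resource}" AND {spatial}'


fq = build_fq("dr27665", [9433219, 9433227])
print(fq)
# URL-encoded form, as it would appear in a request's query string.
print(urlencode({"fq": fq}))
```

With a single ID the helper leaves out the parentheses, since there is nothing to group.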
UPDATE 2: Reminder: the spatial object should be tested independently before using in a fq.
https://biocache.ala.org.au/ws/occurrences/search?q=spatialObject:9433227
results in
{
message: "Error from server at null: Expected mime type application/octet-stream but got application/json. { "error":{ "metadata":[ "error-class","org.apache.solr.common.SolrException", "root-error-class","org.apache.solr.common.SolrException"], "msg":"application/x-www-form-urlencoded content length (74308530 bytes) exceeds upload limit of 32768 KB", "code":400}}",
errorType: "Query syntax invalid",
statusCode: 400
}
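A small sketch of that independent check, for reference. It builds the same kind of test URL as above and applies a rough heuristic to whatever response comes back. The helper names are made up for illustration, and the heuristic is only based on the two failure modes seen in this thread (the 400 "exceeds upload limit" message and the 504 time-out), so treat it as an assumption rather than a definitive rule.

```python
from urllib.parse import urlencode

BIOCACHE_SEARCH = "https://biocache.ala.org.au/ws/occurrences/search"


def spatial_object_test_url(pid: int) -> str:
    """Build a search URL that queries a single spatialObject on its own."""
    # safe=":" keeps the colon readable, matching the URLs pasted in this thread.
    return f"{BIOCACHE_SEARCH}?{urlencode({'q': f'spatialObject:{pid}'}, safe=':')}"


def looks_too_complex(status_code: int, body: str) -> bool:
    """Heuristic: does the response look like the spatial object is too large?"""
    if status_code == 504:
        return True
    return status_code == 400 and "exceeds upload limit" in body


print(spatial_object_test_url(9433227))
```

The URL can then be opened in a browser or fetched with any HTTP client, and the status code and body passed to looks_too_complex.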
FYI, we advise that you do not combine spatialObject in a fq too. By combining 2 spatialObjects, you are in effect causing the same error shown above (internally it's like using one combined object).
Thanks for investigating Nick! Adding some other relevant background here, spatialObject:9433227 was one of the original shapefiles that was too complex and needed optimising. There's more info on ticket #246 but essentially there is an optimised version of it that I created, spatialObject:9439588. So that should be the one used in alerts.
Can be viewed and tested at: https://spatial.ala.org.au/?pid=9439588 https://biocache.ala.org.au/ws/occurrences/search?q=spatialObject:9439588
@kylie-m @nickdos I just updated that spatial object from spatialObject:9433227 to https://spatial.ala.org.au/?pid=9439588. However, the alert still failed.
I note that the list actually runs off 3 shapefiles, so is one of the other two causing this issue too? Or is having multiple shapefiles itself causing the issue?
Hi @turley85 - I saw this in the logs:
2024-10-24 12:56:59.648 ERROR --- [.1-8080-exec-30] au.org.ala.alerts.NotificationService : User or query not found for userId: null, queryId: BioSecurity alert for NSW_NPWS_Western_Weeds_list
Note the userId: null, so I think you had the page loaded from earlier and then clicked "Preview" or "Notify" after your login had expired. Try reloading the page to see if you're prompted to log in again, and then try running it again.
@nickdos Hmm, I just tried again. Closed the windows, logged out and then back into ALA and used "Preview" to test the alert again.
It failed again sorry :(
Let me know if there's something else I should have done to test!
I just tested https://biocache.ala.org.au/ws/occurrences/search?q=spatialObject:9433219 as well, so that spatialObject should be ok. I didn't spot a third one on the list though?
Same error again: 2024-10-24 12:56:59.648 ERROR --- [.1-8080-exec-30] au.org.ala.alerts.NotificationService : User or query not found for userId: null, queryId: BioSecurity alert for NSW_NPWS_Western_Weeds_list.
Will look into it more.
Seems the timeouts are causing the DB to error (as described in other ticket), so the DB lookup for the query ID subsequently fails.
So the fix is to remove all but one spatialObject in the list fq column, and re-try.
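A quick way to check a list's fq value before re-running the alert is to count the spatialObject terms in it. This is a minimal sketch with hypothetical helper names. It only looks at the text of the fq and does not contact any ALA service.

```python
import re

# Matches terms such as spatialObject:9433219 and captures the numeric id.
SPATIAL_OBJECT = re.compile(r"spatialObject:(\d+)")


def spatial_object_ids(fq: str) -> list[str]:
    """Return every spatialObject id referenced in an fq string, in order."""
    return SPATIAL_OBJECT.findall(fq)


def has_single_spatial_object(fq: str) -> bool:
    """True when the fq references at most one spatialObject."""
    return len(spatial_object_ids(fq)) <= 1


current_fq = '-data_resource_uid:"dr27665" AND spatialObject:9433219 OR spatialObject:9433227'
print(spatial_object_ids(current_fq))
print(has_single_spatial_object(current_fq))
```

If the check fails, the fq still combines several spatial objects and needs trimming (or the layers merging into one object) before testing again.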
@turley85, we strongly recommend you take a copy of the list over to lists-test.ala.org.au and do the testing on alerts-test.ala.org.au before making changes on production servers.
@nickdos would a good additional workaround here be to combine the 2 spatial layers into one layer in QGIS first? No guarantees but I can give that a try, have done so for other work previously
@kylie-m - I think so. Combining spatialObjects adds an extra level of complexity and depending on how they are combined, could be worse than a single object. So simpler/safer to stick with a single spatialObject, as recommended by Adam.
Thanks Nick!
@turley85 I have merged the 2 layers in QGIS, then uploaded to Spatial portal.
In ala-test:
resulting object: spatialObject:21643483
test: https://api.test.ala.org.au/occurrences/occurrences/search?q=Acacia%20longifolia&fq=spatialObject%3A21643483
test in UI to view on map (will be quite slow): https://biocache-test.ala.org.au/occurrences/search?q=taxa&fq=spatialObject:21643483
This is returning records within the new spatial object above, though the equivalent alert on test is not yet working - I'll keep trying, but @nickdos if you have any ideas, let me know! (https://lists-test.ala.org.au/speciesListItem/list/dr22890)
In production:
resulting object: spatialObject:9478102
test: https://biocache.ala.org.au/ws/occurrences/search?q=spatialObject:9478102
Alert is working for this test list: https://lists.ala.org.au/speciesListItem/list/dr28737 (Alert name: "wattle")
Other Docs:
Link to the resulting shapefile, so it can be saved in with the others - https://csiroau-my.sharepoint.com/:u:/g/personal/mor742_csiro_au/EWeLVdhQkeJAo9S30seW090BbnAjz80xQaBsf7feVv0wnQ?e=fo9BK4
steps to merge 2 shapefiles in QGIS: add both shapefiles, Data management tools > Merge Vector Layers > select both layers, Run. Then check the output matches, export as shapefile. I have now added these to the Shapefile Optimisation doc on Confluence.
I have also added a note about not using more than one shapefile per fq query in our Biosecurity Alerts Workflow doc on Confluence.
As Nick mentioned, it's good practice to load and run in test first, even though we can technically use production lists privately. I'll add that to our workflow docs too.
@kylie-m @nickdos I've updated the NPWS Western list with spatialObject:9478102 in production and still getting 504 Gateway timeout sorry.
However, I did replicate @kylie-m's result with the wattle list:
76 new records for wattle, dr28737 since 16 Oct 2024
hmm I wonder if the spatialObject is too complex when in combination with a more complex query, but just ok with a simpler query, @nickdos ?
@kylie-m I wondered the same thing - I think the additional terms for the OR'ed names might be pushing us over some threshold value. Only way to know is run the alert and look at logs, I think.
504 Gateway Time-out is usually an indicator that the biocache requests are timing out or erroring.
I'm getting a 504 Gateway Time-out when previewing the alert:
BioSecurity alert for NSW_NPWS_Western_Weeds_list