AtlasOfLivingAustralia / biocache-service

Occurrence & mapping webservices
https://biocache-ws.ala.org.au/ws/

Failed to get json from webservice #872

Closed aainsa closed 2 months ago

aainsa commented 7 months ago

We have an error on our French ALA portal (OpenObs) that occurs frequently in production. The detail pages and the search results display this error after a one-minute wait:

When looking at the logs of the BiocacheService, we find this error:

- Servlet.service() for servlet [mainDispatcher] in context with path [/biocache-service] threw exception [java.lang.IllegalStateException: getOutputStream() has already been called for this response] with root cause

When this occurs, we restart the biocacheservice Docker image to clear the error. We managed to reproduce the issue by making multiple requests on the site within a short period. We are using version 3.0.24 of the BiocacheService. Have you ever encountered this error, or do you know where it might be coming from?

adam-collins commented 7 months ago

Getting "class java.net.SocketTimeoutException Read timed out" after 60s for a call to https://openobs.mnhn.fr/biocache-service/occurrences/search? suggests a SOLR performance issue.

"getOutputStream() has already been called for this response" for a call to https://openobs.mnhn.fr/biocache-service/occurrences/search? suggests to me that there is a bug in au.org.ala.biocache.web.CustomExceptionResolver. This bug may prevent logging of the actual cause.
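To illustrate how this kind of masking can happen, here is a minimal sketch. It is not the actual CustomExceptionResolver code: FakeResponse is a hypothetical stand-in that only mimics the servlet rule that getWriter() fails once getOutputStream() has been used on the same response, and the "Read timed out" failure is simulated. It assumes the handler had already opened the output stream before failing, which may or may not be what happens in biocache-service.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintWriter;

public class ResponseStateSketch {

    /** Hypothetical stand-in for a servlet response; NOT the real HttpServletResponse. */
    static class FakeResponse {
        private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        private boolean streamUsed = false;
        private boolean writerUsed = false;

        // Mimics the servlet contract: the stream and the writer are mutually exclusive.
        OutputStream getOutputStream() {
            if (writerUsed) {
                throw new IllegalStateException("getWriter() has already been called for this response");
            }
            streamUsed = true;
            return buffer;
        }

        PrintWriter getWriter() {
            if (streamUsed) {
                throw new IllegalStateException("getOutputStream() has already been called for this response");
            }
            writerUsed = true;
            return new PrintWriter(buffer, true);
        }
    }

    /**
     * Simulates a handler that opens the output stream and then fails,
     * followed by an error handler that tries to report through the writer.
     * Returns the message of the exception that finally surfaces.
     */
    static String simulate() {
        FakeResponse response = new FakeResponse();
        try {
            response.getOutputStream();                   // handler starts streaming the body
            throw new RuntimeException("Read timed out"); // simulated original failure
        } catch (RuntimeException original) {
            try {
                response.getWriter().println("error");    // error handler switches to the writer
                return original.getMessage();             // would be reported if the write succeeded
            } catch (IllegalStateException masking) {
                return masking.getMessage();              // the original cause is now hidden
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(simulate());
    }
}
```

In this sketch the message that surfaces is the IllegalStateException, not the simulated timeout, which is the pattern that would hide the real cause in the logs.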

With the information provided, I would begin by looking for ZooKeeper connection drops and at SOLR server load and memory. Logging SOLR GC has helped us identify issues in the past.

adam-collins commented 2 months ago

I am assuming this is no longer an issue.