openaq / openaq-api-v2

OpenAQ API
https://api.openaq.org

`v2/locations` frequently returning 500s #349

Closed JSaja closed 5 months ago

JSaja commented 5 months ago

Issue Summary: On 4/12 at 21Z, the v2/locations endpoint suddenly started returning 500s. The errors appeared during a full scan of locations that iterated over pages of 1000 items at a time. On further investigation, not every query fails, which suggests the problem may lie in fetching particular records.
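For reference, a full scan of this kind can be sketched roughly as follows. This is a minimal illustration, not the exact client that hit the errors: the `limit` and `page` query parameters and the `results` key in the response body are assumptions about the v2 API, and `fetch_page`, `scan`, and `max_pages` are hypothetical names. The scan keeps going after a 5xx so that every failing page is recorded.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

BASE_URL = "https://api.openaq.org/v2/locations"


def fetch_page(page, limit=1000):
    """Fetch one page; return (http_status, results).

    Assumes the endpoint accepts `limit` and `page` query parameters
    and wraps its records in a top-level `results` list.
    """
    query = urllib.parse.urlencode({"limit": limit, "page": page})
    try:
        with urllib.request.urlopen(f"{BASE_URL}?{query}", timeout=60) as resp:
            body = json.load(resp)
            return resp.status, body.get("results", [])
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses surface here; report the code, no records.
        return err.code, []


def scan(fetch=fetch_page, limit=1000, max_pages=100):
    """Walk pages in order; return (records_seen, pages_that_returned_5xx)."""
    failed = []
    total = 0
    for page in range(1, max_pages + 1):
        status, results = fetch(page, limit)
        if status >= 500:
            failed.append(page)
            continue  # keep scanning so every failing page is recorded
        total += len(results)
        if len(results) < limit:
            break  # a short page means the end of the data was reached
    return total, failed
```

Calling `scan()` with the defaults would query the live API; passing a stub as `fetch` allows a dry run. An empty `failed` list corresponds to a clean full scan.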

Screenshot of the corresponding Lambda function, confirming the timing of the issue: [image]

Reproducing the Issue: The most deterministic way to reproduce based on the available order by fields (assuming no new stations are added):

Bizarrely, the 18th record can be searched, which would seem to imply that the failing query should work as well.

Unknowns: It's difficult to tell from the fields available for ordering whether a particular group of records is affected. An id field would be useful for client-side troubleshooting here. Server logs would likely be more helpful for determining whether the failing records have anything in common.
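Without an id field, one client-side option is to bisect a failing window of records until it cannot be narrowed further. The sketch below assumes a hypothetical `is_ok(start, size)` callable that returns True when a request covering `size` records from 0-based offset `start` succeeds; how that maps onto actual query parameters is not specified here. If a window fails while both of its halves succeed, the window itself is reported, which would match the case where a single record is searchable but the larger query is not.

```python
def smallest_failing_windows(is_ok, start, size):
    """Return the smallest (start, size) windows for which is_ok is False.

    `is_ok` is a hypothetical callable supplied by the caller; it should
    issue a request for `size` records beginning at offset `start` and
    return True on success.
    """
    if is_ok(start, size):
        return []
    if size == 1:
        return [(start, 1)]
    half = size // 2
    found = (
        smallest_failing_windows(is_ok, start, half)
        + smallest_failing_windows(is_ok, start + half, size - half)
    )
    # If neither half fails on its own, the failure depends on the
    # window as a whole, so report the window itself.
    return found or [(start, size)]
```

Each failing window costs roughly 2·log2(size) extra requests to narrow, so a 1000-item page can be narrowed in a few dozen calls.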

Resolution: The issue would be considered solved if a full scan of locations can be performed without receiving a 500 error.

russbiggs commented 5 months ago

Thanks for reporting this. We'll take a look and see what's going on.