Closed DFEvans closed 2 years ago
Thanks for opening this issue. Do you have notes on when the errors occurred? Earlier this morning (roughly 8 hours ago) there was some downtime from a data ingest causing a deadlock in the database. That should be fixed now.
> The Status page claims that "STAC API: Search" is operational, and no incidents are noted: https://planetarycomputer-status.microsoft.com/
Yes, we'll need some finer-grained, more comprehensive health checks there. The API application was healthy. It was the database that was having issues, so actually running a search failed.
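To illustrate the kind of check described above, here is a minimal sketch of a "deep" health check that runs a real search rather than only confirming the application responds. It is not the Planetary Computer's actual monitoring code. The names `fetch` and `search_is_healthy`, the `limit=1` query, and the 10-second timeout are all assumptions made for illustration. The only thing it relies on from the STAC API specification is that a successful search returns a GeoJSON `FeatureCollection`.

```python
import json
import urllib.error
import urllib.request

# Assumed probe URL: the search endpoint from this issue, limited to one item
# so the check stays cheap.
SEARCH_URL = "https://planetarycomputer.microsoft.com/api/stac/v1/search?limit=1"


def fetch(url, timeout):
    """Return (status_code, body_bytes) for a GET request."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a status code and body.
        return err.code, err.read()


def search_is_healthy(fetcher=fetch, url=SEARCH_URL, timeout=10.0):
    """Return True only if a real search round-trips through the backend."""
    try:
        status, body = fetcher(url, timeout)
    except OSError:
        # Covers connection failures and timeouts (URLError, TimeoutError).
        return False
    if status != 200:
        return False
    try:
        payload = json.loads(body)
    except ValueError:
        # Not valid JSON (or not valid UTF-8).
        return False
    # A STAC search response is a GeoJSON FeatureCollection.
    return isinstance(payload, dict) and payload.get("type") == "FeatureCollection"
```

A check like this would have reported a failure during the database deadlock, because the request has to reach the database to return items. The `fetcher` parameter exists only so the logic can be exercised without network access.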
It was pretty consistently unavailable from around 1100 GMT to 1400 GMT, and then somewhat intermittently from then until 1700 GMT. That first part looks like it more or less lines up with that data ingest issue.
Is it expected that this will occur whenever a data ingestion occurs, or was this something that hadn't gone quite right? (I can't quite work out whether "fixed" meant "ingestion done" or "database issue fixed")
If it's useful to you, I can come back with times if it occurs again - although my use is also intermittent, so I won't promise to be too accurate!
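For anyone wanting to collect those times more systematically, here is a rough sketch of a probe that appends a UTC timestamp and the outcome of one request to a CSV log. The function names `probe` and `log_probe`, the 30-second timeout, and the CSV layout are my own assumptions, not anything provided by the Planetary Computer.

```python
import csv
import datetime as dt
import io
import urllib.error
import urllib.request


def probe(url, timeout=30.0, opener=urllib.request.urlopen):
    """Return the HTTP status code as a string, or an error label if no response arrived."""
    try:
        with opener(url, timeout=timeout) as resp:
            return str(resp.status)
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 500.
        return str(err.code)
    except OSError as err:
        # No HTTP response at all: timeout, DNS failure, connection reset, etc.
        return type(err).__name__


def log_probe(writer, url, now=None, probe_fn=probe):
    """Probe the URL once and write a row of (UTC timestamp, url, result)."""
    now = now or dt.datetime.now(dt.timezone.utc)
    result = probe_fn(url)
    writer.writerow([now.strftime("%Y-%m-%dT%H:%M:%SZ"), url, result])
    return result
```

Running `log_probe` every few minutes (for example from cron, with a `csv.writer` opened on a file in append mode) would give a timeline of outages to attach to an issue like this one.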
> Is it expected that this will occur whenever a data ingestion occurs, or was this something that hadn't gone quite right?
A bit of both, unfortunately. We're working through a backlog of items to ingest, so volumes are a bit larger than normal. But we're working to develop a fix that avoids the conditions causing the deadlock in the first place.
We've made a few changes that should have fixed the worst of the timeouts.
In addition, https://github.com/microsoft/planetary-computer-apis/pull/52 implemented some more caching and rate-limiting, which should help with a second source of the timeouts. That will be deployed in our release next month.
Over the last week, I've been seeing intermittent outages of the Planetary Computer STAC API search endpoint. Requests return a 500 error with the following body:
This doesn't seem to depend on access method, or on the content of the request - e.g. even attempting to visit the search API with no parameters via my browser times out: https://planetarycomputer.microsoft.com/api/stac/v1/search?limit=250
The Status page claims that "STAC API: Search" is operational, and no incidents are noted: https://planetarycomputer-status.microsoft.com/