Open alexdunnjpl opened 11 months ago
Looks like this may only be a problem for the _bulk
API
https://github.com/elastic/elasticsearch/issues/41434
Actually no - it will happen in other contexts too (this is likely what @al-niessner was talking about - are CCS remotes equivalent to shards in this context?) https://github.com/elastic/elasticsearch/issues/18978
@jordanpadams
There's a really good chance this would be available as a side-effect of #12
Checked for duplicates
No - I haven't checked
🧑🔬 User Persona(s)
dev
💪 Motivation
So that we can rely on the correctness of sweeper code and avoid (for example) long, repeated, looping queries.
📖 Additional Details
Opensearch (v1, at least), returns HTTP200 for at-least-some types of request failures, including bulk updates.
With respect to the looping/hanging queries, I suspect that's due to expiration of scrolls (confirm whether scroll parameter is per-scroll, or per-query/lifetime). This should be easy to test by dropping the scroll window to 1min and adding 30sec delays. The second or third request should return HTTP200, but an error-indicative response, if it's the root cause of the 86% stall (see other issue, need to track it down).
Acceptance Criteria
All request failure conditionals test not only the HTTP status code, but also examine the structure of the returned response.
Ideally, this should be enclosed in a wrapper function.
⚙️ Engineering Details
No response