Open rahulbot opened 9 months ago
This is a remnant of when the API lived in IA's servers- I agree, there's a better pattern. I'll address
Indeed, this change surfaced a new error in the testing circuit to chase down
Ok, so- the 404 errors are returned by the api when the query returns nothing- we're using 404 parsed as 'not found'- that was why we didn't attempt to catch them earlier. I believe that the behavior you described here is probably actually because those specific chunks happened to return nothing- I would expect any other issue to surface as a 500 error.
At the very least, changing this means putting some other verbiage in place to catch 404 specifically, since that's how the api returns "0"- but really I'm thinking that supressing 404 should be the intended behavior here.
Yeah. Super confusing to me. Anyway perhaps safest to do list of valid response codes and raise error if not in list?
Note: non-urgent because I was wrong and the current behavior is not a bug. Proposed enhancement is to have a safe-list of valid response codes (if r.status_code not in [200, 404]:
?).
I'm tracking down a weird bug on a big query, running search locally connected via tunnel to prod ES index. I'm seeing a log message that seems to indicate the query is being chunked into lots of chunks, and then some of them are returning 404 but not throwing a real error.
I don't know from where, but log message I see look like this:
Note the 404 on the last one. While tracking this down I noted: https://github.com/mediacloud/mediacloud-news-client/blob/e153023be15648852c0cff3e2c59399e8b99ec8a/mcnews/searchapi.py#L156-L158
Why is this
>=500
? When I change it to!=200
locally I get a much more helpful error. I'm worried this could be swallowing 40x errors and returning incorrect data due to the query chunking in place.