Closed tvernum closed 4 years ago
Pinging @elastic/es-security (:Security/Authentication)
Do we already return 503
in some other scenarios? Sometimes it gets used as a default status code when something goes wrong on the server side, for an example, many servelet containers do that.
So if we are returning 503
for other error, I feel it would still be necessary for downstream clients to parse the error message to be sure that the error is about ES configuration.
Also 503
seems to indicate a temporary failure (server overloaded) which is retriable from the client side. But it is not really retriable until the configuration is fixed and node restarted (correct me if I am wrong here). So 501
might be a better candidate.
With above being said, I do also reckon that by just looking at the short texts of the codes, Service Unavailable
(503) reads better than Not Implemented
(501) for this use case.
I’m having a hard time seeing this as a 5xx. The client sent a request to a server that wasn’t configured to support the request. Is that a server error (we messed up)? I don’t think so. It’s a client/administrator error.
When it comes to 501, it means that we don’t support the functionality, but that’s not right, it just wasn’t configured. It’s not that we’re incapable of performing the request, we’re not configured to perform the request. Also, this status code is cacheable, so even if an administrator fixes the issue, clients through some proxies could still error out. That seems bad.
When it comes to 503, we abuse it a bit today, but it’s meant to be reserved for temporary service issues. That’s not the case here, as the administrator might elect to never enable functionality.
I lean towards 4xx, that the client made an error. We have some prior art here in the use of 405 when a user sends a method that an endpoint doesn’t support. I don’t think that’s appropriate here, but only point it out to draw attention to the tension between 405 and 501 in terms of there being a client error indicating status code for “not implemented”.
I think a generic 400 is appropriate.
We discussed this and decided that (apart from the debate about the best status code), a status code was not really the best way to communicate this information.
I have opened https://github.com/elastic/elasticsearch/issues/52311 instead.
See: https://github.com/elastic/kibana/pull/55255
If the token service and api key services are disabled, then usage of those APIs will fail, but the only way to determine that it is caused by ES configuration is to parse the error message.
We would prefer not to have clients that depend on the error message.
From my reading of the spec, I think the
503
response status would be a reasonable fit for this failure case (I'd accept an argument for501
, but I think 503 is better).