fix(otlp-spec): Clarified "reject" meaning of the HTTP PartialSuccess explanation.

open-telemetry / opentelemetry-proto

OpenTelemetry protocol (OTLP) specification and Protobuf definitions

https://opentelemetry.io/docs/specs/otlp/

Apache License 2.0

545 stars 242 forks source link

fix(otlp-spec): Clarified "reject" meaning of the HTTP PartialSuccess explanation. #539

Closed bwplotka closed 1 month ago

bwplotka commented 3 months ago

👋🏽

I was studying the OTLP spec's partial success mechanism proposed 2y ago for the Prometheus Remote Write 2.0 spec design purposes. I noticed one detail that might need clarification. The partial success mechanism has this specification regarding the "final" status code:

If the request is only partially accepted (i.e. when the server accepts only parts of the data and rejects the rest), the server MUST respond with HTTP 200 OK.

Does it mean that if e.g. we write 10 samples and server writes successfully 9, but can't write 1 sample due to some tmp error (503), it should still provide 200 status code? Or that 503 code? It feels the latter make more sense to me, but this sentence is a bit fuzzy.

One way of interpreting the "reject" word in that sentence, is that server rejecting samples means not-retryable errors only (e.g. 400). If that's true then some clarification would be nice. I went ahead and proposed changes to make it explicit. Is my assumption here correct? Is this helpful?

bwplotka commented 3 months ago

Thanks for a quick response!

Yes, it says HTTP 200 must be returned even if part of the data cannot be accepted. For metrics the rejected_data_points field contain the number of data points that the server did not accept.

Just to clarify: "cannot be accepted" also INCLUDES retryable failures? The same ambiguity exists in the current wording ("server rejects"), which I am trying to interpret here.

To be very clear again--do you mean that current OTLP spec says:

"HTTP 200 must be returned even if part of the data was written, but for the other part server asked to retry (e.g. due to tmp error)?"

To be very specific - this occurs often when OTLP (or any other stream of samples) is going through proxies that reshards written series and parts of it was written, parts of it requires a retry.

tigrannajaryan commented 3 months ago

Just to clarify: "cannot be accepted" also INCLUDES retryable failures? The same ambiguity exists in the current wording ("server rejects"), which I am trying to interpret here.

Yes, that is correct.

When it says "reject" it probably should say "drop".

Essentially the server can do one of 4 things:

Full Success. Accept everything successfully and return 200.
Failure, Retryable. Reject everything and return one of the retryable HTTP response codes.
Failure, Non-retryable. Reject everything and return 400.
Accept parts of data, drop some other parts, return 200 and indicate that parts of data were dropped by setting non-zero rejected_data_points value.

There is no provision for accepting parts of data and somehow indicate that the remaining data must be retried. To make this possible the response would need to describe which part of the data exactly should be retried and there is simply no way to do it currently.

jsuereth commented 3 months ago

@tigrannajaryan #4 is NOT in-line with what my understanding was for this feature when it was added and I think is actually pretty detrimental if retryable failure can return 200.

Specifically, because the existing rejection doesn't tell you which points are retry-able, and in practice could lead to very poor network usage and overloading. I think this is a pretty glaring hole in OTLP if we allowed that, and we should work on fixing this ASAP if true.

tigrannajaryan commented 3 months ago

@tigrannajaryan #4 is NOT in-line with what my understanding was for this feature when it was added and I think is actually pretty detrimental if retryable failure can return 200.

@jsuereth are you reading the spec differently? I don't know how else to interpret the spec.

Specifically, because the existing rejection doesn't tell you which points are retry-able, and in practice could lead to very poor network usage and overloading. I think this is a pretty glaring hole in OTLP if we allowed that, and we should work on fixing this ASAP if true.

Yes, what you say is correct. It is a hole, but that's the way it is. The plan was that we can fix it by adding the required information to Export*PartialSuccess messages when the need arises. See for example https://github.com/open-telemetry/opentelemetry-specification/issues/2454#issuecomment-1091711982

tigrannajaryan commented 2 months ago

@jsuereth let me know if you read the spec differently. We may need to change the wording to make it clearer.

bwplotka commented 2 months ago

Thanks!

I agree supporting partial-retry-ability (as you proposed here) with detailed response semantics allowing to retry certain series is out of question for now, it's complex and might be not needed. However this is NOT what I am asking here.

My point is that the current wording does NOT support case of:

Server being explicitly idempotent
Server written part of samples, but it wants FULL request to be retried.

Spec says, it's not possible, always return 200 because you ingested something already.

If that's the intention and is acceptable, then nothing to change (other than maybe clarification).

tigrannajaryan commented 1 month ago

@bwplotka the case you are looking for (idempotent server) is possible to fake today if you control the server implementation. Make the server return one of the retryable error codes and the client will re-send the entire batch. It won't look pretty since it will likely be logged as an error by the client but the net result will be what you expect: the client will retry the full request.

I don't think this is great, but it should work.

tigrannajaryan commented 1 month ago

Closing this PR for now, I think we have clarity that the PR's changes are considered breaking and we are not allowed that.

However I am open to extending the partial success to cover more use cases if it is done while preserving interoperability with all protocol versions (I have not put any thought into how exactly it can be done).