ietf-wg-httpapi / idempotency

Repository for "The Idempotency-Key HTTP Header Field"
Other
17 stars 10 forks source link

Relationship to Repeatable Requests? #31

Open mnot opened 10 months ago

mnot commented 10 months ago

OASIS OData has published this: https://docs.oasis-open.org/odata/repeatable-requests/v1.0/cs01/repeatable-requests-v1.0-cs01.html

Have we evaluated this? If so, how does this proposal differ / what use cases does it address that aren't by Repeatable Requests?

mnot commented 10 months ago

I see this was previously discussed in #5.

@darrelmiller any thoughts about the questions above? If we feel that RR doesn't address some use cases, or is badly engineered, it's reasonable to work on a separate spec. However, if it addresses our use cases and is well-engineered, duplication doesn't seem great.

richsalz commented 10 months ago

I read the repeatable-requests document. It is more complicated than what is in this draft. I agree with @darrelmiller's comments when he closed #5, https://github.com/ietf-wg-httpapi/idempotency/issues/5#issuecomment-1073121522

mnot commented 10 months ago

It is, but having two solutions to the same problem also make the world more complicated. I don't have strong feelings about this, but we should be able to answer the question crisply.

asbjornu commented 10 months ago

@mnot how do you feel the question should be answered if @darrelmiller's answer in https://github.com/ietf-wg-httpapi/idempotency/issues/5#issuecomment-1073121522 doesn't suffice?

I believe this issue has served its purpose to have a conversation around the two different specifications. I think the OASIS specification defines an approach that while potentially offering more features, it puts some additional constraints on server implementations. I think it is reasonable to expect that both the Repeatable Request approach and Idempotency Key approach will find use cases where one of them is the optimum choice.

I'm going to close this issue. Feel free to re-open if further discussion is required.

Should we offer a comparison between Idempotency-Key and Repeatable Requests in the RFC, or are you thinking about something else?

mnot commented 10 months ago

No, I don't think we should say something in the document (unless there's a direct relationship) -- but we should be able to answer questions like this crisply and convincingly, because they're likely to come up in review.

darrelmiller commented 9 months ago

The most recent change to the specification https://github.com/ietf-wg-httpapi/idempotency/pull/27 that allows clients to send the idemptotency key opportunistically is a significantly different approach to that of Repeatable Requests. Idempotency-Key allows a client to express the intent of their request, regardless of whether a server has idempotency support. This opens the door for generic connector software being build for expressing that intent and potentially generic intermediaries that could track requests and block duplicates. Repeatable Reads require a significant amount of client/server co-ordination and likely the reason we do not see widespread support.

jmileham commented 7 months ago

This opens the door for generic connector software being build for expressing that intent and potentially generic intermediaries that could track requests and block duplicates. Repeatable Reads require a significant amount of client/server co-ordination and likely the reason we do not see widespread support.

In reading up on the literature as we did our idempotency implementation, we found the OASIS spec very validating because we had also independently arrived at the value of a First-Sent header as a way of providing a soundness guarantee around expiry (with a non-adversarial client; failing safe under clock skew) without requiring the client and server to agree out of band about the expiry window. The First-Sent header also allows the server to subsequently change its expiry window dynamically without client coordination.

So we see that additional bit of protocol complexity as valuable. But I'm not totally understanding why the OASIS spec couldn't be implemented as a generic intermediary service as well?

That said, the OASIS spec is very prescriptive about response codes, and has a lot more MUSTs than SHOULDs relative to this spec, for sure. I think our idempotency implementation is compliant with this draft, if you squint, and would not be with the OASIS spec, mostly for that reason.

Another thing that I appreciate about the OASIS spec, though, is the explicit statement that you should only record idempotency keys upon success, i.e. validation errors don't burn tokens, which is very helpful for keeping the client code simple for user-in-the-loop mutative actions. I believe this document doesn't specify whether to record idempotency keys upon client errors, but I could be missing something.

jmileham commented 7 months ago

I reread the Repeatability spec and I think I answered my own question:

[me] But I'm not totally understanding why the OASIS spec couldn't be implemented as a generic intermediary service as well?

There are two parts that jumped out at me as reasons it might not be possible to implement as a progressive enhancement and as a generic intermediary - are these what y'all have also concluded?

Servers aware of repeatability but unable to fulfill this direction for this request type MUST not execute the request and instead return 501 Not Implemented.

If "aware of repeatability" is defined as "built by somebody aware that the repeatability spec exists" then that seems to rule out progressive enhancement (or at least would make the protocol a little chatty trying to negotiate repeatability for each request across all web services). But if the definition is more along the lines of "on a server that implements repeatability for some operations" then it seems more reasonable that it could be seen as a progressive enhancement.

And on the subject of whether an intermediary service could faithfully implement:

Whether a server is considered to have seen a previous request should be transactionally consistent with the mutating effects of the request. For example, a server is not required to remember a previous request whose effects were rolled back due to a failure, since the client could reissue such a request without any possibility for duplication of the effects.

If a strict definition of "transactionally consistent" is applied, that probably rules out providing readability via an intermediary service. But the fact that the word "should" is used (and not even "SHOULD") might soften that expectation. The spec also says:

If the server has seen the Repeatability-Request-ID and the request matches the previous request to the extent validated by the server, the server MUST return a response with a Repeatability-Result value of accepted that is either:

  • the same response code and body as was generated (if any) when the original request with that Repeatability-Request-ID was processed, or
  • the response code and response body resulting from re-executing the request if the original response code was 4xx or 5xx, i.e. a client error or an internal server error.

Which to me hints that inspecting the response code might be considered an acceptable way of determining whether an underlying request had side effects.

So part of me wonders what the intent of the OASIS spec is in these regards.

All that said, one of the reasons we're doing our own implementation of a very opinionated idempotency system is because we value transactional consistency between the idempotency layer and the transactional side effects. (Without transactional consistency, idempotency cannot be guaranteed by wrapping a non-internally-idempotent service in the event of network partition between the wrapper and the transactional service. Edited to add: It can only provide at-least-once or at-most-once function application. Stripe appears, e.g., to choose at most once application, which fails safe in the sense that it presumably surfaces an error on retry if a network partition occured between the idempotency service and the underlying service during initial application.) So it wouldn't shock me if repeatability spec means business about the transactional consistency requirement.

Anyway, I'll stop here but happy to engage further if helpful. Thanks for reading and thanks for all the effort on this document so far!

mikekistler commented 6 months ago

@jmileham Regarding

I believe this document doesn't specify whether to record idempotency keys upon client errors, but I could be missing something.

Draft 4 of the RFC says this in Section 2.6 under "Retry":

The resource server SHOULD respond with the result of the previously completed operation, success or an error.

I take that to mean that idempotency keys SHOULD record all errors, be they client errors or service errors.

This seems consistent with the Stripe Implementation -- Stripe's docs say:

Subsequent requests with the same key return the same result, including 500 errors.

This seems to be another aspect where the RFC differs from the OASIS spec.

richsalz commented 6 months ago

I take that to mean that idempotency keys SHOULD record all errors, be they client errors or service errors.

Taken by itself, the interpretation of that sentence could be ambiguous. Want to open an issue to make sure it's made clear and explicit?

jmileham commented 6 months ago

@richsalz I assume that was an ask of @mikekistler, and admittedly I'm an outsider here but I'd be happy to open that issue. I'd also be happy to make a pitch for the IMO preferable design tradeoffs of not recording a request when it fails on the downstream service, at least due to client (4xx) error. But I don't know the state of conversation and whether this is mostly about specifying the Stripe-style idempotency filter design.

mikekistler commented 6 months ago

@richsalz I'm not sure what more needs to be stated to make the interpretation unambiguous. Are you thinking something like:

The resource server SHOULD respond with the result of the previously completed operation, success or any error, be it client or server error.

richsalz commented 6 months ago

I believe the sentence is not clear in cases like this: If an operation returns a transient error, like 504, and a request is repeated, should the server still return 504 if there would have been no timeout the second time around?

Does "previously completed operation" cover error responses? Should the text say "previous response (including errors)"? Do 504 codes, etc., need to be shared between the gateway and origin, for example?

richsalz commented 6 months ago

@jmileham please contribute. As this thread has migrated, perhaps start a new issue.