restatedev / sdk-typescript

Restate SDK for JavaScript/TypeScript
MIT License

Introduce error-defer handler / sagas / compensations #94

Closed: tillrohrmann closed this issue 9 months ago

tillrohrmann commented 1 year ago

In order to undo partial results in case of failures, we need a way to let the user specify an error-defer handler/compensations (a saga library). These error-defer handlers should only run if a terminal error is thrown (e.g. explicitly by the user or in the form of a CancellationException as part of a cancellation).

The way we register or run the error-defer handlers needs to be able to handle the situation where a sideEffect call is repeatedly failing and a user then cancels the invocation. In this case, we need to make sure that an associated error-defer handler is still executed because the sideEffect could be visible in the external system.
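
A minimal sketch of the idea, assuming a hypothetical `runWithCompensations` helper and a hypothetical `TerminalError` class (neither is the Restate SDK API). Compensations are registered before the step they undo, so they still run if that step fails or the invocation is cancelled while the step's effect may already be visible externally. Compensations run only for terminal errors; other errors are rethrown untouched on the assumption that the invocation will be retried. The sketch is synchronous for brevity.

```typescript
// Hypothetical names throughout; this is an illustration, not the SDK API.

// Marks a failure that must not be retried (user-thrown or caused by cancellation).
class TerminalError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "TerminalError";
  }
}

// Checks the name rather than using instanceof, so the check also works
// when the class is compiled down to older JavaScript targets.
function isTerminal(err: unknown): boolean {
  return err instanceof Error && err.name === "TerminalError";
}

type Compensation = () => void;

// Runs `body`. If it ends with a terminal error, runs the registered
// compensations in reverse registration order, then rethrows.
function runWithCompensations(
  body: (defer: (c: Compensation) => void) => void
): void {
  const compensations: Compensation[] = [];
  try {
    body((c) => compensations.push(c));
  } catch (err) {
    if (isTerminal(err)) {
      for (const undo of compensations.reverse()) undo();
    }
    throw err;
  }
}

// Usage: each compensation is registered before the step it undoes.
const log: string[] = [];

try {
  runWithCompensations((defer) => {
    defer(() => log.push("cancel flight"));
    log.push("book flight");

    defer(() => log.push("cancel hotel"));
    // The hotel booking may have partially happened when the cancellation arrives.
    throw new TerminalError("cancelled while booking hotel");
  });
} catch (err) {
  log.push(`failed: ${(err as Error).message}`);
}

// log is now:
// ["book flight", "cancel hotel", "cancel flight", "failed: cancelled while booking hotel"]
```

Registering the compensation ahead of the step means each compensation has to tolerate the case where its step never took effect.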

slinkydeveloper commented 1 year ago

I'm not so sure this is an SDK-only issue. Before implementing the SDK feature itself, I would like to have an idea of how to implement the cancellation mechanism from the runtime (issue https://github.com/restatedev/restate/issues/402), that is, how the runtime can trigger a cancellation of an in-flight or suspended invocation. I would like us not to take two different directions depending on whether the runtime or the SDK starts the cancellation.

slinkydeveloper commented 1 year ago

One way I'm thinking about this is, for example, to have some sort of CancelledEntry message in the journal that marks the beginning of a cancellation. If the SDK starts a cancellation, it generates this entry and sends it to the runtime, together with the other entries generated by the deferred handlers.

If the runtime wants to start a cancellation, it could do so by forcefully killing the stream, injecting the CancelledEntry, and then starting the stream again. Once the SDK reaches the CancelledEntry, it starts the cancellation process. WDYT @tillrohrmann @gvdongen?

tillrohrmann commented 1 year ago

The way I thought about it is that the SDK runs the compensations when it terminates an invocation with a terminal exception. Cancellation from the runtime could exploit the same mechanism by generating a terminal exception.

The way the runtime could achieve this behavior could be exactly what you've suggested: The invoker could inject a CancelledEntry into the journal and then immediately trigger a failure and subsequent re-invocation to start the invocation with the updated journal (now including the CancelledEntry). Once the SDK sees the CancelledEntry, it would throw a terminal exception on the next Restate API call.
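
A rough sketch of that flow, assuming a hypothetical in-memory journal and a hypothetical `Context` class (the entry shapes and method names are made up for illustration and are not the actual protocol or SDK API). On re-invocation, journaled side effects are replayed without re-running them, and the first API call that reaches the injected `cancelled` entry throws a terminal error.

```typescript
// Hypothetical types and names; an illustration of the proposal only.

type JournalEntry =
  | { kind: "sideEffect"; result: string }
  | { kind: "cancelled" };

class TerminalError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "TerminalError";
  }
}

function isTerminal(err: unknown): boolean {
  return err instanceof Error && err.name === "TerminalError";
}

class Context {
  private position = 0;
  constructor(private journal: JournalEntry[]) {}

  sideEffect(fn: () => string): string {
    const entry = this.journal[this.position];
    // Reaching the injected entry turns the next API call into a terminal failure.
    if (entry !== undefined && entry.kind === "cancelled") {
      throw new TerminalError("invocation cancelled");
    }
    this.position++;
    if (entry !== undefined) return entry.result; // replay: do not re-run fn
    const result = fn(); // first execution: may throw, leaving nothing journaled
    this.journal.push({ kind: "sideEffect", result });
    return result;
  }
}

const journal: JournalEntry[] = [];
const calls: string[] = [];

function handler(ctx: Context): string {
  const a = ctx.sideEffect(() => { calls.push("reserve"); return "reserved"; });
  const b = ctx.sideEffect(() => {
    calls.push("charge");
    throw new Error("transient failure"); // stands in for a repeatedly failing sideEffect
  });
  return a + "," + b;
}

// Attempt 1: "reserve" is journaled, "charge" fails with a non-terminal error.
let firstOutcome = "";
try { handler(new Context(journal)); } catch (err) {
  firstOutcome = isTerminal(err) ? "terminal" : "retry";
}

// The runtime injects the CancelledEntry and re-invokes with the updated journal.
journal.push({ kind: "cancelled" });

// Attempt 2: "reserve" is replayed, the next call hits the CancelledEntry.
let secondOutcome = "";
try { handler(new Context(journal)); } catch (err) {
  secondOutcome = isTerminal(err) ? "terminal" : "retry";
}
// firstOutcome is "retry", secondOutcome is "terminal",
// and calls is ["reserve", "charge"] (nothing re-ran on replay).
```

In this sketch the failed `charge` attempt left no journal entry, yet it may have reached the external system, which is the case where a compensation still needs to run after the terminal error.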

slinkydeveloper commented 1 year ago

@gvdongen @StephanEwen one question I have is the following: apart from logging, do we plan to expose the exception that triggered the cancellation to user code, e.g. by passing it to the defer closures?

StephanEwen commented 1 year ago

Maybe, haven't thought about this, yet, tbh.

Would that change the requirements from the runtime side?

slinkydeveloper commented 1 year ago

> Would that change the requirements from the runtime side?

Don't know yet. Still collecting requirements :)

tillrohrmann commented 9 months ago

Closing this issue since we don't need it for the initial milestone of the cancellation feature.