Closed tillrohrmann closed 9 months ago
I'm not so sure this is an SDK-only issue. Before implementing the SDK feature itself, I would like to have an idea of how to implement the cancellation mechanism from the runtime (issue https://github.com/restatedev/restate/issues/402), that is, how the runtime can trigger a cancellation of an in-flight or suspended invocation. I would like us not to take two different directions depending on whether the runtime or the SDK starts the cancellation.
One way I'm thinking about this is, for example, to have some sort of CancelledEntry message in the journal that marks the beginning of a cancellation. If the SDK starts a cancellation, it generates this entry and sends it to the runtime, together with the other entries generated by the deferred.
If the runtime wants to start a cancellation, it could do so by forcefully killing the stream, injecting the CancelledEntry, and then starting the stream again. Once the SDK reaches the CancelledEntry, it starts the cancellation process. WDYT @tillrohrmann @gvdongen?
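To make the proposal a bit more concrete, here is a minimal sketch of the runtime-initiated path in TypeScript. All names here (JournalEntry, Journal, Stream, cancelFromRuntime, replay) are made up for illustration and are not taken from the Restate code base; the real journal and stream handling will look different.

```typescript
// Hypothetical journal model for illustration; not the actual Restate types.
type JournalEntry =
  | { kind: "SideEffect"; name: string }
  | { kind: "Cancelled" };

class Journal {
  private readonly entries: JournalEntry[] = [];

  append(entry: JournalEntry): void {
    this.entries.push(entry);
  }

  snapshot(): JournalEntry[] {
    return [...this.entries];
  }
}

// Minimal stand-in for the connection between runtime and SDK.
interface Stream {
  kill(): void;
  start(entries: JournalEntry[]): void;
}

// Runtime-initiated cancellation: kill the stream, inject the entry,
// then start the stream again with the updated journal.
function cancelFromRuntime(journal: Journal, stream: Stream): void {
  stream.kill();
  journal.append({ kind: "Cancelled" });
  stream.start(journal.snapshot());
}

// SDK-side replay: walk the journal and stop at the first Cancelled entry,
// which is where the cancellation process would begin.
function replay(entries: JournalEntry[]): { replayed: string[]; cancelled: boolean } {
  const replayed: string[] = [];
  for (const entry of entries) {
    if (entry.kind === "Cancelled") {
      return { replayed, cancelled: true };
    }
    replayed.push(entry.name);
  }
  return { replayed, cancelled: false };
}
```

The point of the sketch is that the SDK only needs one code path: whether the entry was written by the SDK or injected by the runtime, the replay sees the same marker in the journal.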
The way I thought about it is that the SDK runs the compensations when it terminates an invocation with a terminal exception. Cancellation from the runtime could exploit the same mechanism by generating a terminal exception.
The way the runtime could achieve this behavior could be exactly what you've suggested: The invoker could inject a CancelledEntry into the journal and then immediately trigger a failure and subsequent re-invocation to start the invocation with the updated journal (now including the CancelledEntry). Once the SDK sees the CancelledEntry, it would throw a terminal exception on the next Restate API call.
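A rough sketch of the SDK side of this, assuming a context object that remembers whether a CancelledEntry has been seen and checks that flag on every API call. TerminalError, CancellationError, SketchContext, markCancelled and apiCall are hypothetical names used only for this illustration, not the actual SDK API.

```typescript
// Hypothetical names for illustration; not the actual SDK API.
class TerminalError extends Error {
  constructor(message: string) {
    super(message);
    // Keep instanceof working even when compiled to older JS targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Cancellation is modelled as one particular kind of terminal error.
class CancellationError extends TerminalError {}

class SketchContext {
  private cancelled = false;

  // Called by the replay loop when it reaches the CancelledEntry.
  markCancelled(): void {
    this.cancelled = true;
  }

  // Stand-in for any Restate API call (for example a sideEffect).
  apiCall<T>(operation: () => T): T {
    if (this.cancelled) {
      // The next API call after the entry was seen fails terminally.
      throw new CancellationError("invocation was cancelled");
    }
    return operation();
  }
}
```

Because CancellationError is a subclass of TerminalError in this sketch, whatever runs compensations on terminal exceptions would handle a runtime-triggered cancellation without a separate code path.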
@gvdongen @StephanEwen one question I have is the following: apart from logging, do we plan to expose in the code the exception that triggered the cancellation, e.g. by passing it to the defer closures?
Maybe, haven't thought about this, yet, tbh.
Would that change the requirements from the runtime side?
> Would that change the requirements from the runtime side?
Don't know yet. Still collecting requirements :)
Closing this issue since we don't need it with the initial milestone of the cancellation feature.
In order to undo partial results in case of failures, we need a way to let the user specify an error-defer handler/compensations (a saga library). These error-defer handlers should only run if a terminal error is thrown (e.g. explicitly by the user or in the form of a CancellationException as part of a cancellation). The way we register or run the error-defer handlers needs to be able to handle the situation where a sideEffect call is repeatedly failing and a user then cancels the invocation. In this case, we need to make sure that an associated error-defer handler is still executed because the sideEffect could be visible in the external system.
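One possible shape for this, as a minimal synchronous sketch: the handler is registered before the sideEffect it compensates is attempted, and handlers run in reverse order only when a terminal error escapes. TerminalError, Compensation and runWithCompensations are hypothetical names for this illustration and not an existing SDK API; a real implementation would be asynchronous and journaled.

```typescript
// Hypothetical names for illustration; not an existing SDK API.
class TerminalError extends Error {
  constructor(message: string) {
    super(message);
    // Keep instanceof working even when compiled to older JS targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

type Compensation = () => void;

// Runs `body`, handing it a `defer` function for registering error-defer
// handlers. Handlers run in reverse registration order, and only when a
// TerminalError escapes; any other error is rethrown without running them.
function runWithCompensations(
  body: (defer: (compensation: Compensation) => void) => void
): void {
  const compensations: Compensation[] = [];
  try {
    body((compensation) => {
      compensations.push(compensation);
    });
  } catch (error) {
    if (error instanceof TerminalError) {
      for (const compensation of [...compensations].reverse()) {
        compensation();
      }
    }
    throw error;
  }
}
```

Registering the handler before the sideEffect starts is the design choice that covers the failing-sideEffect case: if the call keeps failing and the invocation is then cancelled, the handler is already in the list when the terminal error arrives. The handler therefore has to tolerate the case where the sideEffect never actually took effect.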