Rethink cancellation - GitHub Issues

ericniebler commented 9 months ago

Issue by jamboree Monday Aug 15, 2022 at 15:19 GMT Originally opened as https://github.com/NVIDIA/stdexec/issues/573

In the current design, cancellation support for an operation is externalized and somewhat convoluted: you have to get_stop_token from the receiver, check if stop_requested, register a callback during start and unregister it upon complete. Things could be made simpler if cancellation were intrinsic to the operation-state. That is, in addition to a start function, an operation-state should also have a drop/stop function.

The rules:

drop can only be called after start, potentially from another thread.
drop can be called at most once.
drop after completion is permitted and has no effect (still shouldn't be called more than once).
drop doesn't necessarily result in set_stopped being called; set_stopped being called isn't necessarily caused by drop.

Also, it should be made clear that the receiver contract is only fulfilled after start. Simply connecting and then destructing the operation-state should not invoke any channel of the receiver.

ericniebler commented 9 months ago

Comment by ericniebler Thursday Jun 15, 2023 at 20:54 GMT

you have to get_stop_token from the receiver, check if stop_requested, register a callback during start and unregister it upon complete.

I've wondered whether this pattern can be captured in a reusable component.

drop after completion is permitted and has no effect (still shouldn't be called more than once).

This would be problematic since typically the completion functions free resources, including the operation state itself.

ericniebler commented 9 months ago

Comment by maikel Friday Jun 16, 2023 at 04:44 GMT

This would be problematic since typically the completion functions free resources, including the operation state itself.

In general, this is indeed a problem. But there are situations where a droppable op state concept might be sensible, such as in a repeat algorithm.

I have implemented a test where the caller of drop controls the lifetime of the droppable operation. This happens quite often in sequence senders, where a sequence manages one or more op states for its elements.

ericniebler commented 9 months ago

Comment by jamboree Friday Jun 16, 2023 at 05:39 GMT

drop after completion is permitted and has no effect (still shouldn't be called more than once).

This would be problematic since typically the completion functions free resources, including the operation state itself.

The point is that there's an eventual owner of the operation state. If the caller is able to call drop on the operation state, the operation state must still be alive, and it doesn't matter whether the operation has completed or not.

lewissbaker commented 7 months ago

The challenge with having cancellation as a method on the operation-state is that often a receiver that receives a completion signal will want to go and destroy the operation-state object upon completion. If it is potentially going to be making a call to a method on that object, then it needs to synchronize with itself to make sure that any calls to drop() "happen before" the destruction of the operation-state, or otherwise do not happen.

Also, the implementation of drop() needs to be able to handle executing concurrently with the completion of the operation. For example, one thread might be preparing to call set_value() on the receiver to complete the operation while, concurrently, a call to drop() is coming in on another thread. It also needs to handle the case where a thread is in the process of calling drop() and then another thread completes the operation.
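A minimal sketch of the simplest piece of that synchronization is an atomic "claim" flag that lets exactly one of drop() and the normal completion path invoke the receiver. All names here (`channel_counter`, `racing_op`, `complete_with`, `completions_when`) are hypothetical. The demo calls the two paths one after the other on a single thread to keep it deterministic; the atomic exchange is what would give the same outcome if they ran on different threads. It assumes the owner keeps the operation state alive across both calls, so it does not address the destruction problem described above.

```cpp
#include <atomic>

// Hypothetical receiver that counts how often each channel is invoked.
struct channel_counter {
    std::atomic<int>* values;
    std::atomic<int>* stops;
    void set_value(int) noexcept { values->fetch_add(1); }
    void set_stopped() noexcept { stops->fetch_add(1); }
};

// Hypothetical operation state: whichever of drop() and complete_with()
// claims the flag first gets to complete the receiver; the other does nothing.
class racing_op {
    channel_counter rcvr_;
    std::atomic<bool> claimed_{false};

    bool claim() noexcept {
        return !claimed_.exchange(true, std::memory_order_acq_rel);
    }

public:
    explicit racing_op(channel_counter r) : rcvr_(r) {}

    void start() noexcept {}
    void drop() noexcept { if (claim()) rcvr_.set_stopped(); }
    void complete_with(int v) noexcept { if (claim()) rcvr_.set_value(v); }
};

// Runs both paths in the requested order and returns the total number of
// completion signals the receiver saw.
int completions_when(bool drop_first) {
    std::atomic<int> values{0};
    std::atomic<int> stops{0};
    racing_op op{channel_counter{&values, &stops}};
    op.start();
    if (drop_first) {
        op.drop();
        op.complete_with(1);
    } else {
        op.complete_with(1);
        op.drop();
    }
    return values.load() + stops.load();
}
```

Either order yields a single completion. The harder cases raised in this comment, such as the receiver destroying the operation state while drop() is still executing, need more than this flag.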

I haven't worked through an implementation of this fully yet, so I don't know how much you can factor the synchronization out into separate helpers, but my suspicion is that by the time you have done enough synchronization to handle all of the various potential races that can occur, the design doesn't end up being that much simpler. Having said that, I'd love to be proven wrong here!

The other aspect that this design loses is the ability for the implementation of the algorithm to know whether or not the caller might potentially call drop(). If an implementation knows it will never receive a stop-request, then it can potentially use a more efficient implementation strategy. Whereas if the operation-state always had to provide a drop(), then it would have to pessimistically assume that it might be called. The current design allows the caller to communicate whether or not it might request stop by passing a stop-token through the receiver's environment whose stop_possible() returns either true or false.

lewissbaker commented 4 months ago

I have been working on some prototypes along the lines suggested here and will report back with outcomes of that exploration.

cplusplus / sender-receiver

Rethink cancellation #73