cplusplus / sender-receiver

Issues list for P2300

run_loop::run-loop-sender should remove the item from the list when a stop-request is sent #293

Open lewissbaker opened 1 day ago

lewissbaker commented 1 day ago

As currently specified, the run_loop::schedule() operation seems to wait until the worker thread dequeues the task and only then checks whether a stop-request was sent, calling set_stopped() if there was one and set_value() otherwise.
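To make the behaviour described above concrete, here is a minimal single-threaded sketch. It is not the P2300 wording: `toy_op`, `toy_run_loop` and `stop_request_leaves_op_pending` are hypothetical names, and an `std::atomic<bool>` stands in for the receiver's stop token. It only illustrates that the stop request is looked at when the item is dequeued, not before.

```cpp
#include <atomic>
#include <cassert>
#include <deque>

// Hypothetical stand-in for a queued run_loop schedule operation.
struct toy_op {
    enum class result { pending, value, stopped };

    explicit toy_op(std::atomic<bool>* stop) : stop_requested(stop) {}

    std::atomic<bool>* stop_requested;  // models the receiver's stop token
    result outcome = result::pending;
};

// Hypothetical stand-in for run_loop: a FIFO of pending operations.
struct toy_run_loop {
    std::deque<toy_op*> queue;

    void push(toy_op& op) { queue.push_back(&op); }

    // Models one iteration of the worker thread. The stop request is
    // inspected only here, after the item has been dequeued.
    bool run_one() {
        if (queue.empty()) return false;
        toy_op* op = queue.front();
        queue.pop_front();
        assert(op != nullptr);
        op->outcome = op->stop_requested->load()
            ? toy_op::result::stopped   // would call set_stopped()
            : toy_op::result::value;    // would call set_value()
        return true;
    }
};

// Returns true if a stop request alone leaves the operation pending and
// the operation completes as "stopped" only once the worker dequeues it.
inline bool stop_request_leaves_op_pending() {
    std::atomic<bool> stop{false};
    toy_op op{&stop};
    toy_run_loop loop;
    loop.push(op);

    stop.store(true);  // models request_stop(): nothing completes yet
    const bool pending_after_stop = (op.outcome == toy_op::result::pending);

    loop.run_one();    // completion happens only when the worker gets here
    return pending_after_stop && op.outcome == toy_op::result::stopped;
}
```

In this model, if nothing ever calls `run_one()` after the stop request, the operation never completes.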

This could be problematic if trying to use run_loop with the get_delegation_scheduler feature and could lead to deadlocks.

For example, say someone schedules some work on another thread and wants to block waiting for that work to complete using sync_wait(). This will inject a run_loop scheduler as the get_delegation_scheduler.

The user, wanting to make use of the get_delegation_scheduler, schedules work on a composite scheduler that tries to use a primary scheduler, but also schedules to the delegation scheduler to allow the work to run in either the current thread or on some other context. This way, if all other threads on the other context are busy/blocked then we can still make forward progress on the task using the current thread.

But this approach of scheduling a task on each scheduler and running on whichever completes first only really works if both schedulers support "synchronous cancellation". That is, when a stop-request is sent, the operation either completes inline in the call to request_stop() with set_stopped(), or is guaranteed to eventually complete with set_value() because some thread has already dequeued the task and is about to call, or is already calling, set_value(). This property allows whichever scheduler completed first to cancel the schedule operation on the other scheduler and then block waiting for the cancellation to finish before continuing to signal completion.

However, the current specification of run_loop does not have this behaviour, so if the other scheduler completes first there is no guarantee that the cancelled run_loop schedule operation will complete in a timely manner (or at all).

LeeHowes commented 22 hours ago

Do you have any thoughts on what the wording change needs to be? This whole design space is difficult to get right so the gap isn't especially surprising.

lewissbaker commented 18 hours ago

The remedy here would be to give run_loop::run-loop-opstate-base a prev pointer, making the queue a doubly linked list.

Then have run-loop-opstate<Rcvr> have two specialisations: a default one which registers a stop-callback, and one constrained by unstoppable_token<stop_token_of_t<env_of_t<Rcvr>>> which does not register a stop-callback.
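As a rough illustration of the two-specialisation split, the sketch below selects an operation-state layout from the token type. Everything in it is hypothetical: `never_stop_token_model`, `stoppable_token_model`, `is_unstoppable` and `opstate_model` are made-up names, and a hand-written trait stands in for the unstoppable_token concept so that the sketch compiles as C++17.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical token types: one that can never be stopped, one that can.
struct never_stop_token_model {};
struct stoppable_token_model {};

// Hypothetical trait approximating the unstoppable_token concept.
template <class Token>
struct is_unstoppable : std::false_type {};

template <>
struct is_unstoppable<never_stop_token_model> : std::true_type {};

// Default specialisation: a real implementation would store and register
// a stop-callback here.
template <class Token, bool Unstoppable = is_unstoppable<Token>::value>
struct opstate_model {
    static constexpr bool registers_stop_callback = true;
};

// Specialisation for unstoppable tokens: no stop-callback is needed.
template <class Token>
struct opstate_model<Token, true> {
    static constexpr bool registers_stop_callback = false;
};

static_assert(opstate_model<stoppable_token_model>::registers_stop_callback,
              "stoppable tokens get the stop-callback variant");
static_assert(!opstate_model<never_stop_token_model>::registers_stop_callback,
              "unstoppable tokens skip the stop-callback");
```

The point of the split is that receivers whose token can never request stop do not pay for storing or registering a callback.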

The one with the stop-callback would call set_stopped() inline in the stop-callback if it successfully removed the item from the queue before the worker thread removed it from the queue.
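A minimal sketch of the idea, assuming a mutex-protected intrusive queue. This is not proposed wording, and `op_base`, `loop_model`, `try_cancel` and `cancel_removes_middle_item` are hypothetical names. The prev pointer is what lets the stop-callback unlink an item from the middle of the queue in constant time, and the boolean returned by `try_cancel` tells the canceller immediately whether it won the race against the worker.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical model of run-loop-opstate-base with the added prev pointer.
struct op_base {
    enum class result { pending, value, stopped };
    op_base* next = nullptr;
    op_base* prev = nullptr;
    bool enqueued = false;
    result outcome = result::pending;
};

// Hypothetical model of the run_loop queue as an intrusive doubly linked list.
class loop_model {
    std::mutex mtx_;
    op_base* head_ = nullptr;
    op_base* tail_ = nullptr;

    // Unlinks op in O(1). The caller must hold mtx_.
    void unlink(op_base& op) {
        assert(op.enqueued);
        if (op.prev) op.prev->next = op.next; else head_ = op.next;
        if (op.next) op.next->prev = op.prev; else tail_ = op.prev;
        op.next = op.prev = nullptr;
        op.enqueued = false;
    }

public:
    void push_back(op_base& op) {
        std::lock_guard<std::mutex> lock(mtx_);
        op.prev = tail_;
        op.next = nullptr;
        if (tail_) tail_->next = &op; else head_ = &op;
        tail_ = &op;
        op.enqueued = true;
    }

    // What the stop-callback would do: remove the item if it is still
    // queued and complete it inline as stopped. Returns false if the
    // worker already dequeued it, in which case set_value() is on its way.
    bool try_cancel(op_base& op) {
        {
            std::lock_guard<std::mutex> lock(mtx_);
            if (!op.enqueued) return false;
            unlink(op);
        }
        op.outcome = op_base::result::stopped;  // would call set_stopped()
        return true;
    }

    // What the worker thread would do for one item.
    bool run_one() {
        op_base* op = nullptr;
        {
            std::lock_guard<std::mutex> lock(mtx_);
            op = head_;
            if (!op) return false;
            unlink(*op);
        }
        op->outcome = op_base::result::value;  // would call set_value()
        return true;
    }
};

// Cancels the middle of three queued items, then drains the queue.
inline bool cancel_removes_middle_item() {
    loop_model loop;
    op_base a, b, c;
    loop.push_back(a);
    loop.push_back(b);
    loop.push_back(c);

    const bool cancelled = loop.try_cancel(b);
    int ran = 0;
    while (loop.run_one()) ++ran;

    return cancelled && ran == 2 &&
           a.outcome == op_base::result::value &&
           b.outcome == op_base::result::stopped &&
           c.outcome == op_base::result::value &&
           !loop.try_cancel(a);  // too late: the worker already ran a
}
```

Either outcome of `try_cancel` gives the "synchronous cancellation" property discussed earlier in the thread: the operation has completed as stopped before the call returns, or it has already been dequeued and will complete with a value.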