rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
96.78k stars 12.5k forks source link

Happens-before relationship between wake and poll #128920

Open Ddystopia opened 1 month ago

Ddystopia commented 1 month ago

Location

Probably here https://doc.rust-lang.org/std/task/struct.RawWakerVTable.html

Summary

Is it safety invariant of executor to provide happens-before relationship between wake and poll? If not, is it a logical one? Or futures must not rely on that? But if there is no happens-before relationship, then data might be missed.

Related discussion: https://users.rust-lang.org/t/must-async-executor-provide-happens-before-relationship/114861?u=ddystopia

There are opposite answers that indicates that community is not really sure what is the answer to that question.

Also related discord question with answer in favor of requiring happens-before relationship

Darksonn commented 1 month ago

It's certainly not a safety requirement. If the runtime forgets to poll a future, that's not going to result in UB. But I would say that it's a reasonable logical requirement. If you call wake, there must be a call to poll after the call to wake. Nothing else makes sense.

Cerber-Ursi commented 1 month ago

I think the question is not "must the executor poll the future?" (it will not do this if future is dropped, for example), but "must the poll be able to witness every changes done before wake?"

Ddystopia commented 1 month ago

It's certainly not a safety requirement. If the runtime forgets to poll a future, that's not going to result in UB. But I would say that it's a reasonable logical requirement. If you call wake, there must be a call to poll after the call to wake. Nothing else makes sense.

I meant, UB if runtime polled but did not establish happens-before relationship

Darksonn commented 1 month ago

I mean, it's definitely not UB to call poll in parallel with a call to wake. Neither function is unsafe.

It's just ... when you call wake, you must call poll on the future after the call to wake. Calls to poll that happen in parallel with the wake don't count, and if a wake happens in parallel with the call to poll, then the runtime would need to call poll again afterwards.

kpreid commented 1 month ago

... when you call wake, you must call poll on the future after the call to wake.

Yes, that's the basic principle of making progress in async tasks. The problem is the details of what this means in multi-threaded conditions, when the poll() is accessing state shared with other threads (e.g. a channel receiver).

If the executor (that is, the thing which implements wake() such that it causes a poll()) doesn't provide a happens-before relationship, and the poll() is executed on a different thread than the wake() was, then it is possible that the poll(), despite having been started after the wake() call in a global time/causality sense, will observe a state of memory in which the effects of the thing done before the wake(), that the poll() is supposed to observe, haven't happened yet, so that the future being polled has effectively been polled before wake() instead of after. The happens-before relationship is required to exclude that case.

Lack of it must not cause UB, of course — the future may not rely on the executor‘s behavior for memory safety, since Future::poll() is not an unsafe fn — but it could cause lack of progress exactly as if the executor forgot to poll the future after wake().

I was confused about this myself, and the URLO thread previously linked now contains clarification.

Ddystopia commented 1 month ago

Not establishing happens-before relationship can lead to deadlock, for example:

Deadlock is sound though. So maybe mention in RawWakerVTable invariants that not providing happens-before relationship can lead to deadlock?

Ddystopia commented 1 month ago

Should it be also mentioned, that executor does not need to do any additional synchronization with LocalWaker? Because they are in one thread, so no need for potentially expensive Release/Acquire operations on the executor site?

Darksonn commented 1 month ago

Things happening on the same thread automatically have happens before relationships. No need for us to special case that.

Ddystopia commented 1 month ago

From Waker docs:

Wake up the task associated with this Waker.

As long as the executor keeps running and the task is not finished, it is guaranteed that each invocation of wake() (or wake_by_ref()) will be followed by at least one poll() of the task to which this Waker belongs. This makes it possible to temporarily yield to other tasks while running potentially unbounded processing loops.

Note that the above implies that multiple wake-ups may be coalesced into a single poll() invocation by the runtime.

Also note that yielding to competing tasks is not guaranteed: it is the executor’s choice which task to run and the executor may choose to run the current task again.

So there actually is a guarantee, that task would get polled after wake. Can it be modified to say that "there will be at least one poll with happens-before relationship to wake"? That will also rule out concerns about parallel polling and waking, as those are "ignored" by the rule