Open Darksonn opened 3 years ago
yeah, i think you're right that the waiter should be in an UnsafeCell
. this should actually be pretty straightforward to fix, i think. good catch, @kprotty!
yeah, i think you're right that the waiter should be in an
UnsafeCell
. this should actually be pretty straightforward to fix, i think. good catch, @kprotty!
Tried following your guide, and still fails. What am I missing?
diff --git a/tokio/src/sync/batch_semaphore.rs b/tokio/src/sync/batch_semaphore.rs
index 803f2a18..49d562bc 100644
--- a/tokio/src/sync/batch_semaphore.rs
+++ b/tokio/src/sync/batch_semaphore.rs
@@ -20,6 +20,7 @@ use crate::loom::sync::atomic::AtomicUsize;
use crate::loom::sync::{Mutex, MutexGuard};
use crate::util::linked_list::{self, LinkedList};
+use std::cell::UnsafeCell as UnsafeCellSTD;
use std::future::Future;
use std::marker::PhantomPinned;
use std::pin::Pin;
@@ -65,7 +66,7 @@ pub enum TryAcquireError {
pub struct AcquireError(());
pub(crate) struct Acquire<'a> {
- node: Waiter,
+ node: UnsafeCellSTD<Waiter>,
semaphore: &'a Semaphore,
num_permits: u32,
queued: bool,
@@ -458,7 +459,7 @@ impl Future for Acquire<'_> {
impl<'a> Acquire<'a> {
fn new(semaphore: &'a Semaphore, num_permits: u32) -> Self {
Self {
- node: Waiter::new(num_permits),
+ node: UnsafeCellSTD::new(Waiter::new(num_permits)),
semaphore,
num_permits,
queued: false,
@@ -476,7 +477,7 @@ impl<'a> Acquire<'a> {
let this = self.get_unchecked_mut();
(
- Pin::new_unchecked(&mut this.node),
+ Pin::new_unchecked(this.node.get_mut()),
&this.semaphore,
this.num_permits,
&mut this.queued,
@@ -499,11 +500,11 @@ impl Drop for Acquire<'_> {
let mut waiters = self.semaphore.waiters.lock();
// remove the entry from the list
- let node = NonNull::from(&mut self.node);
+ let node = NonNull::from(self.node.get_mut());
// Safety: we have locked the wait list.
unsafe { waiters.queue.remove(node) };
- let acquired_permits = self.num_permits as usize - self.node.state.load(Acquire);
+ let acquired_permits = self.num_permits as usize - self.node.get_mut().state.load(Acquire);
if acquired_permits > 0 {
self.semaphore.add_permits_locked(acquired_permits, waiters);
}
Further analysis has showed that it is not so simple.
@blasrodri The usage of get_mut
on the UnsafeCell there looks like an issue. It becomes more vivid where an atomic load, which implies shared references, is being done on a mutable reference.
Worst case (just speculation) this might be more fundamental of a change. With the existence of Pin<&mut Self>
from the Future trait, and the need to have other threads performing updates concurrently on the memory of something in that Pin<&mut Self>
, you would need to have that thread-shared memory say to the compiler that "a mutable reference and immutable references to this is not UB" which UnsafeCell doesn't seem to currently denote. This would also imply that things like futures-intrusive
are unsound as well.
Fwiw, this pattern also appears in sync::Notify iirc so it may not be only this Future type to address. One way to handle this is by heap allocating the shared state in order to keep the "only shared references" property in the face of Pin<&mut Self>
. While this addresses it, its a poor compromise to introduce runtime overhead simply due to not enough language support.
Relevant thread: Are sound self-referential structs possible without boxing?
The answer appears to be "No, it's not possible".
Miri should stop complaining about this once #4397 is merged.
FWIW, It's possible to fix this soundness issue while still using intrusive Futures. You just need to store the intrusive Node outside the Future that's being polled, then have said Future reference it. This avoids the Pin<&mut Self> transitively creating a mutable reference to the intrusive Node. In reality however, the Node would still be stored in the async fn compiler's generated Future which would still have this issue of transitive pin mut ref when polled but at least its outside user written code and now a fundamental issue with Rust async.
In general, we are waiting for a resolution to the problem where async Rust itself is unsound. I don't think what you suggested changes anything regarding what miri will accept.
Note that JoinSet
also trivially triggers miri for me right now:
let mut join_set = JoinSet::new();
join_set.spawn(async move {});
join_set.join_next().await;
@udoprog That error is unrelated to what's discussed in this issue. I filed #5693.
The following code has UB:
To see this, run it with:
The problem was discovered by @kprotty on the discord server.
The solution probably looks something like putting the
Waiter
inAcquire
inside anUnsafeCell
and only using raw pointers to access it, instead of mutable references.Click to see miri failure
``` error: Undefined Behavior: trying to reborrow for SharedReadWrite at alloc1987, but parent tag <14056> does not have an appropriate item in the borrow stack --> /home/alice/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/non_null.rs:212:18 | 212 | unsafe { &*self.as_ptr() } | ^^^^^^^^^^^^^^^ trying to reborrow for SharedReadWrite at alloc1987, but parent tag <14056> does not have an appropriate item in the borrow stack | = help: this indicates a potential bug in the program: it performed an invalid operation, but the rules it violated are still experimental = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information = note: inside `std::ptr::NonNull::