Closed fogodev closed 1 year ago
Even stranger, if you use a slightly modified version, you trigger an infinite recursion:
use futures_concurrency::future::{Join, TryJoin};
use std::future::ready;
use tokio::time::{sleep, Duration};

async fn process(
    iter: impl IntoIterator<Item = &LoudDrop<i32>>,
    fail: bool,
) -> Result<Vec<LoudDrop<i32>>, Vec<()>> {
    if fail {
        return Err(vec![]);
    } else {
        sleep(Duration::from_secs(5)).await;
    }
    Ok(iter
        .into_iter()
        .map(|i| ready(i.clone()))
        .collect::<Vec<_>>()
        .join()
        .await)
}

#[derive(Clone)]
struct LoudDrop<T>(T);

impl<T> Drop for LoudDrop<T> {
    fn drop(&mut self) {
        println!("DROP");
    }
}

#[tokio::main]
async fn main() -> Result<(), Vec<()>> {
    let v = (0..4).map(LoudDrop).collect::<Vec<_>>();
    //
    // COMMENT THIS LINE AND YOU GET A SEGFAULT
    // UNCOMMENT THIS LINE AND YOU GET AN INFINITE RECURSION
    //
    println!("{}", v.len());
    (
        process(v.iter().take(2), true),
        process(v.iter().take(0), false),
    )
        .try_join()
        .await?;
    Ok(())
}
Miri is detecting UB in drop_initialized_values, here:
https://github.com/yoshuawuyts/futures-concurrency/blob/a2a7f6a17ada682e01a0096e0b2a223c886eba39/src/future/try_join/tuple.rs#L72
Also worth noting: these errors only seem to happen with Tokio for some reason. I just tested my example with async_std and it worked fine.
Edit: I also tested with smol and the futures crate executor, and it crashes with them too. The unsafe usage that @matheus-consoli mentioned suggests that the outputs are only being properly initialized with the async_std crate; with the other executors it's trying to drop invalid data.
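Since the behavior seems to vary by executor, one way to take the runtime out of the picture is to drive a future with a tiny hand-rolled executor. The sketch below uses only the standard library; block_on, ThreadWaker and YieldOnce are illustrative names, not part of futures_concurrency or any runtime. Note that tokio::time::sleep needs a Tokio runtime, so a reproduction on top of this harness would have to swap the sleep for something like YieldOnce, which is an assumption about what the sleep contributes (a single Pending before completion).

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Hypothetical waker: unparks the thread that is blocked in `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Minimal single-future executor: poll, park until woken, poll again.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            Poll::Pending => thread::park(),
        }
    }
}

/// Hypothetical stand-in for a timer: returns `Pending` exactly once.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            // Ask to be polled again right away, then yield.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

fn main() {
    let value = block_on(async {
        YieldOnce(false).await;
        7
    });
    println!("{value}");
}
```

Because the harness has no I/O or timers, it may also be easier to run under Miri than a full runtime.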
Hey, thanks for reporting this!
I'm not behind a computer right now, but from reading the code it seems we may be setting the wrong state on line 36 of try_join.rs. We're tagging the entry as "ready" rather than "completed", and then dropping the data. This leads to a double-drop, which is what I believe Miri is finding.
We do want to drop inline, so I think what we should do instead is mark it as "completed" and then drop it inline. If anyone wants to try that out and see if it still fails, that would be helpful. Otherwise I can give this a try when I'm back at work next week.
Tested here: calling $this.state[$fut_idx].set_none(); just after the ManuallyDrop::drop line solved this issue. Setting the PollState to none was the closest thing I could find to marking it as "completed" instead of "ready". I introduced this line in both join and try_join for tuples.
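To make the idea concrete, here is a minimal, self-contained sketch of the pattern being discussed: an output slot whose state must leave "ready" as soon as the value is dropped inline, so that the container's own Drop does not drop it a second time. This is not the crate's code. Slot, SlotState, Counted and drop_inline are hypothetical names, and MaybeUninit stands in for whatever storage the real implementation uses.

```rust
use std::cell::Cell;
use std::mem::MaybeUninit;
use std::rc::Rc;

/// Hypothetical stand-in for the crate's per-future poll state.
#[derive(Clone, Copy, PartialEq, Debug)]
enum SlotState {
    Pending, // no output written yet
    Ready,   // output written and still owned by the slot
    None,    // output already dropped (or moved out)
}

/// Counts drops so we can check the value is dropped exactly once.
struct Counted(Rc<Cell<usize>>);

impl Drop for Counted {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// One output slot plus the state saying whether it holds a live value.
struct Slot<T> {
    value: MaybeUninit<T>,
    state: SlotState,
}

impl<T> Slot<T> {
    fn new() -> Self {
        Slot { value: MaybeUninit::uninit(), state: SlotState::Pending }
    }

    fn write(&mut self, value: T) {
        assert_eq!(self.state, SlotState::Pending);
        self.value.write(value);
        self.state = SlotState::Ready;
    }

    /// Drop the output early (e.g. on an error path) and record that fact.
    fn drop_inline(&mut self) {
        if self.state == SlotState::Ready {
            // SAFETY: `Ready` means `value` is initialized and not yet dropped.
            unsafe { self.value.assume_init_drop() };
            // Without this line, `Drop for Slot` would drop the value again.
            self.state = SlotState::None;
        }
    }
}

impl<T> Drop for Slot<T> {
    fn drop(&mut self) {
        if self.state == SlotState::Ready {
            // SAFETY: same invariant as in `drop_inline`.
            unsafe { self.value.assume_init_drop() };
        }
    }
}

/// Returns how many times the stored value was dropped.
fn drops_after_inline_drop() -> usize {
    let counter = Rc::new(Cell::new(0));
    {
        let mut slot = Slot::new();
        slot.write(Counted(counter.clone()));
        slot.drop_inline();
    } // `slot` is dropped here; its `Drop` must skip the already-dropped value
    counter.get()
}

fn main() {
    println!("drops: {}", drops_after_inline_drop());
}
```

If the state were left at Ready after the inline drop, the second assume_init_drop would run on already-dropped data, which is the kind of double-drop described above.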
I can help further; I just need some guidance on how to proceed. I'm putting the test output here. Most of the tests were just expecting a "ready" state and should be simple to fix for a "none" state or for a new "completed" state. The ones that worry me more are the ones about leaking, namely the assertion at https://github.com/yoshuawuyts/futures-concurrency/blob/a2a7f6a17ada682e01a0096e0b2a223c886eba39/src/future/try_join/tuple.rs#L395-L397
I'm trying to reproduce this bug today, and I believe the bug may actually be in Join, not TryJoin. The following test case fails:
use futures_concurrency::future::{Join, TryJoin};
use std::future::ready;
use tokio::time::{sleep, Duration};

async fn process_not_fail() -> Result<Vec<i32>, ()> {
    sleep(Duration::from_millis(100)).await;
    Ok(vec![ready(1), ready(2)].join().await)
}

async fn process_fail() -> Result<Vec<i32>, ()> {
    Err(())
}

#[tokio::test]
async fn test() {
    let res = (process_fail(), process_not_fail()).try_join().await;
    assert!(res.is_err());
}
But if we replace Vec::join with Array::join, then the error stops triggering:
use futures_concurrency::future::{Join, TryJoin};
use std::future::ready;
use tokio::time::{sleep, Duration};

async fn process_not_fail() -> Result<[i32; 2], ()> {
    sleep(Duration::from_millis(100)).await;
    Ok([ready(1), ready(2)].join().await) // <- changed this line
}

async fn process_fail() -> Result<[i32; 2], ()> {
    Err(())
}

#[tokio::test]
async fn test() {
    let res = (process_fail(), process_not_fail()).try_join().await;
    assert!(res.is_err());
}
I want to see if I can further reduce this to see whether I can trigger it without using try_join later on.
If we replace the try join of the tuple with a try join of an array, we trigger the same error. This should make it easier to debug, since we no longer need to track down the source of the error through macro frames in the backtrace.
use futures_concurrency::future::{Join, TryJoin};
use futures_core::Future;
use std::{future::ready, pin::Pin};
use tokio::time::{sleep, Duration};

pub type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

async fn process_not_fail() -> Result<Vec<i32>, ()> {
    sleep(Duration::from_millis(100)).await;
    Ok(vec![ready(1), ready(2)].join().await)
}

async fn process_fail() -> Result<Vec<i32>, ()> {
    Err(())
}

#[tokio::test]
async fn test() {
    let a: BoxFuture<'static, _> = Box::pin(process_fail());
    let b: BoxFuture<'static, _> = Box::pin(process_not_fail());
    let res = [a, b].try_join().await;
    assert!(res.is_err());
}
Ha, I found the bug. It turns out the error was in all of the implementations of TryJoin, not just the tuple one. Fixing it up for all impls and filing a patch now!
I've been using the futures_concurrency crate as much as I can at Spacedrive; I prefer its functions-and-traits approach to the macros approach of the futures crate. But I was having a weird crash problem with a TryJoin. I was able to reproduce a minimal example as follows. On macOS it fires a SIGABRT and on Linux a SIGSEGV (saying that the faulty address is 0x0).