pkolaczk opened 4 years ago

I've got a bunch of tasks to process on many thread pools. E.g. some tasks are CPU-bound, so they should go to the CPU thread pool, but some other tasks are I/O-bound and I want them processed in a different pool dedicated to I/O. Additionally, all tasks need to use a common parent object with a lifetime shorter than `'static`, so I need a single scope wrapped around all that processing.

Is it possible to do with rayon?
We don't have a single scope tied to multiple pools, but you could use nested scopes for this, something like:
```rust
let common = common_init();
io_pool.scope(|io_scope| {
    io_scope.spawn(|_| early_io_work(&common, 1));
    io_scope.spawn(|_| early_io_work(&common, 2));
    cpu_pool.scope(|cpu_scope| {
        cpu_scope.spawn(|_| cpu_work(&common, 1));
        io_scope.spawn(|_| io_work(&common, 1));
        cpu_scope.spawn(|_| cpu_work(&common, 2));
        io_scope.spawn(|_| io_work(&common, 2));
        // ...
    }); // waits for cpu_scope to complete
    io_scope.spawn(|_| late_io_work(&common, 1));
    io_scope.spawn(|_| late_io_work(&common, 2));
}); // waits for io_scope to complete
drop(common);
```
Ok, now what if the number of thread pools is known only at runtime? Recursion?
Yeah, recursive nesting should be possible.
I guess we might be able to control that better within rayon, since we can do it iteratively and still enforce the `'scope` lifetime. That might look something like:
```rust
pub fn scopes<'scope, OP, R>(pools: &[ThreadPool], op: OP) -> R
where
    OP: for<'s> FnOnce(&'s [Scope<'scope>]) -> R + 'scope
```
(or `pools` could be any `IntoIterator<Item = &ThreadPool>`)
But this seems like an unusual thing to want. I can understand separating CPU/IO pools, although something like tokio might be better for the IO part. What is a situation where you'd want a dynamic number of thread pools?
I have a collection of objects, and processing some of them requires I/O, but I/O performance is bad if multiple threads try to access the same hard drive. However, it is perfectly fine to access two or more drives independently. So I need one thread pool per device, with the number of threads configured according to the capabilities of the device (which differ between SSDs and HDDs).
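For illustration, this is roughly the shape of what I'm building; the `DeviceKind` type and the thread counts below are just placeholders, not my actual code:

```rust
use rayon::{ThreadPool, ThreadPoolBuilder};

// Placeholder for whatever per-device metadata is really available.
enum DeviceKind {
    Ssd,
    Hdd,
}

// One dedicated pool per drive, sized according to what the drive can handle.
fn build_io_pools(devices: &[DeviceKind]) -> Vec<ThreadPool> {
    devices
        .iter()
        .map(|kind| {
            let threads = match kind {
                DeviceKind::Ssd => 8, // SSDs tolerate concurrent access well
                DeviceKind::Hdd => 1, // HDDs want strictly sequential access
            };
            ThreadPoolBuilder::new()
                .num_threads(threads)
                .build()
                .unwrap()
        })
        .collect()
}
```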
The first thing I searched for was a `map_async` method that could accept a lambda returning a `Future`. I found something like that in the parallel-stream crate, but unfortunately that crate doesn't do the other useful things that rayon does.
As for Tokio, I don't want to bring yet another big crate into my dependencies, because rayon works fine for the other parts.
Ok, I'm "almost" there. I've narrowed it down to just one error:
```rust
fn nest<'scope, 'pool, 'vec, OP, R>(pools: &'pool [ThreadPool], scopes: Vec<&'vec Scope<'scope>>, op: OP) -> R
where
    OP: for<'s> FnOnce(&'s [&'s Scope<'scope>]) -> R + 'scope + Send,
    R: Send,
    'pool: 'scope,
{
    if pools.len() > 0 {
        let new_len = pools.len() - 1;
        pools[new_len].scope(move |s| {
            let mut scopes: Vec<&Scope> = scopes;
            scopes.push(s);
            nest(&pools[0..new_len], scopes, |scopes| op(scopes))
        })
    } else {
        (op)(&scopes)
    }
}

pub fn scopes<'scope, 'pool, OP, R>(pools: &'pool [ThreadPool], op: OP) -> R
where
    OP: for<'s> FnOnce(&'s [&'s Scope<'scope>]) -> R + 'scope + Send,
    R: Send,
    'pool: 'scope,
{
    nest(pools, Vec::with_capacity(pools.len()), |scopes| op(scopes))
}
```
```text
error[E0623]: lifetime mismatch
  --> src/pools.rs:12:43
   |
3  | fn nest<'scope, 'pool, 'vec, OP, R>(pools: &'pool[ThreadPool], scopes: Vec<&'vec Scope<'scope>>, op: OP) -> R
   |    -------------------
   |    |
   |    these two types are declared with different lifetimes...
...
12 |    let mut scopes: Vec<&Scope> = scopes;
   |    ^^^^^^ ...but data from `scopes` flows into `scopes` here

error: aborting due to previous error

For more information about this error, try `rustc --explain E0623`.
```
I guess this has something to do with the fact that the scope reference added to the vector has a shorter lifetime than the previous scope references already in it (encoded in the type of the vector), but I don't know how to change the type of the vector so it can hold the new reference. In the end, I'd expect `'vec` to match the most deeply nested scope...
I need a Rust lifetime wizard now ;)
I managed to minimize the problem to the following snippet (took out the recursion):
```rust
fn nest<'scope, 'vec>(pool: &ThreadPool, scopes: Vec<&'vec Scope<'scope>>) {
    pool.scope(move |s: &Scope| {
        let mut v = Vec::new();
        v.push(s);
        v.push(scopes[0])
    })
}
```
```text
error[E0623]: lifetime mismatch
 --> src/pools.rs:8:16
  |
3 | fn nest<'scope, 'vec>(pool: &ThreadPool, scopes: Vec<&'vec Scope<'scope>>)
  |    -------------------
  |    |
  |    these two types are declared with different lifetimes...
...
8 |    v.push(scopes[0])
  |    ^^^^^^^^^ ...but data from `scopes` flows into `scopes` here
```
Commenting out either of the push lines makes the compiler happy. This is very weird to me and honestly I got stuck. Why is the compiler complaining here? And what does it mean by "these two types" when it is pointing at only one...?
I tried to reproduce this problem without Rayon and now I'm puzzled even more.
This compiles fine:
```rust
struct FooScope<'a> { _unused: &'a u32 }

fn with_foo_scope<'scope, F>(op: F)
where F: for<'s> FnOnce(&'s FooScope<'scope>) + 'scope + Send {
}

fn lifetimes<'scope, 'vec>(foos: &'vec [FooScope<'scope>]) {
    with_foo_scope(move |s: &FooScope| {
        let x = [s, &foos[0]];
    })
}
```
But very similar code with rayon doesn't:
```rust
fn nest<'scope, 'vec>(pool: &ThreadPool, scopes: &'vec [Scope<'scope>]) {
    pool.scope(move |s: &Scope| {
        let x = [s, &scopes[0]];
    })
}
```
```text
   |
17 | fn nest<'scope, 'vec>(pool: &ThreadPool, scopes: &'vec [Scope<'scope>])
   |    --------------------- these two types are declared with different lifetimes...
...
20 |    let x = [s, &scopes[0]];
   |    ^^^^^^^^^^ ...but data from `scopes` flows into `scopes` here
```
What is the difference?
@pkolaczk maybe this is due to invariance of `Scope<'scope>` in `'scope`.
That looks likely. So is there no way of keeping more than one scope in a collection together?
Indeed, this is a problem with invariance. The offending line is the marker:
```rust
pub struct FooScope<'scope> {
    marker: PhantomData<Box<dyn FnOnce(&FooScope<'scope>) + Send + Sync + 'scope>>,
}

fn with_foo_scope<'scope, F, R>(op: F)
where
    F: for<'s> FnOnce(&'s FooScope<'scope>) -> R + 'scope + Send,
    R: Send,
{
}

fn lifetimes<'scope, 'vec>(scopes: &'vec [FooScope<'scope>]) {
    with_foo_scope(move |s: &FooScope| {
        let x = [s, &scopes[0]]; // error: data from `scopes` flows into `scopes` here
    })
}
```
Because of the invariance, you would need all the scopes to have the same outer `'scope` lifetime. But `'vec` is necessarily a smaller lifetime, shrinking in each recursive `nest`, which means the closure can't outlive `'scope`.
So this seems to be something we could only provide from rayon internals, where we can build a `Vec<Scope<'scope>>` by value.
In case this is a question, the scope invariance is indeed necessary. Otherwise it would be possible to spawn closures with smaller lifetimes, but the scope would only enforce that they complete before the original scope lifetime, which is too late.
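For anyone following along, here is a tiny standalone sketch of that variance difference; the `Cov`/`Inv` types are made up for illustration and are not rayon code:

```rust
use std::marker::PhantomData;

// Covariant in 'a: a Cov<'long> coerces to a Cov<'short>.
struct Cov<'a>(PhantomData<&'a ()>);

// Invariant in 'a (like Scope<'scope>): no such coercion is allowed.
struct Inv<'a>(PhantomData<fn(&'a ()) -> &'a ()>);

fn use_cov<'short>(_: Cov<'short>, _: &'short u32) {}
fn use_inv<'short>(_: Inv<'short>, _: &'short u32) {}

fn demo<'long>(cov: Cov<'long>, inv: Inv<'long>) {
    let local = 0u32;
    use_cov(cov, &local); // OK: 'long shrinks to the borrow of `local`
    // use_inv(inv, &local); // error: `local` does not live long enough
    let _ = inv;
}
```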
Ok, makes sense. Can you give me any hints on where to start from inside rayon?
See rayon-core/src/scope/mod.rs: `fn scope` basically just calls `Scope::new`, then `complete`, which executes the closure (catching panics), and then `steal_till_jobs_complete` for the spawns. At a glance this doesn't seem too hard to extend to multiple scopes, although scope creation wants to be `in_worker` for the target pool first. I was working on a branch to avoid that for #562, which would probably make it easier here too...
Thank you. Before I dive into Rayon internals, I managed to unblock myself with a little bit of `unsafe`. FYI, this is my code:
```rust
unsafe fn adjust_lifetime<'s, 'a, 'b>(scope: &'s Scope<'a>) -> &'s Scope<'b> {
    std::mem::transmute::<&'s Scope<'a>, &'s Scope<'b>>(scope)
}

fn nest<'scope, OP, R>(pools: &[&ThreadPool], scopes: Vec<&Scope<'scope>>, op: OP) -> R
where
    OP: for<'s> FnOnce(&'s [&'s Scope<'scope>]) -> R + 'scope + Send,
    R: Send,
{
    if pools.len() > 0 {
        pools[0].scope(move |s: &Scope| {
            let mut vec = scopes;
            vec.push(unsafe { adjust_lifetime(s) });
            nest(&pools[1..], vec, op)
        })
    } else {
        (op)(&scopes)
    }
}

pub fn scopes<'scope, OP, R>(pools: &[&ThreadPool], op: OP) -> R
where
    OP: for<'s> FnOnce(&'s [&'s Scope<'scope>]) -> R + 'scope + Send,
    R: Send,
{
    nest(pools, Vec::with_capacity(pools.len()), op)
}
```
Is such use of `unsafe` safe in this case? Seen from the "outside" world, all scopes have the same lifetime as the `op` lambda. I think it is ok to make the compiler forget that they are actually nested and created/dropped sequentially. WDYT?
And this is how to use it (for anybody who is interested):
```rust
#[test]
fn test_two_pools() {
    let pool1 = ThreadPoolBuilder::new().num_threads(1).build().unwrap();
    let pool2 = ThreadPoolBuilder::new().num_threads(2).build().unwrap();
    scopes(&[&pool1, &pool2], |scopes| {
        let s1 = scopes[0];
        let s2 = scopes[1];
        for _ in 1..10000 {
            s1.spawn(|_| println!("thread-pool 1: {:?}", std::thread::current().id()));
        }
        for _ in 1..10000 {
            s2.spawn(|_| println!("thread-pool 2: {:?}", std::thread::current().id()));
        }
    });
}
```
I think your unsafe code is OK, in that nothing will actually observe the transmuted lifetimes in a way that could cause unsoundness, but I hope we can do better...
I'm actually wondering now -- I don't think `rayon::scope` really needs the `OP: 'scope` constraint, because it's executed synchronously. We need `'scope` for spawns because they run asynchronously, so they must not borrow from some inner scope. But something like this should be fine:
```rust
let vec: Vec<u32> = (0..100).collect();
let slice = &vec[..];
rayon::scope(|s: &rayon::Scope<'static>| {
    // We should be able to borrow here, but the spawns have to be static
    for &i in slice {
        s.spawn(move |_| println!("hello {}!", i));
    }
});
```
If we relax `OP`, then I think your original nesting code would just work, creating outermost `'scope` scopes all the way down, while variance happily narrows the `'vec` reference lifetimes.
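Concretely, with that bound relaxed, I'd expect the safe version to look just like your `unsafe` one minus the transmute; an untested sketch against that proposed change:

```rust
use rayon::{Scope, ThreadPool};

fn nest<'scope, OP, R>(pools: &[&ThreadPool], scopes: Vec<&Scope<'scope>>, op: OP) -> R
where
    OP: for<'s> FnOnce(&'s [&'s Scope<'scope>]) -> R + Send,
    R: Send,
{
    match pools.split_first() {
        Some((first, rest)) => first.scope(move |s| {
            // Rebinding lets variance shrink the element reference lifetimes to
            // match `s`, while the scopes themselves stay invariant in 'scope.
            let mut scopes = scopes;
            scopes.push(s);
            nest(rest, scopes, op)
        }),
        None => op(&scopes),
    }
}

pub fn scopes<'scope, OP, R>(pools: &[&ThreadPool], op: OP) -> R
where
    OP: for<'s> FnOnce(&'s [&'s Scope<'scope>]) -> R + Send,
    R: Send,
{
    nest(pools, Vec::with_capacity(pools.len()), op)
}
```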
See #785 -- I hope you don't mind that I adapted your nesting code into a test case.
If anyone comes across this, here's a code snippet. It uses `in_place_scope`, which is much faster when you have many pools, ~~and llvm seems to apply a tail-call optimization which is nice~~ (edit: nvm, no tail-call optimization here).
```rust
fn scope_all<'a, 'scope>(
    pools: impl ExactSizeIterator<Item = &'a rayon::ThreadPool> + Send,
    f: impl FnOnce(Vec<&rayon::Scope<'scope>>) + Send + 'scope,
) {
    #[inline]
    fn recursive_scope<'a, 'scope>(
        mut pools: impl Iterator<Item = &'a rayon::ThreadPool> + Send,
        scopes: Vec<&rayon::Scope<'scope>>,
        f: impl FnOnce(Vec<&rayon::Scope<'scope>>) + Send + 'scope,
    ) {
        match pools.next() {
            None => return f(scopes),
            Some(pool) => {
                pool.in_place_scope(move |s| {
                    let mut scopes = scopes;
                    scopes.push(s);
                    recursive_scope(pools, scopes, f);
                });
            }
        }
    }

    let vec = Vec::with_capacity(pools.len());
    recursive_scope(pools, vec, f)
}
```
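And a usage sketch, in case it helps; the pool sizes and the borrowed `Vec` here are arbitrary:

```rust
fn main() {
    let io_pool = rayon::ThreadPoolBuilder::new().num_threads(1).build().unwrap();
    let cpu_pool = rayon::ThreadPoolBuilder::new().num_threads(4).build().unwrap();

    // A shared parent value with a non-'static lifetime, borrowed by spawns in both pools.
    let common: Vec<u32> = (0..8).collect();
    let common = &common;

    let pools = [&io_pool, &cpu_pool];
    scope_all(pools.iter().copied(), |scopes| {
        let (io, cpu) = (scopes[0], scopes[1]);
        for i in 0..common.len() {
            io.spawn(move |_| println!("io task {}: {}", i, common[i]));
            cpu.spawn(move |_| println!("cpu task {}: {}", i, common[i]));
        }
    });
}
```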
@stevenengler Nice! But yeah, I don't expect tail calls, because each scope has cleanup (waiting for its spawns) as it returns.