Open morgante opened 1 week ago
It's a tough balancing act, because usually I hear that the adaptive scheduler was too aggressive. :)
If your input par_iter()
is an IndexedParallelIterator
, then you can force fine granularity by adding .with_max_len(1)
. That will get you roughly the same processing strategy as a bunch of individual spawn
s, but still based on join
recursion. That has its own work-stealing gotchas with latency though -- see #1054.
Thanks for the pointer, I could probably get indexed iteration working but I'm actually not sure I understand what wins I'm getting there over just spawning each thread separately.
My ideal would actually be something like this:
into_some_iter()
with for_each
.Is there a way to do this with rayon or will I need to build my own scheduler / switch to tokio?
but I'm actually not sure I understand what wins I'm getting there over just spawning each thread separately.
The benefit is that you would still be using the shared thread pool, even if your number of items is much larger than the number of threads. They'll just be "scheduled" individually if you add .with_max_len(1)
.
What is your source that you're calling par_iter()
on? Many types do support the indexed API already.
I have an iterator where the work to be done for each task varies widely—most items finish very quickly, but a few take orders of magnitude longer.
It appears from my quick experiment that Rayon is being somewhat inefficient here since
spawn
performs much better than theParallelIterator
. It looks like Rayon is still queuing up fast tasks behind the slow task on the same thread pool, or otherwise running fast tasks after the slow task.The optimal strategy here, which spawn seems to support, is to continue chewing through tasks on other threads while 1 thread stays stuck on the slow task.
Is there a way to do this with
par_iter
which has much nicer iteration semantics, or do I need to manage spawning myself?