rayon-rs / rayon

Rayon: A data parallelism library for Rust
Apache License 2.0
10.6k stars 488 forks source link

Document order preservation #669

Open freelon opened 5 years ago

freelon commented 5 years ago

In the docs it doesn't say anything about whether par_iter()s preserve the original order. #551 says it does, but it would be nice to find this somewhere in the docs, maybe on the page discussing iterators in general.

cuviper commented 5 years ago

Sure, we can document that. The general expectation will be that order is preserved, but there are exceptions -- e.g. ParallelBridge documents that it does not preserve order.

dpc commented 2 years ago

I keep needing this functionality through they years (parallel processing over iterator, while preserving order) and I ended up implementing this manually myself in one of my projects (and possibly again in some others)

The idea is to dispatch work to parallel workers with sequential ID, and then assemble it back when converting to a sorted iter, keeping out_of_order items around on the receiving end until they are needed. Theoretically this can lead to unbounded memory consumption, but in practice - if you dispatch stuff in order, and your processing time is even roughly uniform, the result will arrive roughly in order. It wouldn't be difficult to implement a gating of dispatching to workers based on some shared atomic, to completely prevent possibility of memory explosion, though I never bothered.

Would there be an interest to bring this to rayon?

I'm thinking approach would be something like:

normal_iterator.par_ordered_bridge().map(|item| ... ).filter(|item| ...).to_iter(); // convert to normal iterator
normal_iterator.par_ordered_bridge().map(|item| ... ).filter(|item| ...).collect(); // just collect in order
dpc commented 2 years ago

When I started to look into it it was quickly apparent, that the fact that such bridge would want to pass something else the the underlying item, does not rely fit well into existing types. It would probably require adding another associated type, which would be quite intrusive. So ... I guess I'm not doing it. :D