Open freelon opened 5 years ago
Sure, we can document that. The general expectation will be that order is preserved, but there are exceptions -- e.g. ParallelBridge
documents that it does not preserve order.
I keep needing this functionality through they years (parallel processing over iterator, while preserving order) and I ended up implementing this manually myself in one of my projects (and possibly again in some others)
The idea is to dispatch work to parallel workers with sequential ID, and then assemble it back when converting to a sorted iter, keeping out_of_order
items around on the receiving end until they are needed. Theoretically this can lead to unbounded memory consumption, but in practice - if you dispatch stuff in order, and your processing time is even roughly uniform, the result will arrive roughly in order. It wouldn't be difficult to implement a gating of dispatching to workers based on some shared atomic, to completely prevent possibility of memory explosion, though I never bothered.
Would there be an interest to bring this to rayon?
I'm thinking approach would be something like:
normal_iterator.par_ordered_bridge().map(|item| ... ).filter(|item| ...).to_iter(); // convert to normal iterator
normal_iterator.par_ordered_bridge().map(|item| ... ).filter(|item| ...).collect(); // just collect in order
When I started to look into it it was quickly apparent, that the fact that such bridge would want to pass something else the the underlying item, does not rely fit well into existing types. It would probably require adding another associated type, which would be quite intrusive. So ... I guess I'm not doing it. :D
In the docs it doesn't say anything about whether par_iter()s preserve the original order. #551 says it does, but it would be nice to find this somewhere in the docs, maybe on the page discussing iterators in general.