rayon-rs / rayon

Rayon: A data parallelism library for Rust
Apache License 2.0
11.02k stars 501 forks source link

parallel iter for combinations? #719

Open ian-p-cooke opened 4 years ago

ian-p-cooke commented 4 years ago

I'm currently using https://docs.rs/itertools/0.8.2/itertools/trait.Itertools.html#method.combinations and found a great speed-up by collecting into a Vec and then using into_par_iter. I'm wondering if that's the best way though? Is there a way to avoid the collect?

example:

items.combinations(4)
        .collect::<Vec<Vec<Item>>>()
        .into_par_iter()
        .for_each(|_| blah)

I'm amenable to implementing whatever's necessary for 'native' par_iter support for the Combinations struct but could use a pointer to documentation if that's available.

Thanks!

cuviper commented 4 years ago

I don't think we can implement this for itertools::Combinations, because we don't have access to its internals to try and do anything in parallel. You could use it like any arbitrary Iterator using par_bridge though, incurring some amount of synchronization cost. That might still be faster than collecting all combinations to a Vec first.

We could think about our own native Combinations adapter, but I think it will be quite complex to implement this in a lazy way. Collecting the seed items to a Vec first would be a little easier, as then you're just parallelizing known index ranges, but I don't have an idea how to deal with arbitrary combination lengths in a nice way.

ian-p-cooke commented 4 years ago

please do think about a native Combinations but it's ok for me to use collect for now. My collected Vec is only about 15Kb and the win from collect + into_par_iter is big (1.4s to .4s). Since my goal was to be under a second that should be good enough.

I tried par_bridge after reading your comment and that was slower (0.4s vs 1.2s) than collecting first so I'm going to stick with the collect. Thank you for the reference though, I wasn't aware of it.

Thanks for a great crate! My serial implementation was pretty much at it's limit and adding effectively one line of code to reach my goal is awesome.