rayon-rs / rayon

Rayon: A data parallelism library for Rust
Apache License 2.0
11.08k stars 501 forks source link

Documentation for `ParallelIterator::fold` and/or `ParallelIterator::reduce` seam to be incorrect #1153

Closed BlockListed closed 7 months ago

BlockListed commented 7 months ago

The documentation for ParallelIterator::fold currently (1.10.0) shows the following examples:

use rayon::prelude::*;

let s =
    ['a', 'b', 'c', 'd', 'e']
    .par_iter()
    .map(|c: &char| format!("{}", c))
    .reduce(|| String::new(),
            |mut a: String, b: String| { a.push_str(&b); a });

assert_eq!(s, "abcde");
use rayon::prelude::*;

let s =
    ['a', 'b', 'c', 'd', 'e']
    .par_iter()
    .fold(|| String::new(),
            |mut s: String, c: &char| { s.push(*c); s })
    .reduce(|| String::new(),
            |mut a: String, b: String| { a.push_str(&b); a });

assert_eq!(s, "abcde");

If the docs for ParallelIterator::reduce are to be believed, then the operation passed to it has to be associative, which String::push_str definitely isn't. Therefore according to my understanding of rayon, at least one of these two is incorrectly documented.

adamreichold commented 7 months ago

the operation passed to it has to be associative, which String::push_str definitely isn't

Isn't it? If you ignore the identity of the underlying storage, then a.push_str(&b) is just a concatenated with b. It only reuses a's memory to avoid an unnecessary allocation.

In particular, if a, b and c are String, then

a.push_str(&b);
a.push_str(&c);

stores the same value in a as

b.push_str(&c);
a.push_str(&b);
adamreichold commented 7 months ago

Or put differently, the operation is

fn concat(mut a: String, b: String) -> String {
  a.push_str(&b);
  a
}

which does fulfil

concat(a, concat(b, c)) == concat(concat(a, b), c)
BlockListed commented 7 months ago

Just realized my issue, I confused associative with commutative. My understanding was that if I had a list [a, b], turned it into a parallel iterator and called reduce on it, it was not determined if the reduction function f(x,y) was called as f(a, b) or f(b, a).

BlockListed commented 7 months ago

Probably should've clicked on the link in the reduce docs, however the fact that the example for reduce was commutative didn't help my confusion.

cuviper commented 7 months ago

We do preserve the order of operands, with the exception of par_bridge().