rayon-rs / rayon

Rayon: A data parallelism library for Rust
Apache License 2.0
11.08k stars 501 forks source link

Add synchronization guarantees for `ParallelIterator::for_each*` #1172

Open andrewsonin opened 5 months ago

andrewsonin commented 5 months ago

Consider the following code:

use std::sync::atomic::{AtomicUsize, Ordering};

use rayon::iter::{IntoParallelIterator, ParallelIterator};

fn main() {
    let mut counter = AtomicUsize::new(0);

    (0..10).into_par_iter()
        .for_each(
            |_| {
                counter.fetch_add(1, Ordering::Relaxed);
            }
        );
    // Can these operations fail?
    assert_eq!(counter.load(Ordering::Relaxed), 10);
    assert_eq!(*counter.get_mut(), 10);
}

The C++20 memory model itself does not guarantee that these assertions will not fail.

Although the rayon implementation does guarantees this (upon .for_each completion the mutex is unlocked, which is always a release-store), the ParallelIterator documentation does not reflect this.

adamreichold commented 5 months ago

I suspect the wording std uses for std::thread::scope might be useful here, i.e.

In terms of atomic memory orderings, the completion of the associated thread synchronizes with this function returning. In other words, all operations performed by that thread happen before all operations that happen after join returns.

from https://doc.rust-lang.org/stable/std/thread/struct.ScopedJoinHandle.html#method.join, since parallel iterators can borrow from the stack and when control resumes on the originator thread all tasks have been joined as if they were running on their own threads.

At least I suspect that this would not limit future implementation options in any material way it is sort of necessary to make the just borrowing safe, e.g. the counter.get_mut() will not use atomic accesses at all and hence cannot soundly race with the fetch_add calls.