computablee / DotMP

A collection of powerful abstractions for parallel programming in .NET with an OpenMP-like API.
https://computablee.github.io/DotMP/
GNU Lesser General Public License v2.1
29 stars 7 forks source link

Add `Parallel.ForCollapse` and `Parallel.ForReductionCollapse` #39

Closed computablee closed 1 year ago

computablee commented 1 year ago

OpenMP provides a collapse clause for parallel-for loops as follows:

#pragma omp for collase(2)
for (int i = 0; i < X; i++) {
    for (int j = 0; j < Y; j++) {
        work(i, j);
    }
}

This allows for two adjacently-nested for loops to both be parallelized. Without the collapse(2), the parallel scheduler would have X iterations to work with, but with collapse(2), the parallel scheduler has X*Y iterations to work with.

The way I see this working in DotMP would be something like this:

DotMP.Parallel.ForCollapse(0, X, 0, Y, (i, j) => {
    work(i, j);
});

The main problem is managing the scheduler. The scheduler is already quite a mess because of reductions, so it has to keep up with multiple omp_fn delegates and decide which one to call based on a reduction flag. If a refactor is not done, this problem will only worsen. Thus, this issue asks for a lot.

  1. Refactor the core scheduler to more elegantly handle actions with different parameters.
  2. Implement Parallel.ForCollapse and integrate with the core scheduler elegantly.
  3. Implement Parallel.ForReductionCollapse and integrate with the core scheduler elegantly.
  4. Implement a Parallel.ParallelForCollapse wrapper.
  5. Implement a Parallel.ParallelForReductionCollapse wrapper.
  6. Edit the GEMM example to showcase this feature.

This can be done in pieces, of course. Any work towards this is helpful, though the core scheduler should absolutely be refactored before adding new features.

I would also like to see opinions on how to incorporate higher collapse orders, like a collapse(3) or a collapse(4) elegantly.

computablee commented 1 year ago

Self-assigning this and working on this today.