LLNL / RAJA

RAJA Performance Portability Layer (C++)
BSD 3-Clause "New" or "Revised" License

MultiReducer and Reducer Design #1648

Open MrBurmark opened 1 month ago

MrBurmark commented 1 month ago

Here are some thoughts I wanted to capture while designing and testing multi-reducers.

I'm currently designing around a multi-reducer interface that looks like this:

using reduce_policy = RAJA::seq_multi_reduce;
using exec_policy = RAJA::seq_exec;

RAJA::Index_type N = 100;
size_t num_bins = 10;
const int* bins = ...;
const double* data = ...;
double* sums = ...;

// Constructor taking num_bins and initializing all bins to the identity of the operator (in this case 0)
RAJA::MultiReduceSum<reduce_policy , double> multi_reduce_sum(num_bins);

RAJA::forall<exec_policy>(RAJA::RangeSegment(0, N), [=](RAJA::Index_type i) {
  multi_reduce_sum[bins[i]] += data[i];
});

// get each value individually
for (size_t bin = 0; bin < num_bins; ++bin) {
  sums[bin] = multi_reduce_sum.get(bin) ;
}
// or get all values in one call put into the values referenced by an iterator
multi_reduce_sum.get_all(sums);

multi_reduce_sum.reset();

// should we have a constructor that takes num_bins and an initial value to copy to each bin?
double init_value = 1;
RAJA::MultiReduceSum<reduce_policy , double> multi_reduce_sum(num_bins, init_value);

// should we have a constructor that takes num_bins and an iterator to initial values?
const double* init_values = ...;
RAJA::MultiReduceSum<reduce_policy , double> multi_reduce_sum(num_bins, init_values);

// should we have a constructor that takes a container to get num_bins and initial_values?
const std::vector<double> container_of_init_values(...);
RAJA::MultiReduceSum<reduce_policy , double> multi_reduce_sum(container_of_init_values);

// what should accumulating to a bin look like?
multi_reduce_sum[bins[i]] += data[i];
multi_reduce_sum(bins[i]) += data[i];
multi_reduce_sum.add(bins[i], data[i]);

// What should the interface to get the final values look like?
for (size_t bin = 0; bin < num_bins; ++bin) {

  // use get with a bin number to get each value individually?
  sums[bin] = multi_reduce_sum.get(bin) ;

  // allow the reference returned by indexing to implicitly cast to a value, similar to normal reducers?
  sums[bin] = multi_reduce_sum[bin] ;

}

// Should the multi reduce object act like a container?
for (auto const& val : multi_reduce_sum) { ... }

// Should there be an interface to get all of the final values at once and what should it look like?

// use a function that assigns each of the values to an output iterator?
multi_reduce_sum.get_all(sums);

// use a function to get a view of all the values in the multi-reducer?
auto view = multi_reduce_sum.get_all();

// should reset allow you to change the number of bins?
size_t new_num_bins = 14;
multi_reduce_sum.reset(new_num_bins);
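To make the questions above concrete, here is a minimal sequential sketch of one possible combination of the options: indexing returns a proxy that supports `+=` and converts implicitly to a value, `get(bin)` reads one bin, `get_all(out)` writes every bin through an output iterator, and `reset()` restores the initial value. The names `SeqMultiReduceSum`, `BinRef`, and `binned_sums_example` are hypothetical and this is not RAJA code. It also uses a plain loop rather than `RAJA::forall`, so it does not model capturing the reducer by value in a lambda.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sequential stand-in for illustration only; this is NOT RAJA code.
template <typename T>
class SeqMultiReduceSum {
public:
  // Proxy returned by operator[]: accumulates with += and converts to T on read.
  class BinRef {
  public:
    explicit BinRef(T& value) : m_value(value) {}
    BinRef& operator+=(const T& rhs) { m_value += rhs; return *this; }
    operator T() const { return m_value; }
  private:
    T& m_value;
  };

  // Every bin starts at init_value; the default T() is the identity of sum.
  explicit SeqMultiReduceSum(std::size_t num_bins, T init_value = T())
    : m_init(init_value), m_bins(num_bins, init_value) {}

  // Index a bin for accumulation or for an implicit read.
  BinRef operator[](std::size_t bin) {
    assert(bin < m_bins.size());
    return BinRef(m_bins[bin]);
  }

  // Read a single bin.
  T get(std::size_t bin) const {
    assert(bin < m_bins.size());
    return m_bins[bin];
  }

  // Copy all bins, in bin order, to an output iterator.
  template <typename OutputIt>
  void get_all(OutputIt out) const {
    std::copy(m_bins.begin(), m_bins.end(), out);
  }

  // Restore every bin to the initial value; the number of bins is unchanged.
  void reset() { std::fill(m_bins.begin(), m_bins.end(), m_init); }

  // Container-like read access, enabling range-based for loops.
  typename std::vector<T>::const_iterator begin() const { return m_bins.begin(); }
  typename std::vector<T>::const_iterator end() const { return m_bins.end(); }

private:
  T m_init;
  std::vector<T> m_bins;
};

// Sums six values into three bins with a plain loop.
std::vector<double> binned_sums_example() {
  const std::vector<std::size_t> bins = {0, 1, 2, 0, 1, 2};
  const std::vector<double> data = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0};

  SeqMultiReduceSum<double> multi_reduce_sum(3);
  for (std::size_t i = 0; i < data.size(); ++i) {
    multi_reduce_sum[bins[i]] += data[i];
  }

  std::vector<double> sums(3);
  multi_reduce_sum.get_all(sums.begin());
  return sums;
}
```

In this sketch the proxy makes both `sums[bin] = multi_reduce_sum[bin]` and `sums[bin] = multi_reduce_sum.get(bin)` work, so the two retrieval styles can coexist. Whether a parallel back end can offer the same proxy cheaply is a separate question that this sequential version does not answer.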

rhornung67 commented 1 month ago

Please move the items about the current state of reduction tests to a separate issue. Then we will address those in a separate PR.

MrBurmark commented 1 month ago

Moved to #1649.

artv3 commented 1 month ago

I think this is great, but I do wonder if maybe the development should happen within the new reducer interface to start motivating folks to transition over.

MrBurmark commented 1 month ago

I'm of two minds on that. There are some simplifications and optimizations that could be applied with the new reducer implementation, but I haven't actually started using the new reduction interface, and I'd like to start using the multi-reducer quickly once it is completed.