chapel-lang / chapel

a Productive Parallel Programming Language
https://chapel-lang.org

performance tuning for partial reductions #9303

Open vasslitvinov opened 6 years ago

vasslitvinov commented 6 years ago

Right now partial reductions are ~4x slower than hand-written code on cg-sparse.chpl for Class A.

This task is to investigate and close this gap as much as possible.

bradcray commented 6 years ago

FWIW, I would probably prioritize the design of how we'll do partial reductions of anonymous expressions like [(i,j) in D] i * 10.0 + j (rather than simply arrays) over performance tuning the current approach. I don't doubt that we can get the performance of the current approach to where we need it, but if we have to change the approach to handle non-array cases, then it doesn't make sense to put too much effort into tuning the current approach just yet.
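
For context, here is a rough sketch of what the non-array case looks like when written by hand today. It is not code from this issue or from cg-sparse.chpl, and it does not use the partial reduction feature under discussion; the names n, m, total, and rowSums are made up for illustration. It assumes a recent Chapel release where D.dim(0) is the first dimension (older releases numbered dimensions from 1).

```chapel
// Illustrative sizes only; both are hypothetical config constants.
config const n = 3, m = 4;
const D = {1..n, 1..m};

// Full reduction of an anonymous forall expression: collapses every
// dimension of D into a single scalar.
const total = + reduce [(i,j) in D] (i * 10.0 + j);

// Hand-written partial reduction: keep the first dimension and reduce
// only over the second, giving one sum per row of D.
// Assumes 0-based dimension indices for dim().
var rowSums: [D.dim(0)] real;
forall i in D.dim(0) do
  rowSums[i] = + reduce [j in D.dim(1)] (i * 10.0 + j);

writeln(total);
writeln(rowSums);
```

The point of the sketch is that no array of i * 10.0 + j values is ever materialized in the hand-written form, which is the case a partial reduction design would need to cover if it is to work on expressions rather than only on arrays.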