thomasrolinger / chapel

a Productive Parallel Programming Language
https://chapel-lang.org
Other
0 stars 1 forks source link

Support multiple aggregators in a single forall #34

Closed thomasrolinger closed 2 years ago

thomasrolinger commented 2 years ago

Right now, we only support a single aggregator in a forall. This makes it easy to undo the optimization if it is deemed to be invalid after normalization. We do what we did for the inspector-executor optimization and just wipe out the entire forall loop.

However, we have a use-case for supporting two aggregators in a single forall, specifically graph construction where we have something like:

    var atomicDegrees : [G.domain] atomic int;
    forall i in edges.domain {
        if edges[i](0) != edges[i](1) {
            atomicDegrees[edges[i](0)].add(1);
            atomicDegrees[edges[i](1)].add(1);
        }
    }

The problem is that we can imagine a case where one aggregated operation is deemed valid but the other is invalid. We can't just wipe out the entire forall, since it would also remove the valid optimization.

Instead of cloning the loop and guarding it with an if-then-else structure, we would generate the following:

    var atomicDegrees : [G.domain] atomic int;
    forall i in edges.domain with (var agg1 = ..., var agg2 = ...) {
        if edges[i](0) != edges[i](1) {
           var staticCheckAgg1 = true;
            if staticCheckAgg1 {
               agg1.bufferOp(atomicDegrees[edges[i](0)], 1);
            }
            else {
               atomicDegrees[edges[i](0)].add(1);
            }
            var staticCheckAgg2 = true;
            if staticCheckAgg2 {
               agg2.bufferOp(atomicDegrees[edges[i](1)], 1);
            }
            else {
               atomicDegrees[edges[i](1)].add(1);
            }
        }
    }

During prefolding, we will know whether we can do aggregation in each case. If we find out we cannot for, say agg1, then we will do what we have done in the past: set staticCheckAgg1 to false and let dead-code elimination remove the then-branch. We would also want to go and remove agg1 all-together. This shouldn't be a big task.

Things to consider later

thomasrolinger commented 2 years ago

Handle multiple aggregators in a forall

So far, I can correctly compile the graph construction example above. It took some effort to get the right "entry point" to add the if-then-else structure. The code produces the correct answer and the performance is as expected.

The compiler optimization also still works for the other applications which did not have this structure.

The next thing to do is ensure that we clean up properly when we find an invalid optimization. Right now, we still keep the aggregator around but we should remove it.

thomasrolinger commented 2 years ago

Added some new tests for invalid aggregation. It seems that when we are attempting to create the aggregators, we need to same checks that we use in chpl__isIrregularWriteAggregatable, because if we were trying to do the += operator and didn't have an array of associative arrays, our method for getting the srcType will generate a compiler error.

So the question is whether we even need to create the isAggregatable VarSymbol in the compiler if we will be doing the same checks again. Basically, if our aggregator is nil, then it means we didn't pass the checks. I suppose having the separate checks could be useful for more verbose printing of why we didn't aggregate. The same goes for chpl__isDistributed; we could do that check when we create the aggregators.

For now, I'll keep it in since it doesn't really add a lot of extra code or complexity.

We can remove the aggregator and that seems to work fine.

There was a new commit to address the issue with atomics on the Chapel side: https://github.com/thomasrolinger/chapel/commit/da74a99924d531cc07a1b49582a4de85640c70d7