thomasrolinger / chapel

a Productive Parallel Programming Language
https://chapel-lang.org
Other
0 stars 1 forks source link

Task: Handle invalid optimization at a call site #20

Closed thomasrolinger closed 2 years ago

thomasrolinger commented 2 years ago

We support optimizing a forall in a function that has multiple call sites. However, we do not currently treat each call site separately when looking for things that invalidate the optimization.

For example:

proc func(A, B, C)
{
  forall i in B.domain {
    C[i] += A[B[i]];
  }
}

for it in 0..#4 {
  B2[0] = 1;
  func(A,B1,C);
}
for it in 0..#5 {
  B1[0] = 1;
  func(A, B2, C);
}

In the above example, our current optimization sees that the formal B in func() can be either B1 or B2, so it tracks changes to both. However, when it looks at whether the writes happen in valid locations, it starts from the perspective of the forall in func() and does not take into consideration which call site it searches from. In the end, it'll first see the write to B2 before the first call site and match it with one of the symbols it was tracking. It will then declare the optimization as invalid. However, it should be valid at this call site since B2 is not used in the function here.

One approach to fix this is to use function cloning, like we did before. It would create two versions of func() that are identical, but each is only called from one location. Essentially, we have two optimizations. But this wouldn’t address all cases, as we could have one call site in a function and then multiple call sites of that enclosing function. This is partly why associating a flag with the array itself was beneficial.

I'll have to think about whether it is worth trying to address. At the moment, there are no use cases that stress this. In fact, our optimization will be aggressive and invalidate itself, rather than produce an incorrect result. So at least that is OK.

thomasrolinger commented 2 years ago

Closing this for now, since I don't see any reasonable solution, and what we have now will not produce an incorrect program (it's still a bit aggressive and won't optimize in a case where it could theoretically still do the optimization).