Closed by thomasrolinger 2 years ago
This has been addressed: https://github.com/thomasrolinger/chapel/commit/75018d5a21e11e7ebdf730b99df4c8f309a641c4
To make this fully work and integrate it into the optimization, we also had to handle invalidating a particular call site and redirecting it to a cloned version of the function that contains the `forall`, for the case where the `forall` is inside a function.
We leverage the same code we had before for cloning call sites, but changed it slightly. Before we do our normal pass over all the `forall`s looking for candidates, we do a "pre-pass". For each `forall`, we check whether it is in a function. If it is, we look at each call site of that function. If a call site is nested in one of the structures mentioned in this issue (`forall`, `coforall`, `cobegin`, `begin`), we add it to a vector. We also check here that the call site is nested in a `for` loop, or that the function itself has the `forall` nested in a `for` loop. If neither is the case, the call site is invalid.
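The pre-pass classification described above can be sketched roughly as follows. This is a toy model, not the code from the commit: `Node`, `Kind`, and every function name here are hypothetical stand-ins for the compiler's IR, and the sketch assumes one reading of the rule, namely that a call site is invalid if it is nested in a parallel construct, or if neither the call site nor the callee's `forall` is nested in a `for` loop.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified IR; these are not the Chapel compiler's real types.
enum class Kind { Function, Forall, Coforall, Cobegin, Begin, For, Call };

struct Node {
  Kind kind;
  const Node* parent;  // enclosing construct; nullptr at the top level
  explicit Node(Kind k, const Node* p = nullptr) : kind(k), parent(p) {}
};

// Walks outward from n, stopping at the enclosing function, and reports
// whether any enclosing construct satisfies pred.
template <typename Pred>
bool hasEnclosing(const Node* n, Pred pred) {
  for (const Node* p = n->parent; p != nullptr && p->kind != Kind::Function;
       p = p->parent) {
    if (pred(p->kind)) return true;
  }
  return false;
}

bool inParallelConstruct(const Node* n) {
  return hasEnclosing(n, [](Kind k) {
    return k == Kind::Forall || k == Kind::Coforall ||
           k == Kind::Cobegin || k == Kind::Begin;
  });
}

bool inForLoop(const Node* n) {
  return hasEnclosing(n, [](Kind k) { return k == Kind::For; });
}

// Assumed rule: invalid if the call is inside a parallel construct, or if
// neither the call nor the callee's forall sits inside a for loop.
bool isInvalidCallSite(const Node* call, const Node* calleeForall) {
  if (inParallelConstruct(call)) return true;
  return !inForLoop(call) && !inForLoop(calleeForall);
}

// Collects the invalid call sites of one function into a vector.
std::vector<const Node*> collectInvalidCallSites(
    const std::vector<const Node*>& callSites, const Node* calleeForall) {
  std::vector<const Node*> invalid;
  for (const Node* call : callSites) {
    if (isInvalidCallSite(call, calleeForall)) invalid.push_back(call);
  }
  return invalid;
}
```

Under these assumptions, a call inside a `coforall` is flagged even when a `for` loop encloses it, while a call directly inside a `for` loop is kept.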
We then map this vector of invalid call sites to the function that contains the original `forall` we were analyzing. We ensure that the `forall` would be optimized (i.e., it has candidate accesses). If it does, then we create a clone of the function and update all the call sites we found to call the clone instead.
Now, back to our normal pass, we ignore any `forall` that is within a cloned function. For any other `forall`, we do another check that it is not enclosed in one of the relevant structures addressed in this issue. This check does not go across call sites, so it handles what we did not check when doing the pre-pass function cloning.
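The filter applied in the normal pass might look something like the sketch below. Again this is a toy model with hypothetical names (`Node`, `Kind`, `shouldAnalyze`): it skips a `forall` whose enclosing function is a clone, and otherwise rejects a `forall` that is lexically enclosed in a parallel construct within the same function, without looking across call sites.

```cpp
#include <cassert>

// Hypothetical, simplified IR; these are not the Chapel compiler's real types.
enum class Kind { Function, Forall, Coforall, Cobegin, Begin, For };

struct Node {
  Kind kind;
  const Node* parent;  // enclosing construct; nullptr at the top level
  bool isClone;        // only meaningful for Function nodes
  explicit Node(Kind k, const Node* p = nullptr, bool clone = false)
      : kind(k), parent(p), isClone(clone) {}
};

// Returns the function that lexically contains n, or nullptr if there is none.
const Node* enclosingFunction(const Node* n) {
  for (const Node* p = n->parent; p != nullptr; p = p->parent) {
    if (p->kind == Kind::Function) return p;
  }
  return nullptr;
}

// Checks only the constructs between n and its enclosing function, so the
// check never crosses a call site.
bool enclosedInParallelConstruct(const Node* n) {
  for (const Node* p = n->parent; p != nullptr && p->kind != Kind::Function;
       p = p->parent) {
    if (p->kind == Kind::Forall || p->kind == Kind::Coforall ||
        p->kind == Kind::Cobegin || p->kind == Kind::Begin) {
      return true;
    }
  }
  return false;
}

// Decides whether the normal pass should consider this forall at all.
bool shouldAnalyze(const Node* forallNode) {
  const Node* fn = enclosingFunction(forallNode);
  if (fn != nullptr && fn->isClone) return false;  // clones stay unoptimized
  return !enclosedInParallelConstruct(forallNode);
}
```

Together with the pre-pass, this covers both cases: nesting that is visible only through a call site is handled by cloning, and nesting inside the same function is handled here.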
We added two tests to the `IE_SMALL_TESTS` suite for these features: `ie_test_someInvalidCallSites.chpl` and `ie_test_invalidedNestedStructures.chpl`. However, both produce non-deterministic output, given the nature of the tests. Since we can't validate their output, we put them in a special directory so they are not run with all the others.
We need to ensure that multiple tasks are not executing the same `forall`. Otherwise the replicated arrays could be corrupted, even though the original array is read-only: two tasks could try to update the same replicated arrays and overwrite each other's updates.
This should be fairly easy to check for.