Closed thomasrolinger closed 2 years ago
This has been addressed: https://github.com/thomasrolinger/chapel/commit/c8fc84a4dc2ef6bf222a7bce8f0a2e7217070817
It went more or less like discussed above. When we check for whether the optimization is valid during pre-folding, we added an additional check if A
has record elements. We look specifically for either ref t = A[B[i]]
or var t = A[B[i]]
. From that, we can get the Symbol for t
and look at its SymExprs. We look for field accesses, which use the _mt
method token. From that, we can get the field being accessed and determine whether it is a POD or not. If it is not a POD, we invalidate the optimization. If it is a POD, we add its field location/position to a global vector.
When we create the call to executePreamble
, we check to see if A
has record elements. If it does, we create the following call: executePreambleRecords(A, replArrays, fieldPos, fieldPos2, ...)
. That version of executePreamble
takes in a variable number of arguments for the field positions. We then only do aggregation/updates on those fields.
We also support this detection when t
is used within a function that is called within the forall.
Tests added for this feature:
ie_test_A_recordsRef.chpl
ie_test_A_recordsVar.chpl
ie_test_A_recordsVarInFunc.chpl
ie_test_A_recordsNonPODAccess.chpl
ie_test_A_recordsNonPODAccessInFunc.chpl
When
A
is an array of records, we have some code to handle how we do the updates before each executor. Since we rely on aggregation for these updates, we need to ignore all non-POD fields in the record. This can be quite slow, as we will be copying over fields that we may never need to access.For a performance view, we can do better. We could determine which fields of a record are accessed within the
forall
loop and only update/aggregate on those fields. I did some experimenting a while ago with filling out an array of field locations and iterating over those to only copy over ones we matched on. So the compiler would need to generate this array.This can get complicated though. If the
forall
loop has a function call that takes in the result ofA[B[i]]
as an argument, then we wouldn't know which fields are accessed without doing inter-procedural analysis. That hasn't stopped us before though.The biggest complication is recognizing the fields being accessed, and ensuring that they are tied to the element from
A
. My initial guess would be something like this, which happens during prefolding:A
's elements are records. We do this during prefolding. If they are not records, we don't bother with the rest of these steps.PRIM_MOVE
calls where the RHS is theA[B[i]]
call.PRIM_MOVE
will be something likecall_tmp
most likely. So we'll want to look at cases where that is used within a field accessor method. From there, we should be able to jot down the name of the field.call_tmp
is given as an argument to a function call, we'll need to recursively check it within that function.But beyond performance concerns, this also could be a correctness concern. As of now, and even with the approach mentioned above, we ignore non-POD fields. If the
forall
attempts to access those non-POD fields fromA[B[i]]
, then we should not attempt the optimization. Since we will not copy them over, we will be in trouble if we do the optimization and they are accessed.