thomasrolinger / chapel

a Productive Parallel Programming Language
https://chapel-lang.org
Other
0 stars 0 forks source link

(6) Task: Determine which fields of a record are accessed in the forall #7

Closed thomasrolinger closed 2 years ago

thomasrolinger commented 2 years ago

When A is an array of records, we have some code to handle how we do the updates before each executor. Since we rely on aggregation for these updates, we need to ignore all non-POD fields in the record. This can be quite slow, as we will be copying over fields that we may never need to access.

For a performance view, we can do better. We could determine which fields of a record are accessed within the forall loop and only update/aggregate on those fields. I did some experimenting a while ago with filling out an array of field locations and iterating over those to only copy over ones we matched on. So the compiler would need to generate this array.

This can get complicated though. If the forall loop has a function call that takes in the result of A[B[i]] as an argument, then we wouldn't know which fields are accessed without doing inter-procedural analysis. That hasn't stopped us before though.

The biggest complication is recognizing the fields being accessed, and ensuring that they are tied to the element from A. My initial guess would be something like this, which happens during prefolding:

But beyond performance concerns, this also could be a correctness concern. As of now, and even with the approach mentioned above, we ignore non-POD fields. If the forall attempts to access those non-POD fields from A[B[i]], then we should not attempt the optimization. Since we will not copy them over, we will be in trouble if we do the optimization and they are accessed.

thomasrolinger commented 2 years ago

This has been addressed: https://github.com/thomasrolinger/chapel/commit/c8fc84a4dc2ef6bf222a7bce8f0a2e7217070817

It went more or less like discussed above. When we check for whether the optimization is valid during pre-folding, we added an additional check if A has record elements. We look specifically for either ref t = A[B[i]] or var t = A[B[i]]. From that, we can get the Symbol for t and look at its SymExprs. We look for field accesses, which use the _mt method token. From that, we can get the field being accessed and determine whether it is a POD or not. If it is not a POD, we invalidate the optimization. If it is a POD, we add its field location/position to a global vector.

When we create the call to executePreamble, we check to see if A has record elements. If it does, we create the following call: executePreambleRecords(A, replArrays, fieldPos, fieldPos2, ...). That version of executePreamble takes in a variable number of arguments for the field positions. We then only do aggregation/updates on those fields.

We also support this detection when t is used within a function that is called within the forall.

Tests added for this feature: