Open awnawab opened 1 month ago
Documentation for this branch can be viewed at https://sites.ecmwf.int/docs/loki/329/index.html
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 95.19%. Comparing base (d5a8e6c) to head (4974f6b).
Lukas M. demonstrated that for accumulation patterns of the type:
a compiler cannot rule out the possibility that `n1` and `n2` alias the same location. In such a case, it is unable to run the loads and stores out-of-order.

This PR contributes a pragma-assisted transformation to split the above into separate reads and writes, thereby removing the dependency between subsequent loads and stores and allowing the compiler to optimise more effectively.
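
As a rough illustration of the idea (not the code from this PR), the sketch below models the two forms in plain Python. It assumes `n1` and `n2` are indices into a single array `a`; the function names and the values `x` and `y` are hypothetical. The split form only matches the original when the indices are distinct, which is presumably why the transformation is opt-in via a pragma rather than applied automatically.

```python
def accumulate_fused(a, n1, n2, x, y):
    """Original form: each statement loads and stores the same element."""
    a[n1] = a[n1] + x
    a[n2] = a[n2] + y  # if n1 == n2, this load depends on the store above
    return a


def accumulate_split(a, n1, n2, x, y):
    """Split form: all loads happen first, then all stores."""
    t1 = a[n1] + x
    t2 = a[n2] + y
    a[n1] = t1
    a[n2] = t2
    return a


# Distinct indices: both forms give the same result.
distinct_fused = accumulate_fused([1.0, 2.0, 3.0], 0, 2, 10.0, 20.0)
distinct_split = accumulate_split([1.0, 2.0, 3.0], 0, 2, 10.0, 20.0)

# Aliased indices: the split form drops the first update.
aliased_fused = accumulate_fused([1.0, 2.0, 3.0], 1, 1, 10.0, 20.0)
aliased_split = accumulate_split([1.0, 2.0, 3.0], 1, 1, 10.0, 20.0)
```

With distinct indices the two functions agree, so reordering the loads and stores is safe. With `n1 == n2` they diverge, which is the case a compiler has to assume is possible unless told otherwise.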