devitocodes / devito

DSL and compiler framework for automated finite-differences and stencil computation
http://www.devitoproject.org
MIT License
557 stars 226 forks source link

compiler: Improve IndexDerivative lowering #2288

Closed FabioLuporini closed 9 months ago

FabioLuporini commented 9 months ago

Tiny enhancement to avoid useless stupid temporary variables in the generated code

FabioLuporini commented 9 months ago

"Fabio, have you thought about just performing a round of cse() instead"

"Yes I have, unfortunately it's not that easy because CSE is a Cluster-local pass, which doesn't work well here"

codecov[bot] commented 9 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (c888cee) 86.76% compared to head (df4e53d) 86.76%.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #2288 +/- ## ======================================= Coverage 86.76% 86.76% ======================================= Files 229 229 Lines 42884 42893 +9 Branches 7951 7953 +2 ======================================= + Hits 37207 37216 +9 Misses 5002 5002 Partials 675 675 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

georgebisbas commented 9 months ago

"Fabio, have you thought about just performing a round of cse() instead"

"Yes I have, unfortunately it's not that easy because CSE is a Cluster-local pass, which doesn't work well here"

Why does it not work well? Not sure I understand TBH

FabioLuporini commented 9 months ago

Why does it not work well? Not sure I understand TBH

lower_index_derivative might produce:

Cluster([
  r1 = 0
  r1 += <lowered index derivative>
  r0 = r1
])

CSE would be able to see that r0 is unnecessary here and safely removable. But r0 is needed in later Clusters, so it would produce wrong code ultimately

Note that this is just a simplified example

You would need to understand lower_index_derivatives fully to appreciate all the different flavors of this issue

georgebisbas commented 9 months ago

Why does it not work well? Not sure I understand TBH

lower_index_derivative might produce:

Cluster([
  r1 = 0
  r1 += <lowered index derivative>
  r0 = r1
])

CSE would be able to see that r0 is unnecessary here and safely removable. But r0 is needed in later Clusters, so it would produce wrong code ultimately

Note that this is just a simplified example

You would need to understand lower_index_derivatives fully to appreciate all the different flavors of this issue

Thanks