arporter / habakkuk

Fortran code analysis for performance prediction

Support loop-unrolling for kernels containing indirect array accesses #5

Closed arporter closed 7 years ago

arporter commented 8 years ago

Although the parser code in parse2003.py recognises indirect array accesses, it currently flattens such expressions into strings, e.g. my_array(map(i)+1) results in an array index stored as "map(i)+1". If i is the loop variable and we wish to unroll the loop, this is going to cause problems, because the loop variable is buried inside that string and cannot easily be incremented for each unrolled copy of the loop body.
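To illustrate why a structured index helps with unrolling, here is a minimal sketch. It is not Habakkuk's code: it borrows Python's own `ast` module as a stand-in parser (the index "map(i)+1" happens to be valid Python expression syntax), and the names `ShiftLoopVar` and `unroll_index` are hypothetical. It needs Python 3.9 or later for `ast.unparse`.

```python
import ast


class ShiftLoopVar(ast.NodeTransformer):
    """Replace every use of the loop variable with (loop variable + offset)."""

    def __init__(self, name, offset):
        self.name = name
        self.offset = offset

    def visit_Name(self, node):
        # Only whole identifiers are matched, so "i" inside "idx" is untouched.
        if node.id == self.name and self.offset != 0:
            return ast.BinOp(left=node, op=ast.Add(),
                             right=ast.Constant(self.offset))
        return node


def unroll_index(index_expr, loop_var, offset):
    """Return index_expr as it would appear `offset` iterations later."""
    tree = ast.parse(index_expr, mode="eval")
    tree = ShiftLoopVar(loop_var, offset).visit(tree)
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


if __name__ == "__main__":
    # The second unrolled copy of my_array(map(i)+1) needs map(i+1)+1.
    print(unroll_index("map(i)+1", "i", 1))
```

Because the substitution works on identifier nodes rather than characters, an index such as "idx(i)" is handled correctly, whereas a plain string replace of "i" would also corrupt the name idx.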

This issue can possibly be thought of as identifying contiguous and non-contiguous array accesses for the purposes of memory-bandwidth usage and potential SIMD vectorisation.
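One way to picture that classification is the sketch below. It is an assumption-laden illustration, not part of Habakkuk: the function name `classify_access` and its three labels are hypothetical, and Python's `ast` module again stands in for a Fortran parser (so a Fortran array reference such as map(i) is seen as a call). An access is labelled "indirect" when the loop variable appears inside the argument of another array reference.

```python
import ast


def _uses(node, name):
    """True if identifier `name` appears anywhere beneath `node`."""
    return any(isinstance(n, ast.Name) and n.id == name
               for n in ast.walk(node))


def classify_access(first_index, loop_var):
    """Classify the first index of an array access w.r.t. one loop variable.

    Returns "invariant" (index does not depend on the loop variable),
    "indirect" (loop variable is used inside a nested reference such as
    map(i)) or "direct" (loop variable is used only outside such references).
    """
    tree = ast.parse(first_index, mode="eval")
    if not _uses(tree, loop_var):
        return "invariant"
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            if any(_uses(arg, loop_var) for arg in node.args):
                return "indirect"
    return "direct"


if __name__ == "__main__":
    for expr, var in [("i+1", "i"), ("map(i)+1", "i"), ("map(i)+1", "j")]:
        print(expr, var, classify_access(expr, var))
```

Note that "direct" is only a necessary condition for contiguity: an index like 2*i is direct but strided, so a fuller treatment would also have to inspect the stride.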

arporter commented 7 years ago

Created new branch 'indirect_array_accesses' for work on this issue. Took example Fortran from the pert_pressure_gradient kernel and used it to make a new test. Example code is:

do k = 0, nlayers-1
   do df = 1, ndf_w3
      rho_e(df)     = rho    ( map_w3(df) + k )
      rho_ref_e(df) = rho_ref( map_w3(df) + k )
   end do
end do

Habakkuk should find 4 array accesses requiring 4 distinct cache lines but instead finds just 2.

arporter commented 7 years ago

The issue boils down to dag.cache_lines(), which assumes that accesses to an array that differ only in the first index will all be to the same cache line. This of course breaks down as soon as the expression for the first index is not simply something like "i+1". We need to be able to distinguish between e.g. my_array(my_map(i)+1,j) and my_array(my_map(i+1),j).
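The distinction could be sketched as follows. This is not the implementation of dag.cache_lines(); it is a hypothetical illustration in which `index_base` and `count_cache_lines` are invented names and Python's `ast` module stands in for the Fortran parser. It keeps the original assumption that a small constant offset on the first index stays within one cache line, but only strips such an offset at the top level of the expression, so an offset inside the map lookup yields a different key.

```python
import ast


def index_base(index_expr):
    """Return the index expression with top-level constant offsets removed.

    "my_map(i)+1" -> "my_map(i)", but "my_map(i+1)" is left unchanged
    because its offset is inside the indirection.
    """
    node = ast.parse(index_expr, mode="eval").body
    while (isinstance(node, ast.BinOp)
           and isinstance(node.op, (ast.Add, ast.Sub))
           and isinstance(node.right, ast.Constant)):
        node = node.left
    return ast.unparse(node)


def count_cache_lines(accesses):
    """Count distinct (array, base of first index, other indices) keys.

    `accesses` is an iterable of (array_name, [index expressions]).
    """
    keys = set()
    for name, indices in accesses:
        first, rest = indices[0], indices[1:]
        # Strip spaces so "j" and " j" compare equal.
        others = tuple(r.replace(" ", "") for r in rest)
        keys.add((name, index_base(first), others))
    return len(keys)


if __name__ == "__main__":
    same_line = [("my_array", ["i", "j"]), ("my_array", ["i+1", "j"])]
    different = [("my_array", ["my_map(i)+1", "j"]),
                 ("my_array", ["my_map(i+1)", "j"])]
    print(count_cache_lines(same_line), count_cache_lines(different))
```

Under these assumptions the two accesses my_array(i,j) and my_array(i+1,j) share one key, while my_array(my_map(i)+1,j) and my_array(my_map(i+1),j) are counted separately. The sketch only recognises a constant on the right-hand side of + or -, so a form such as 1+i would need extra handling.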

arporter commented 7 years ago

All tests now pass (or xfail). Will check coverage next...

arporter commented 7 years ago

This issue was subsumed into #15 because processing NEMO code required the support in this issue.