RRZE-HPC / kerncraft

Loop Kernel Analysis and Performance Modeling Toolkit
GNU Affero General Public License v3.0

Arrays not swapped between sweeps in compilable code #115

Closed rrzeschorscherl closed 5 years ago

rrzeschorscherl commented 5 years ago

Version 0.7.3

With stencil codes, the source and target arrays should be swapped after each sweep to avoid unwanted cache effects and to better mimic "real" code behavior. Kerncraft's compilable code does not do this. I understand that this is difficult to achieve in the general case, because it may not be evident which arrays are supposed to be swapped if there is more than one source and one target array. I therefore suggest adding some kind of macro support so that the permutation of arrays after each sweep can be configured. Something like:

for(int j=1; j<M-1; ++j)
    for(int i=1; i<N-1; ++i)
        b[j][i] = ( a[j][i-1] + a[j][i+1]
                  + a[j-1][i] + a[j+1][i]) * s;
//PERM(a,b)(b,a)

The comment could be converted into a sequence of statements that carry out the desired permutation. This would not break existing benchmarks, and no automated mechanism would be needed to figure out the permutation.
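To make the proposal concrete, here is a minimal sketch of what a `PERM(a,b)(b,a)` annotation could expand to when the arrays are accessed through pointers. This is not kerncraft's generated code: the names `row_t`, `sweep` and `run_sweeps`, as well as the grid sizes, are assumptions chosen for illustration only.

```c
#include <assert.h>

enum { M = 6, N = 8 };      /* small illustrative grid sizes (assumed) */
typedef double row_t[N];    /* one grid row; hypothetical helper type */

/* One 2D Jacobi sweep: read from src, write the interior points of dst. */
static void sweep(row_t *dst, row_t *src, double s)
{
    for (int j = 1; j < M - 1; ++j)
        for (int i = 1; i < N - 1; ++i)
            dst[j][i] = (src[j][i-1] + src[j][i+1]
                       + src[j-1][i] + src[j+1][i]) * s;
}

/* Run `sweeps` sweeps, exchanging the roles of a and b after each one.
   Returns the array that was written last (or a itself if sweeps == 0). */
static row_t *run_sweeps(row_t *a, row_t *b, double s, int sweeps)
{
    assert(sweeps >= 0);
    for (int n = 0; n < sweeps; ++n) {
        sweep(b, a, s);
        /* Assumed expansion of PERM(a,b)(b,a): a plain pointer swap,
           so the next sweep reads what this sweep just wrote. */
        row_t *tmp = a;
        a = b;
        b = tmp;
    }
    return a;
}
```

Because only pointers are exchanged, no grid data is copied, and each sweep reads the data the previous sweep produced, as a real iterative solver would.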

cod3monk commented 5 years ago

Would it be okay if I always swap all arrays of the same type and dimensions in a round-robin fashion? Otherwise, I could allow assignment statements after the loop nest. Comments are not an option, because they get eliminated early on and don't make it into the parser.
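For illustration, one possible reading of "round-robin" for several arrays of identical type and shape is a rotation of their base pointers by one position after each sweep. The sketch below is an assumption about what that could look like, not code taken from kerncraft; `rotate_round_robin` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Rotate a list of array base pointers by one position:
   slot 0 takes the pointer from slot 1, slot 1 from slot 2, and so on;
   the last slot receives the pointer that was in slot 0. */
static void rotate_round_robin(double *arrays[], size_t count)
{
    assert(arrays != NULL);
    if (count < 2)
        return;                      /* nothing to rotate */
    double *first = arrays[0];
    for (size_t k = 0; k + 1 < count; ++k)
        arrays[k] = arrays[k + 1];
    arrays[count - 1] = first;
}
```

With exactly two arrays this reduces to an ordinary swap. With three or more, the rotation may not match the data flow the kernel author intended, which is the ambiguity raised in the original report.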

cod3monk commented 5 years ago

Kerncraft now supports calls to the swap function at the end of the kernel file. The swap function is defined in headers/kerncraft.h.

For example:

double a[M][N][N];
double b[M][N][N];
double s;

for(int k=1; k<M-1; ++k)
    for(int j=1; j<N-1; ++j)
        for(int i=1; i<N-1; ++i)
            b[k][j][i] = ( a[k][j][i-1] + a[k][j][i+1]
                         + a[k][j-1][i] + a[k][j+1][i]
                         + a[k-1][j][i] + a[k+1][j][i]) * s;

swap(a, b);