devitocodes / devito

DSL and compiler framework for automated finite-differences and stencil computation
http://www.devitoproject.org
MIT License

linalg.py failing with PGI openacc #1528

Open georgebisbas opened 3 years ago

georgebisbas commented 3 years ago

Fails on the reduction clause. To reproduce with PGI + OpenACC:

export DEVITO_LANGUAGE=openacc
export DEVITO_PLATFORM=nvidiaX
export DEVITO_ARCH=pgcc
export DEVITO_LOGGING=DEBUG  # optional

python3 misc/linalg.py mat-vec

Some discussion: https://devitocodes.slack.com/archives/CQ0AT90R0/p1607003860367600

FabioLuporini commented 3 years ago

@Leitevmd have you guys ever encountered this issue? It could be relevant to you.

georgebisbas commented 9 months ago

After merging https://github.com/devitocodes/devito/pull/2226 the generated code for the above MFE is:

  #pragma acc parallel loop present(A,b,x)
  for (int i = i_m; i <= i_M; i += 1)
  {
    for (int j = j_m; j <= j_M; j += 1)
    {
      b[i] += x[j]*A[i][j];
    }
  }

instead of

  START_TIMER(section0)
  #pragma acc parallel loop collapse(2) reduction(+:b[0:b_vec->size[0]]) present(A,b,x)
  for (int i = i_m; i <= i_M; i += 1)
  {
    for (int j = j_m; j <= j_M; j += 1)
    {
      b[i] += x[j]*A[i][j];
    }
  }
  STOP_TIMER(section0,timers)

georgebisbas commented 9 months ago

The generated code above works but is probably not optimal: it parallelizes only the outer loop, with no collapse(2) and no reduction clause. Should we close this issue, or rename it to track adding a reduction sub-pass?
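
For reference, one common OpenACC pattern for a mat-vec is to reduce the inner loop into a scalar, which avoids the array reduction that failed with PGI. The following is only a minimal sketch of that pattern, not what Devito generates or what a reduction sub-pass would necessarily emit. The function name `matvec`, the fixed size `N`, and the data clauses are assumptions made for illustration. Without an OpenACC compiler the pragmas are ignored and the code runs serially.

```c
#define N 4  /* hypothetical problem size, for illustration only */

/* Accumulates A*x into b, as in the generated code (b[i] += ...).
 * Rows are independent, so the outer loop needs no reduction.
 * Only the inner loop reduces, and it reduces into a scalar. */
void matvec(float A[N][N], float x[N], float b[N])
{
  #pragma acc parallel loop gang copyin(A[0:N][0:N], x[0:N]) copy(b[0:N])
  for (int i = 0; i < N; i += 1)
  {
    float sum = 0.0f;  /* private per-row accumulator */

    #pragma acc loop vector reduction(+:sum)
    for (int j = 0; j < N; j += 1)
    {
      sum += x[j]*A[i][j];
    }

    b[i] += sum;  /* single write per row, so no race on b */
  }
}
```

Calling `matvec` with `A[i][j] = i + j`, `x` set to all ones, and `b` zeroed should leave `b` equal to `{6, 10, 14, 18}`. Whether this form actually outperforms the current output would need measuring on the target GPU.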