Open georgebisbas opened 3 years ago
@Leitevmd have you guys ever encountered this issue? This could potentially be relevant to you.
After merging https://github.com/devitocodes/devito/pull/2226 the generated code for the above MFE is:
#pragma acc parallel loop present(A,b,x)
for (int i = i_m; i <= i_M; i += 1)
{
  for (int j = j_m; j <= j_M; j += 1)
  {
    b[i] += x[j]*A[i][j];
  }
}
instead of
START_TIMER(section0)
#pragma acc parallel loop collapse(2) reduction(+:b[0:b_vec->size[0]]) present(A,b,x)
for (int i = i_m; i <= i_M; i += 1)
{
  for (int j = j_m; j <= j_M; j += 1)
  {
    b[i] += x[j]*A[i][j];
  }
}
STOP_TIMER(section0,timers)
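The generated code above works but is probably not optimal: only the outer `i` loop is parallelized, with no `collapse(2)` and no reduction clause, so each `b[i]` is accumulated sequentially over `j`. Should we close this issue, or rename it to track improving this with a reduction sub-pass?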
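For reference, here is a minimal sketch of one possible middle ground: keep the outer loop parallel and reduce over `j` into a per-row scalar, which avoids the array reduction on `b` altogether. This is not what Devito generates; the function name `matvec_acc`, the flattened row-major layout of `A`, and the explicit `copyin`/`copy` data clauses are all assumptions made only to keep the example self-contained. Scalar reductions are long-standing OpenACC, whereas array reductions like the one in the failing pragma were, as far as I know, added only in a later revision of the spec, which may be why some compilers reject them.

```c
#include <assert.h>

/* Hypothetical helper, not Devito output: b[i] += sum_j A[i*m + j] * x[j].
 * A is assumed to be stored flat in row-major order with n rows, m columns. */
void matvec_acc(int n, int m, const float *A, const float *x, float *b)
{
    assert(n > 0 && m > 0);

    /* Each iteration of i writes a distinct b[i], so no reduction on b is needed. */
    #pragma acc parallel loop copyin(A[0:n*m], x[0:m]) copy(b[0:n])
    for (int i = 0; i < n; i++) {
        float sum = 0.0f;

        /* Scalar reduction over j instead of an array reduction on b. */
        #pragma acc loop reduction(+:sum)
        for (int j = 0; j < m; j++) {
            sum += A[i*m + j] * x[j];
        }
        b[i] += sum;
    }
}
```

Without an OpenACC-enabled compiler the pragmas are ignored and the function runs serially, so the result can be checked on the host first. Whether this beats the outer-loop-only version would need measuring on the target.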
Failing on the reduction clause. To reproduce (pgi+openacc):
python3 misc/linalg.py mat-vec
Some discussion: https://devitocodes.slack.com/archives/CQ0AT90R0/p1607003860367600