I am trying to get an OpenMP for loop with a reduction to compile and run. Based on the documentation and examples, I would have thought something like:
#pragma omp parallel for reduction(+:y) ADOLC_OPENMP
for ( i = 0 ; i < n ; i++ ) {
    y += x[i]*x[i];
    y += cos(x[i]);
}
would work, but the g++ compiler complains:
error: user defined reduction not found for ‘y’
So it is not figuring out how to reduce y.
Playing around with things, this version:
#pragma omp ADOLC_OPENMP parallel for reduction(+:y)
for ( i = 0 ; i < n ; i++ ) {
    y += x[i]*x[i];
    y += cos(x[i]);
}
does compile and run (correctly it seems), but is only using a single processor.
If I just ignore the reduction (yes, it will be wrong) and go back to the original syntax:
#pragma omp parallel for ADOLC_OPENMP
for ( i = 0 ; i < n ; i++ ) {
    y += x[i]*x[i];
    y += cos(x[i]);
}
it compiles (without the "user defined reduction not found for ‘y’" error) and it does use all the threads given to it, but of course it gives the incorrect answer.
Is ADOL-C forcing the reduction down to 1 thread? And am I employing the syntax correctly?
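For reference, g++ reports "user defined reduction not found" when the variable in a reduction clause has a class type and no matching declare reduction directive is in scope. Below is a minimal sketch of that mechanism in plain OpenMP, assuming y is a class type. The Dual struct and the function name sum_squares_plus_cos are hypothetical stand-ins and are not ADOL-C's adouble or any part of its API. Whether the same approach is safe with ADOL-C's taping is not something this sketch establishes.

```cpp
#include <cmath>
#include <vector>

// Hypothetical stand-in for a class-type scalar (NOT ADOL-C's adouble).
struct Dual {
    double value = 0.0;
    Dual& operator+=(const Dual& other) { value += other.value; return *this; }
    Dual& operator+=(double v) { value += v; return *this; }
};

// Tell OpenMP how to combine two partial results of type Dual.
// omp_out and omp_in are the names OpenMP defines for the combiner;
// omp_priv is each thread's private copy, started at zero here.
#pragma omp declare reduction(+ : Dual : omp_out += omp_in) \
    initializer(omp_priv = Dual())

// Same loop body as in the question, with x passed as a plain vector.
Dual sum_squares_plus_cos(const std::vector<double>& x) {
    Dual y;
    const long n = static_cast<long>(x.size());
    #pragma omp parallel for reduction(+ : y)
    for (long i = 0; i < n; i++) {
        y += x[i] * x[i];
        y += std::cos(x[i]);
    }
    return y;
}
```

Compiled with g++ -fopenmp, the loop is split across threads and the per-thread Dual values are merged with the declared combiner. Without -fopenmp the pragmas are ignored and the function runs serially with the same result up to floating-point rounding.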