Open davidbolvansky opened 4 years ago
In OpenMP terminology, this is called conditional lastprivate. It is best to think of it separately from conditional reduction, although I admit that our initial proposal to OpenMP for conditional lastprivate included an "assignment reduction" perspective.
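For readers unfamiliar with the term, here is a minimal sketch of how such a loop can be annotated using the `conditional` modifier of the `lastprivate` clause (OpenMP 5.0). The function name `last_match` and its parameters are made up for illustration and do not come from the bug report. Compiler support for the modifier varies, so treat this as a sketch rather than a recommended fix.

```c
#include <stddef.h>

/* Hypothetical helper: returns b[i] for the sequentially last i with
   a[i] > t, or `init` if no element qualifies.
   If OpenMP is not enabled, the pragma is ignored and the loop simply
   runs serially, which yields the same result. */
int last_match(const int *a, const int *b, size_t n, int t, int init)
{
    int res = init;
    /* res is only written under a condition, never read, inside the loop. */
    #pragma omp simd lastprivate(conditional: res)
    for (size_t i = 0; i < n; i++) {
        if (a[i] > t)
            res = b[i];
    }
    return res;
}
```

Note that `res` is not read inside the loop here; reading it there is the harder case.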
It took us a while to implement this correctly in ICC. Things to watch out for: 1) the assignment under the condition may not happen at all; 2) the assigned value may be used inside the loop (the use is still dominated by the assignment); 3) we need to find out which element in the vector of "privatized" res was assigned last.
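To make points 1) and 3) concrete, here is a scalar emulation of one possible vectorization scheme. It is only a sketch and not ICC's actual implementation: the lane count `VF`, the function name `last_match_lanes`, and the per-lane index array are assumptions for illustration. Each lane keeps its own private copy of res together with the iteration number of its last assignment, and the lane with the largest iteration number wins after the loop. Point 2) (using the value inside the loop) is not covered here.

```c
#include <stddef.h>

#define VF 8  /* hypothetical vector width in lanes */

/* Hypothetical helper: same contract as a serial loop that does
   `if (a[i] > t) res = b[i];` starting from res = init. */
int last_match_lanes(const int *a, const int *b, size_t n, int t, int init)
{
    int  lane_val[VF];  /* "privatized" res, one copy per lane */
    long lane_idx[VF];  /* iteration of the last assignment per lane, -1 = never */
    for (int l = 0; l < VF; l++) {
        lane_val[l] = init;
        lane_idx[l] = -1;
    }

    size_t i = 0;
    for (; i + VF <= n; i += VF) {
        /* The inner loop stands in for one masked vector operation. */
        for (int l = 0; l < VF; l++) {
            if (a[i + l] > t) {
                lane_val[l] = b[i + l];
                lane_idx[l] = (long)(i + l);
            }
        }
    }

    /* Point 3: the last-assigned lane is the one with the largest index.
       Point 1: if no lane was ever assigned, res keeps its initial value. */
    int res = init;
    long best = -1;
    for (int l = 0; l < VF; l++) {
        if (lane_idx[l] > best) {
            best = lane_idx[l];
            res = lane_val[l];
        }
    }

    /* Scalar remainder: these iterations come later, so they may overwrite res. */
    for (; i < n; i++) {
        if (a[i] > t)
            res = b[i];
    }
    return res;
}
```

The winning lane is not necessarily the highest-numbered lane: an assignment in lane 1 of a later vector iteration beats one in lane 3 of an earlier iteration, which is why the iteration number has to be tracked.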
Beyond the basic implementation, optimizations are possible in some cases. For the first example here, the ASM code generated by ICC just checks whether the condition becomes true inside the loop, and the actual assignment happens through cmovne after the loop. The same technique cannot be applied when the assigned value is b[i].
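At the source level, the optimization for a loop-invariant assigned value can be pictured roughly as follows. This is a sketch of the idea, not a transcription of ICC's output; the function name `cond_assign_invariant` and the `init` parameter are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: equivalent to
       res = init; for (...) if (a[i] > t) res = aval;
   when aval does not change inside the loop. */
int cond_assign_invariant(const int *a, size_t n, int t, int aval, int init)
{
    int any = 0;
    /* The loop only records whether the condition was ever true.
       This is a plain OR-reduction with no conditional store. */
    for (size_t i = 0; i < n; i++)
        any |= (a[i] > t);

    /* The assignment itself happens once, after the loop; a compiler can
       lower this select to a conditional move. */
    return any ? aval : init;
}
```

This works because every assignment writes the same value, so it does not matter which iteration was last. With b[i] as the assigned value, the result depends on the index of the last matching iteration, so a single flag is not enough.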
I am quite surprised that this loop is not vectorized either:
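#define BONUS 5
#define T 24
/* Assumed declarations: N and a[] are not shown in this comment;
   see the godbolt link for the originals. */
#define N 1024
int a[N];

int
foo (int aval)
{
  int res = 0;
  for (int i = 0; i < N; i++)
    {
      if (a[i] > T)
        res += BONUS /* bonus */;
    }
  return res;
}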
With #define BONUS 1, the code is vectorized: https://godbolt.org/z/SyjmPf
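For reference, the conditional add can be written in a few equivalent ways, sketched below. The function names are hypothetical and the arrays are passed as parameters for convenience. All three compute the same value (barring overflow); whether a given compiler version vectorizes each form is something to check on godbolt rather than assume.

```c
#include <stddef.h>

#define BONUS 5
#define T 24

/* Form used in the report: add under a branch. */
int sum_bonus_branchy(const int *a, size_t n)
{
    int res = 0;
    for (size_t i = 0; i < n; i++) {
        if (a[i] > T)
            res += BONUS;
    }
    return res;
}

/* Same reduction with the condition folded into a select. */
int sum_bonus_select(const int *a, size_t n)
{
    int res = 0;
    for (size_t i = 0; i < n; i++)
        res += (a[i] > T) ? BONUS : 0;
    return res;
}

/* Count the matches (the BONUS == 1 case) and scale once after the loop. */
int sum_bonus_count(const int *a, size_t n)
{
    int count = 0;
    for (size_t i = 0; i < n; i++)
        count += (a[i] > T);
    return count * BONUS;
}
```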
The first example (res = aval) is vectorized now.
Extended Description
Clang -O3 -march=haswell: loop not vectorized [-Rpass-missed=loop-vectorize]
ICC -O3 -march=haswell:
https://godbolt.org/z/AjRQNY
This loop is similar, also not vectorized:
https://godbolt.org/z/AU_5rf