Open vfdff opened 4 months ago
The Rpass remarks say that the loop is not vectorized because the compiler could not determine the number of loop iterations. I think this is because it cannot decide whether end and crd alias, and so it cannot parallelize the iterations.
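To illustrate why possible aliasing makes the trip count unknowable, here is a minimal sketch in plain C. The helper names (bump_range, count_without_alias, count_with_alias) and the array contents are made up for the illustration and are not from the original code. The loop has the same shape as the one in the issue and returns how many iterations actually ran:

```c
/* Same loop shape as in the issue: the bound end[residue_i] is re-read
 * on every iteration. Returns the number of iterations executed. */
int bump_range(const int *start, const int *end, int *crd, int residue_i)
{
    int iterations = 0;
    for (int atom_i = start[residue_i]; atom_i < end[residue_i]; atom_i++) {
        crd[atom_i] = crd[atom_i] + 8;
        iterations++;
    }
    return iterations;
}

/* end and crd are separate arrays: the bound stays at 4. */
int count_without_alias(void)
{
    int start[1] = {0};
    int end[1] = {4};
    int crd[16] = {4};   /* remaining elements are zero */
    return bump_range(start, end, crd, 0);
}

/* end points at crd, so end[0] is crd[0]: the first store raises the
 * bound from 4 to 12 while the loop is running. */
int count_with_alias(void)
{
    int start[1] = {0};
    int crd[16] = {4};   /* remaining elements are zero */
    return bump_range(start, crd, crd, 0);
}
```

With separate arrays the loop runs 4 times. With end pointing into crd it runs 12 times, because the store to crd[0] changes the bound. Unless the compiler can rule out the second case, it cannot compute the trip count before entering the loop.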
Thanks for your comment.
Since #pragma clang loop vectorize(assume_safety) lets the compiler assume there are no data dependencies in the following loop, does that mean there are no dependencies between end and crd, i.e. that they do not alias?
I found that it can be vectorized if I load the value before the loop and use that value in the loop condition.
int end_value = end[residue_i];
#pragma clang loop vectorize(assume_safety)
for (int atom_i = start[residue_i]; atom_i < end_value; atom_i++)
{
    crd[atom_i] = crd[atom_i] + 8;
}
Also, if I update end_value with end[residue_i] in the loop body, it cannot be vectorized either.
#pragma clang loop vectorize(assume_safety)
for (int atom_i = start[residue_i]; atom_i < end_value; atom_i++)
{
    crd[atom_i] = crd[atom_i] + 8;
    end_value = end[residue_i];
}
My personal guess is that #pragma clang loop vectorize(assume_safety) indicates to the compiler that the loop contains no data dependencies between iterations that would prevent vectorization. However, there could be a potential dependency between end[residue_i] and crd[atom_i] within a single iteration.
You might want to refer to the source code for more detailed information. There are several return statements before IsAnnotatedParallel skips the memory dependence checks, as seen in this section of the LLVM project. It's likely that one of those checks failed in this case, preventing vectorization. I'm not very familiar with this part, so apologies for not exploring it further.
void foo_noalias (int residue_i, int start, int end, int * restrict crd) {
#pragma clang loop vectorize(assume_safety)
}
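For comparison, here is a hypothetical sketch of a restrict-qualified variant that keeps start and end as arrays, as in the earlier loops. The function name foo_restrict_sketch, the pointer signature, and the small driver are assumptions for illustration, not code from this issue. In C, restrict on end and crd promises that end[residue_i] is not modified through crd, so the bound is loop-invariant. Whether a given Clang version then vectorizes the loop has not been checked here.

```c
/* Hypothetical variant: all three arrays are restrict-qualified, so the
 * compiler may assume stores through crd never change end[residue_i]. */
void foo_restrict_sketch(int residue_i, const int *restrict start,
                         const int *restrict end, int *restrict crd)
{
#pragma clang loop vectorize(assume_safety)
    for (int atom_i = start[residue_i]; atom_i < end[residue_i]; atom_i++) {
        crd[atom_i] = crd[atom_i] + 8;
    }
}

/* Small driver with made-up data: updates crd[1] and crd[2] only and
 * returns the sum of all four elements afterwards. */
int restrict_sketch_demo(void)
{
    int start[1] = {1};
    int end[1] = {3};
    int crd[4] = {0, 1, 2, 3};
    foo_restrict_sketch(0, start, end, crd);
    return crd[0] + crd[1] + crd[2] + crd[3];
}
```

Passing overlapping arrays to a function declared like this would be undefined behavior, so it only applies when the caller really can guarantee the arrays are distinct.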