Open tkoenig1 opened 1 year ago
This has to do with disabling loop strength reduction (LSR). If LSR is enabled, one gets equivalent code for both examples. We need to find a way to enable some of LSR, but not the part that ruins VVM optimizations.
Hm, a vague idea (but I don't know about LLVM, so...)
It could be beneficial to run the VEC pass quite early, to detect opportunities. The vectorized loops could then be annotated so that they are excluded from optimizations, like loop unrolling, which are detrimental for VEC/LOOP. Unrolling an outer loop would still be possible, though. Strength reduction within a VEC loop would then also be OK, as would all the normal strategies for loops that cannot be vectorized.
Does this sound at all reasonable?
Here's something I just noticed, a possible enhancement (so it won't be forgotten).
The functions foo and bar are equivalent, but translated quite differently:
foo is
and bar is
The issue appears to be recognizing a[i] and a[j] as expressions which can be hoisted out of the loop.
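For readers without the original listings at hand, here is a rough sketch of the kind of pair this describes. It is not the code from this issue: the signatures, the loop body and the names `foo_sketch` and `bar_sketch` are all made up for illustration. The first version leaves the loads of a[i] and a[j] inside the loop, so the compiler has to recognize them as loop-invariant; the second hoists them by hand.

```c
#include <stddef.h>

/* Hypothetical stand-in for "foo": a[i] and a[j] are loaded inside the
   loop, although neither depends on the loop counter k. */
long foo_sketch(const long *a, size_t i, size_t j, size_t n)
{
    long sum = 0;
    for (size_t k = 0; k < n; k++)
        sum += a[i] * a[j];   /* loop-invariant loads */
    return sum;
}

/* Hypothetical stand-in for "bar": the same computation, with the two
   loads hoisted out of the loop by hand. Note that this version reads
   a[i] and a[j] even when n is 0. */
long bar_sketch(const long *a, size_t i, size_t j, size_t n)
{
    const long ai = a[i];
    const long aj = a[j];
    long sum = 0;
    for (size_t k = 0; k < n; k++)
        sum += ai * aj;
    return sum;
}
```

Both functions return the same value for valid indices, so ideally they would end up as the same loop body.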
Compile script is