llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

SLP vectorizer ignores slp-max-reg-size #110749

Open TocarIP opened 1 month ago

TocarIP commented 1 month ago

I'm trying to limit generation of wide AVX instructions to reduce the frequency impact and the resulting performance regression. For the following example (consecutive FP division), https://godbolt.org/z/reP9c78cM, I get a vector division, vdivpd %ymm0, %ymm1, %ymm0, which uses a 256-bit wide register. I've checked the IR, and SLP indeed generates %5 = fdiv <4 x double> %2, %4. When I try to limit the register size to 128, I get the same result, even when building with -mllvm -slp-max-reg-size=1, which should basically disable SLP vectorization completely. Wide AVX is known to cause significant performance regressions from reduced frequency on some CPUs (especially older ones).
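The source behind the Godbolt link is not reproduced in this thread. As a rough sketch of the kind of code being described, assuming four consecutive double divisions over adjacent array elements, something like the following could be used. The function name div4 and its parameters are hypothetical and are not taken from the linked example.

```cpp
// Hypothetical reproducer sketch; NOT the code from the Godbolt link.
// Four independent divisions on adjacent doubles are a typical pattern
// that the SLP vectorizer can merge into one fdiv <4 x double>.
void div4(double* __restrict out,
          const double* __restrict num,
          const double* __restrict den) {
    out[0] = num[0] / den[0];
    out[1] = num[1] / den[1];
    out[2] = num[2] / den[2];
    out[3] = num[3] / den[3];
}
```

Building this with something like clang++ -O2 -mavx2 -mllvm -slp-max-reg-size=128 and inspecting the assembly would show whether ymm registers are still used. The exact output depends on the compiler version and target, so no expected assembly is given here.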

alexey-bataev commented 1 month ago

The SLP vectorizer may generate long vectors even if -slp-max-reg-size is specified. It relies on the codegen's ability to split long vectors. -slp-max-reg-size does not limit the size of the vector, only the size of the vector registers. The vector itself may still span several vector registers.
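To make the distinction between vector size and register size concrete, here is a small sketch of the arithmetic only. It is not how LLVM computes anything internally; registers_needed is a hypothetical helper, and it assumes a target whose widest legal vector register is reg_bits wide.

```cpp
#include <cstddef>

// Hypothetical helper: how many registers of width reg_bits are needed
// to hold a vector of num_elems elements, each elem_bits wide.
// Rounds up; reg_bits must be nonzero.
constexpr std::size_t registers_needed(std::size_t elem_bits,
                                       std::size_t num_elems,
                                       std::size_t reg_bits) {
    return (elem_bits * num_elems + reg_bits - 1) / reg_bits;
}

// <4 x double> is 256 bits: one 256-bit register, or two 128-bit ones.
static_assert(registers_needed(64, 4, 256) == 1, "fits one 256-bit register");
static_assert(registers_needed(64, 4, 128) == 2, "spans two 128-bit registers");
```

Under that assumption, a <4 x double> value in the IR is still a single vector, and it is the backend that decides whether it lands in one wide register or is split across narrower ones.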