Open Quuxplusone opened 4 years ago
I think this is similar to -fno-unroll-loops being ignored during LTO. There's some related discussion in https://reviews.llvm.org/D76916, and there is a proposed fix that uses metadata to disable unrolling for loops in a TU: https://reviews.llvm.org/D77058
Similar fixes for -fno-vectorize/-fno-slp-vectorize would mean that loops in files compiled with -fno-vectorize won't be vectorized during LTO. Would that address your issue? -fno-vectorize would be very straightforward, as the appropriate metadata already exists, but I think for the SLP vectorizer we would have to add a new function attribute.
In short, yes. In general, I like the LTO user experience, so if -fno-vectorize or -fno-slp-vectorize (etc) are passed to a TU, then it seems reasonable for one to assume that LTO will not override that decision.
Yes, that is the best long term solution (marking the loops in the IR).
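To illustrate what "marking the loops in the IR" looks like from the source side: Clang already accepts a per-loop pragma that is recorded as loop metadata, so in principle the hint travels with the bitcode rather than depending on the flags seen at link time. A minimal sketch, assuming a hypothetical kernel named scale_add (not taken from any real code base); note that this only addresses the loop vectorizer, not SLP, and it is a per-loop annotation rather than the per-TU behavior discussed above:

```c
#include <stddef.h>

/* Hypothetical kernel, for illustration only. */
void scale_add(float *restrict dst, const float *restrict src,
               float k, size_t n)
{
    /* Clang-specific hint asking the loop vectorizer to leave this loop
     * alone. Compilers that do not know the pragma ignore it (possibly
     * with a warning), so the code stays portable. */
#pragma clang loop vectorize(disable) interleave(disable)
    for (size_t i = 0; i < n; ++i)
        dst[i] += k * src[i];
}
```

Whether the hint is honored in a given LTO pipeline is worth checking in the optimization remarks or the final disassembly rather than assuming.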
The other thing I had mentioned in my email to David was that the usual workaround of passing the internal option via the linker is not available in this case, because of the way the existing internal options (-vectorize-loops, -vectorize-slp) are used. They are used to initialize the pass manager flags (the options themselves default to true), but those flags are subsequently overridden to true, unconditionally, by the linker code that sets up the LTO configurations. This one should be easy enough to fix: check the internal options during the passes instead, and have them override what is coming in. I can fix this pretty quickly, so that at least the internal options do what one would expect and can serve as a workaround until we have the IR-based solution.
I fixed the issue where the internal options could not be used to disable vectorization as a workaround in the ThinLTO backends just now in 33ffb62e23e7a7bece5618d5a7b54bdb401d0bcf. With this, you can use -Wl,-plugin-opt,-vectorize-loops=false and -Wl,-plugin-opt,-vectorize-slp=false to disable the passes.
Currently, there is no way to disable loop or SLP vectorization when using LTO. This is unfortunate when targeting AVX2 or AVX512 machines where poor cost modeling by the auto-vectorizers becomes painfully obvious.
The normal workaround doesn't work either, namely -Wl,-plugin-opt,-vectorize-loops=false and -Wl,-plugin-opt,-vectorize-slp=false. Teresa Johnson confirmed via private email that this is the case.