Thin LTO ignores -fno-vectorize and -fno-slp-vectorize #44404

Open Quuxplusone opened 4 years ago

Quuxplusone commented 4 years ago
Bugzilla Link PR45434
Status NEW
Importance P normal
Reported by David Zarzycki (dave@znu.io)
Reported on 2020-04-05 04:09:48 -0700
Last modified on 2020-04-14 19:24:05 -0700
Version unspecified
Hardware PC Linux
CC dave@znu.io, dexonsmith@apple.com, florian_hahn@apple.com, llvm-bugs@lists.llvm.org, smithp352@googlemail.com, t-kawashima@fujitsu.com, tejohnson@google.com

Currently, there is no way to disable loop or SLP vectorization when using LTO. This is unfortunate when targeting AVX2 or AVX512 machines where poor cost modeling by the auto-vectorizers becomes painfully obvious.

The usual workaround, passing -Wl,-plugin-opt,-vectorize-loops=false and -Wl,-plugin-opt,-vectorize-slp=false, doesn't work either. Teresa Johnson confirmed via private email that this is the case.

Quuxplusone commented 4 years ago

This is similar to -fno-unroll-loops being ignored during LTO, I think. There's some related discussion in https://reviews.llvm.org/D76916, and there is a proposed fix that uses metadata to disable unrolling for the loops in a TU: https://reviews.llvm.org/D77058

Similar fixes for -fno-vectorize/-fno-slp-vectorize would mean that loops in files compiled with -fno-vectorize won't be vectorized during LTO. Would that address your issue? -fno-vectorize would be very straightforward, as the appropriate metadata already exists, but I think for the SLP vectorizer we would have to add a new function attribute.
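For illustration, Clang already lets a single loop opt out of loop vectorization from source through its loop pragma, which is recorded as loop metadata in the IR. The sketch below is only an example of that per-loop mechanism: the function name `scaled_sum` is hypothetical and not from this report, and nothing in this thread confirms how the pragma behaves in every ThinLTO configuration. It does not cover the SLP vectorizer.

```cpp
#include <cstddef>

// Hypothetical example function (not from the bug report).
// Returns the sum of a[i] * k over the first n elements.
int scaled_sum(const int *a, std::size_t n, int k) {
    int total = 0;
    // Ask Clang not to loop-vectorize the loop that follows. Compilers that
    // do not recognize this pragma ignore it (possibly with a warning).
#pragma clang loop vectorize(disable)
    for (std::size_t i = 0; i < n; ++i) {
        total += a[i] * k;
    }
    return total;
}
```

A TU-wide fix as proposed above would amount to the compiler marking every loop in the file this way when -fno-vectorize is given, instead of relying on per-loop annotations in the source.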

Quuxplusone commented 4 years ago

In short, yes. In general, I like the LTO user experience, so if -fno-vectorize or -fno-slp-vectorize (etc.) are passed when compiling a TU, then it seems reasonable to assume that LTO will not override that decision.

Quuxplusone commented 4 years ago

Yes, that is the best long term solution (marking the loops in the IR).

The other thing I had mentioned in my email to David was that the usual workaround of passing the internal options via the linker doesn't work in this case, because of the way the existing internal options (-vectorize-loops, -vectorize-slp) are used. They are used to initialize the pass manager flags (and themselves default to true), but those flags are subsequently unconditionally overridden to true by the linker code that sets up the LTO configuration. This one should be easy enough to fix (check the internal options in the passes instead, and have them override what is coming in). I can fix this pretty quickly, so that at least the internal options do what one would expect and can serve as a workaround until we have the IR-based solution.
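The override described above can be modeled with a small sketch. This is a hypothetical model, not LLVM code: `PipelineFlags`, `InternalOptions`, and the functions below are invented names that stand in for the pass manager flags, the -vectorize-loops / -vectorize-slp options, and the LTO setup code.

```cpp
// Hypothetical model of the behavior described in this comment.
// None of these names exist in LLVM.

// Stand-in for the pass manager flags that enable the vectorizers.
struct PipelineFlags {
    bool loop_vectorization;
    bool slp_vectorization;
};

// Stand-in for the internal options; both default to true.
struct InternalOptions {
    bool vectorize_loops = true;
    bool vectorize_slp = true;
};

// Before the fix: the flags start from the internal options, but the LTO
// setup then forces them to true, so the user's "=false" is lost.
PipelineFlags configure_before_fix(const InternalOptions &opts) {
    PipelineFlags flags{opts.vectorize_loops, opts.vectorize_slp};
    flags.loop_vectorization = true;  // unconditional override
    flags.slp_vectorization = true;   // unconditional override
    return flags;
}

// Sketch of the proposed fix: each pass also checks its internal option,
// so the option wins over whatever the LTO setup passed in.
bool loop_vectorizer_runs(const PipelineFlags &flags,
                          const InternalOptions &opts) {
    return flags.loop_vectorization && opts.vectorize_loops;
}

bool slp_vectorizer_runs(const PipelineFlags &flags,
                         const InternalOptions &opts) {
    return flags.slp_vectorization && opts.vectorize_slp;
}
```

In this model, setting both options to false still yields flags that are true after `configure_before_fix`, which is the reported problem; checking the options inside the passes restores the expected behavior without changing the LTO setup.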

Quuxplusone commented 4 years ago

I just fixed the issue where the internal options could not be used as a workaround to disable vectorization in the ThinLTO backends, in 33ffb62e23e7a7bece5618d5a7b54bdb401d0bcf. With this, you can use -Wl,-plugin-opt,-vectorize-loops=false and -Wl,-plugin-opt,-vectorize-slp=false to disable the passes.