Thin LTO ignores -fno-vectorize and -fno-slp-vectorize

llvm / llvm-project


Thin LTO ignores -fno-vectorize and -fno-slp-vectorize #44779

Open davezarzycki opened 4 years ago

davezarzycki commented 4 years ago


Bugzilla Link	45434
Version	unspecified
OS	Linux
CC	@davezarzycki,@dexonsmith,@fhahn,@smithp35,@kawashima-fj

Extended Description

Currently, there is no way to disable loop or SLP vectorization when using LTO. This is unfortunate when targeting AVX2 or AVX512 machines where poor cost modeling by the auto-vectorizers becomes painfully obvious.

The usual workaround of passing the internal options through the linker, namely -Wl,-plugin-opt,-vectorize-loops=false and -Wl,-plugin-opt,-vectorize-slp=false, doesn't work either. Teresa Johnson confirmed via private email that this is the case.

llvmbot commented 4 years ago

I just fixed the issue where the internal options could not be used in the ThinLTO backends to disable vectorization as a workaround (33ffb62e23e7a7bece5618d5a7b54bdb401d0bcf). With this fix, you can use -Wl,-plugin-opt,-vectorize-loops=false and -Wl,-plugin-opt,-vectorize-slp=false to disable the passes.

llvmbot commented 4 years ago

Yes, that is the best long term solution (marking the loops in the IR).

The other thing I had mentioned in my email to David was that the usual workaround of passing the internal option via the linker is not available in this case, because of the way the existing internal options (-vectorize-loops, -vectorize-slp) are used. They are used to initialize the pass manager flags (the options themselves default to true), but those flags are subsequently and unconditionally overridden to true by the linker code that sets up the LTO configurations. This one should be easy enough to fix: check the internal options during the passes instead, and have them override what is coming in. I can fix this pretty quickly, so that at least the internal options do what one would expect and can serve as a workaround until we have the IR-based solution.
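To make the ordering problem concrete, here is a minimal sketch of the logic described above. It is not LLVM code: `InternalOption`, `PipelineFlags`, `configureForLTO`, and `passShouldRun` are hypothetical names used only for illustration, and the sketch assumes the pass-side check is a simple "incoming flag AND option value".

```cpp
#include <cassert>

// Hypothetical stand-in for an internal option such as -vectorize-loops.
// It defaults to true, matching the description above.
struct InternalOption {
    bool value = true;
};

// Hypothetical stand-in for the pass manager flags.
struct PipelineFlags {
    bool loopVectorize = true;
    bool slpVectorize = true;
};

// Problem: the flags are initialized from the options, but the LTO setup
// code then forces them back to true, so the options have no effect.
PipelineFlags configureForLTO(const InternalOption &loops,
                              const InternalOption &slp) {
    PipelineFlags flags{loops.value, slp.value};
    flags.loopVectorize = true; // unconditional override by the linker code
    flags.slpVectorize = true;
    return flags;
}

// Sketch of the fix: the pass consults the option itself when it runs,
// so an explicit "=false" wins over whatever flag value is passed in.
bool passShouldRun(bool incomingFlag, const InternalOption &option) {
    return incomingFlag && option.value;
}

// Small demonstration of both behaviors.
void demonstrateOverride() {
    InternalOption loops{false}; // user passed -vectorize-loops=false
    InternalOption slp;          // left at the default (true)

    PipelineFlags flags = configureForLTO(loops, slp);
    assert(flags.loopVectorize);                       // option was lost
    assert(!passShouldRun(flags.loopVectorize, loops)); // option honored
    assert(passShouldRun(flags.slpVectorize, slp));     // default unchanged
}
```

With the check moved into the pass, leaving the option at its default changes nothing, while setting it to false disables the pass regardless of how the LTO configuration was set up.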

davezarzycki commented 4 years ago

In short, yes. In general, I like the LTO user experience, so if -fno-vectorize or -fno-slp-vectorize (etc.) are passed to a TU, then it seems reasonable for one to assume that LTO will not override that decision.

fhahn commented 4 years ago

This is similar to -fno-unroll-loops being ignored during LTO, I think. There's some related discussion in https://reviews.llvm.org/D76916, and there is a proposed fix that uses metadata to disable unrolling for the loops in a TU: https://reviews.llvm.org/D77058

Similar fixes for -fno-vectorize/-fno-slp-vectorize would mean that loops in files compiled with -fno-vectorize won't be vectorized during LTO. Would that address your issue? -fno-vectorize would be very straightforward, as the appropriate loop metadata already exists, but I think for the SLP vectorizer we would have to add a new function attribute.
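As a point of comparison for the loop case, Clang already lets a single loop carry a "do not vectorize" hint through `#pragma clang loop vectorize(disable)`. The hint is attached to the loop in the IR, so it should remain visible to the vectorizer when it runs at LTO time. The sketch below only illustrates that per-loop mechanism, not the TU-wide fix proposed here; the function names are hypothetical, and compilers other than Clang are expected to ignore the pragma.

```cpp
#include <cassert>
#include <cstddef>

// Plain reduction loop: the auto-vectorizer is free to vectorize it.
int sumDefault(const int *data, std::size_t n) {
    int total = 0;
    for (std::size_t i = 0; i < n; ++i)
        total += data[i];
    return total;
}

// Same loop, but marked so that Clang's loop vectorizer leaves it alone.
int sumNoVectorize(const int *data, std::size_t n) {
    int total = 0;
#pragma clang loop vectorize(disable)
    for (std::size_t i = 0; i < n; ++i)
        total += data[i];
    return total;
}

// The pragma only affects code generation, never the computed result.
void checkSameResult() {
    const int values[] = {1, 2, 3, 4, 5, 6, 7, 8};
    assert(sumDefault(values, 8) == sumNoVectorize(values, 8));
}
```

A TU-wide -fno-vectorize fix along the lines discussed above would amount to applying an equivalent marking to every loop in the file, without requiring source changes.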