clang -O4 drops other -mavx2 and possibly other flags

llvmbot commented 10 years ago


Bugzilla Link	18808
Resolution	FIXED
Resolved on	Mar 30, 2015 15:16
Version	3.3
OS	All
Attachments	A terminal transcript with a test case
Reporter	LLVM Bugzilla Contributor
CC	@ahatanak,@dexonsmith,@echristo,@isanbard,@sunfishcode,@rotateright

Extended Description

I found that

clang -O4 -mavx2 *.c

sets AVX2 in the frontend, but not during link-time optimization. The same may be true for other flags. This causes problems for code that checks whether AVX2 (or another feature) is supported: the frontend emits code written for that feature, but the feature is not enabled during link-time optimization.

The terminal transcript shows the bug in action. When compiled with -O3, AVX2 instructions are emitted, and when compiled with -O4, they are not. The transcript does not show that AVX2 is defined in the C file, but in fact it is.
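For readers without the attachment, here is a minimal sketch of the kind of compile-time detection described above. It is not the attached test case: the function name add_i32 is hypothetical, and the sketch assumes the code keys off __AVX2__, the macro clang and GCC predefine when -mavx2 is in effect.

```c
#include <stddef.h>
#include <stdint.h>

#ifdef __AVX2__
#include <immintrin.h>
#endif

/* Hypothetical example: adds b[i] into a[i] for i in [0, n). */
void add_i32(int32_t *a, const int32_t *b, size_t n) {
    size_t i = 0;
#ifdef __AVX2__
    /* The frontend saw -mavx2, so this branch is compiled in and
       processes 8 lanes at a time with AVX2 intrinsics. */
    for (; i + 8 <= n; i += 8) {
        __m256i va = _mm256_loadu_si256((const __m256i *)(a + i));
        __m256i vb = _mm256_loadu_si256((const __m256i *)(b + i));
        _mm256_storeu_si256((__m256i *)(a + i), _mm256_add_epi32(va, vb));
    }
#endif
    /* Scalar tail, and the whole loop when __AVX2__ is not defined. */
    for (; i < n; ++i) {
        a[i] += b[i];
    }
}
```

The preprocessor branch is chosen by the frontend, so the AVX2 path ends up in the bitcode whether or not the later code generation step is told about -mavx2. That mismatch is what this report is about.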

echristo commented 9 years ago

OK. Let's look at things this way:

1) Whether or not to allow changing of target-cpu/target-feature/triple at link time code generation.

Not convinced here of the facility to do so. Could just recompile the individual bitcode files to get what you want, but there are some users that are trying to ship bitcode (as crazy as that sounds).

2) How to pass other sorts of options to the backend for code generation

-ffoo and -fno-foo options, e.g. -fno-inline. I think this is really pretty important from the user POV. It affects things at a more global level.

3) The llvm developer debugging story

It's useful for llvm developers to be able to more accurately debug a set of IR using bisection or being able to turn off code generation options. Should this be done at the command level (i.e. infrastructure that clang and llc etc could even share), or should it be done at an llvm IR rewriting level? Don't know. I kind of want a rewriter, but I'm not wedded to any particular answer.

I'm going to go ahead and redirect this to the mailing list and close the bug so that the rest of this discussion can go there. :)

echristo commented 9 years ago

You brought it up and haven't even replied to the rest of my commentary.

3f18db19-85d0-42b5-b58f-dbfbd8cbce51 commented 9 years ago

llc is a low-level tool for LLVM developers.

Sure, the general idea still holds if you're going to allow overrides via the tools.

I disagree. There are two distinct questions here:

llc: Should LLVM developers be able to fiddle with codegen options to isolate the cause of bugs?
clang: Should clang support end-users choosing one backend at compile time, and choosing a different backend (or set of backend options) at link time?

IMO, these questions are completely unrelated (putting aside whether infrastructure could be shared).

echristo commented 9 years ago

llc is a low-level tool for LLVM developers.

Sure, the general idea still holds if you're going to allow overrides via the tools.

3f18db19-85d0-42b5-b58f-dbfbd8cbce51 commented 9 years ago

shrug That's the question behind things like the llc overrides? :)

llc is a low-level tool for LLVM developers.

echristo commented 9 years ago

shrug That's the question behind things like the llc overrides? :)

That said I was actually envisioning something like:

clang -emit-llvm foo.c -o foo.bc ...

clang -O3 -flto all.bc -arch x86_64h -o haswell_slice

clang -O3 -flto all.bc -arch x86_64 -o x86_64_slice

for the same set of bitcode files. But given the front end language restrictions on doing anything actually interesting there it's not too much of a constraint.

Another usage (admittedly one I don't think we want to support) is the Halide one that I discovered this week:

clang foo.c -emit-llvm foo.bc

clang -target aarch64-linux-gnu foo.bc -O3 -o foo.aarch64

clang -target x86_64-linux-gnu foo.bc -O3 -o foo.x86_64

...

I've since convinced them to use the pnacl sort of thing for more target independent code generation at the moment. It's a use case that could be thought about more though.

3f18db19-85d0-42b5-b58f-dbfbd8cbce51 commented 9 years ago

We should probably allow people to override the defaults on the command line though.

Should that be a fully supported driver option? It feels like with the per function flag working it is best to just tell people to pass -mavx to the -c invocation.

I agree with Rafael. If overriding the defaults has any effect, isn't that a bug?

llvmbot commented 9 years ago

We should probably allow people to override the defaults on the command line though.

Should that be a fully supported driver option? It feels like with the per function flag working it is best to just tell people to pass -mavx to the -c invocation.

echristo commented 9 years ago

So, this is actually two different bugs:

a) that you can't pass code generation options to the LTO process,

b) that you can't do per-function code generation to take into account things like -mavx2 being used on a single translation unit.

b has been fixed as you'll see from:

http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20150323/126563.html

We should probably allow people to override the defaults on the command line though.
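As a source-level illustration of what "per-function" means in (b), here is a hedged sketch using the GCC/clang function attribute target("avx2") together with __builtin_cpu_supports. This is not the change in the linked commit, which concerns clang recording target features on each function automatically; the function names sum_avx2, sum_generic and sum_dispatch are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
/* Only this function is compiled with AVX2 enabled; the rest of the
   translation unit keeps the default target features. */
__attribute__((target("avx2")))
static int64_t sum_avx2(const int32_t *v, size_t n) {
    int64_t total = 0;
    for (size_t i = 0; i < n; ++i) {
        total += v[i];
    }
    return total;
}
#endif

/* Baseline version built with the default target features. */
static int64_t sum_generic(const int32_t *v, size_t n) {
    int64_t total = 0;
    for (size_t i = 0; i < n; ++i) {
        total += v[i];
    }
    return total;
}

/* Chooses an implementation at run time based on the host CPU. */
int64_t sum_dispatch(const int32_t *v, size_t n) {
#if defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("avx2")) {
        return sum_avx2(v, n);
    }
#endif
    return sum_generic(v, n);
}
```

Because the feature set travels with the function rather than with the command line, the code generator can pick the right instructions for each function even when several translation units are merged.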

isanbard commented 10 years ago

Hi Mike,

This is a well-known bug. The problem is that during LTO the command line flags (and other information) are not passed to the code generator. There is a proposal in place to fix this, but it's not finished.


clang -O4 drops other -mavx2 and possibly other flags #19182
