Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

x86/x64 clang appears not to respect __attribute__((noinline)) with -O1 #26544

Open Quuxplusone opened 8 years ago

Quuxplusone commented 8 years ago
Bugzilla Link PR26545
Status NEW
Importance P normal
Reported by Tom Seddon (llvmbugs@tomseddon.plus.com)
Reported on 2016-02-09 09:12:52 -0800
Last modified on 2021-11-09 02:55:00 -0800
Version 3.7
Hardware PC MacOS X
CC chandlerc@gmail.com, christophe.monat@st.com, dblaikie@gmail.com, dmajor@bugmail.cc, florian_hahn@apple.com, francois.de-ferriere@st.com, hfinkel@anl.gov, llvm-bugs@lists.llvm.org
Fixed by commit(s)
Attachments main.c (187 bytes, text/x-c)
Blocks
Blocked by
See also PR41474, PR46463
Created attachment 15863
code

Repro steps:

Compile the attached code with -O1 and generate assembly language.

- From shell: clang -O1 -S main.c (I tested this with Xcode's clang - "Apple
LLVM version 7.0.2 (clang-700.1.81); Target: x86_64-apple-darwin14.5.0; Thread
model: posix")

- See it on gcc explorer: http://goo.gl/Im4c95 (link uses x86 clang 3.7.1)

Expected result:

Generated code contains a call to the Test function somewhere, since Test has
the noinline attribute. (This even though -O1 was specified, and the Test
function is an ideal candidate for inlining.)

Actual result:

The compiler appears to have inlined the Test function (or otherwise determined
somehow that the result of the call would be -55).

Notes:

- I couldn't find any clang documentation for noinline so I'm just making
assumptions about what it does (sorry)

- The gcc documentation for noinline suggests this sort of problem is a
possibility - see https://gcc.gnu.org/onlinedocs/gcc/Common-Function-
Attributes.html#index-g_t_0040code_007bnoinline_007d-function-attribute-3218 -
but the workaround it mentions has no effect in clang

- If you force the compiler to generate the Test function (e.g., by adding
something like ``printf("%p\n",(void *)&Test);''), the call still isn't
generated
Quuxplusone commented 8 years ago

Attached main.c (187 bytes, text/x-c): code

Quuxplusone commented 8 years ago

Yeah, I think this is interprocedural constant propagation, plus attribute detection (detecting that the function is pure & thus an unused call can be omitted).

I /think/ there's another attribute for documenting that the optimizer shouldn't do anything interprocedural with the function, but I don't recall what it is.

Quuxplusone commented 8 years ago
(In reply to comment #1)
> Yeah, I think this is interprocedural constant propagation, plus attribute
> detection (detecting that the function is pure & thus an unused call can be
> omitted).

Thanks - indeed, if I make the Test function more involved (e.g., have it
return the result of another printf call), then things start to work more as
I'd expect.

More gcc explorer links:

No attributes - call inlined: http://goo.gl/Usoi0q
__attribute__((noinline)) - call not inlined: http://goo.gl/dAiUoY

> I /think/ there's another attribute for documenting that the optimizer
> shouldn't do anything interprocedural with the function, but I don't recall
> what it is.

That gave me a keyword to look for in
http://clang.llvm.org/docs/AttributeReference.html, but nothing obvious turned
up.

Regards,

--Tom
Quuxplusone commented 6 years ago
I was about to file another instance of this and was pointed here by the
duplicate detector. This still happens with trunk r336407.

(In reply to David Blaikie from comment #1)
> I /think/ there's another attribute for documenting that the optimizer
> shouldn't do anything interprocedural with the function, but I don't recall
> what it is.

Adding __attribute__((optnone)) forces the call to be generated in our case,
but that's a pretty big hammer. Do you have any better suggestions for a
workaround, or any expected timeframe for a fix? Thanks!
Quuxplusone commented 6 years ago

hey Chandler - I know we talked a while back about some bugs related to optnone that I think I fixed (yeah, bunch of GlobalsModRef and FunctionAttrs+OptNone situations). Any idea what, if anything, should be done in this noinline situation? This does sort of look like inlining, even if it isn't done by the inliner... (admittedly GCC has the same caveat, but a clearer workaround - I haven't checked why an asm("") is insufficient to hinder Clang here)

Quuxplusone commented 5 years ago
I stumbled on the same problem a few days ago.
I tracked the problem down to inter-procedural constant propagation, as
mentioned in the previous posts.
There is a simple fix to do for this: The "naked" attribute is already checked
to prevent inter-procedural constant propagation, I propose to add also a check
on the "noinline" attribute.
I will submit a patch under Phabricator.
Quuxplusone commented 5 years ago

I believe optnone is the best tool we have to shut down all interprocedural optimizations.

Adding something that only targets interprocedural cases would be challenging. It's possible we could do this, but it would need very clear and strong motivation. It would add a lot of complexity to reasoning about IR semantics in an area that is already really complex (derefinement, etc, etc).

Quuxplusone commented 5 years ago

(In reply to Chandler Carruth from comment #6)

I believe optnone is the best tool we have to shut down all interprocedural optimizations.

Adding something that only targets interprocedural cases would be challenging. It's possible we could do this, but it would need very clear and strong motivation. It would add a lot of complexity to reasoning about IR semantics in an area that is already really complex (derefinement, etc, etc).

But "optnone" does not prevent interprocedural constant propagation when LTO is used. Maybe "optnone" should also disable this optimization in LTO mode ?

Quuxplusone commented 5 years ago

(In reply to francois.de-ferriere from comment #7)

(In reply to Chandler Carruth from comment #6)

I believe optnone is the best tool we have to shut down all interprocedural optimizations.

Adding something that only targets interprocedural cases would be challenging. It's possible we could do this, but it would need very clear and strong motivation. It would add a lot of complexity to reasoning about IR semantics in an area that is already really complex (derefinement, etc, etc).

But "optnone" does not prevent interprocedural constant propagation when LTO is used. Maybe "optnone" should also disable this optimization in LTO mode ?

Could you provide an example of that? I believe optnone /should/ prevent that - /that/ is probably a bug if it doesn't.

Quuxplusone commented 5 years ago

(In reply to David Blaikie from comment #8)

(In reply to francois.de-ferriere from comment #7)

(In reply to Chandler Carruth from comment #6)

I believe optnone is the best tool we have to shut down all interprocedural optimizations.

Adding something that only targets interprocedural cases would be challenging. It's possible we could do this, but it would need very clear and strong motivation. It would add a lot of complexity to reasoning about IR semantics in an area that is already really complex (derefinement, etc, etc).

But "optnone" does not prevent interprocedural constant propagation when LTO is used. Maybe "optnone" should also disable this optimization in LTO mode ?

Could you provide an example of that? I believe optnone /should/ prevent that - /that/ is probably a bug if it doesn't.

Here is a very simple example. On this example, with or without -flto, the constant value in the callee is 'inlined'into the caller, although the caller still performs the call to the callee.

int attribute((noinline, optnone)) v1() { return 0xab; } int main() { return v1(); }