Open Quuxplusone opened 4 years ago
Bugzilla Link | PR46463 |
Status | CONFIRMED |
Importance | P enhancement |
Reported by | Jeff Muizelaar (jmuizelaar@mozilla.com) |
Reported on | 2020-06-25 19:04:54 -0700 |
Last modified on | 2021-11-09 02:55:00 -0800 |
Version | trunk |
Hardware | PC All |
CC | dimitry@andric.com, glider@google.com, hans@chromium.org, i@maskray.me, jdoerfert@anl.gov, lebedev.ri@gmail.com, llvm-bugs@lists.llvm.org, ndesaulniers@google.com, nikita.ppv@gmail.com, sfertile@ca.ibm.com, smithp352@googlemail.com, srhines@google.com |
Fixed by commit(s) | |
Attachments | |
Blocks | |
Blocked by | |
See also | PR26545 |
noinline doesn't mean "block all optimizations", it only says "don't move code out of this function"
Which version of clang is this? With the latest 10.0.1 off the branch:
$ clang -fno-asynchronous-unwind-tables -fomit-frame-pointer -O2 -S pr46463.c -
o -
.text
.file "pr46463.c"
.globl demo # -- Begin function demo
.p2align 4, 0x90
.type demo,@function
demo: # @demo
# %bb.0: # %entry
movq %rdi, %rax
retq
.Lfunc_end0:
.size demo, .Lfunc_end0-demo
# -- End function
.globl foo # -- Begin function foo
.p2align 4, 0x90
.type foo,@function
foo: # @foo
# %bb.0: # %entry
movl $.L.str, %edi
jmp demo # TAILCALL
.Lfunc_end1:
.size foo, .Lfunc_end1-foo
# -- End function
.type .L.str,@object # @.str
.section .rodata.str1.1,"aMS",@progbits,1
.L.str:
.asciz "input string"
.size .L.str, 13
.ident "FreeBSD clang version 10.0.1 (git@github.com:llvm/llvm-project.git llvmorg-10.0.0-97-g6f71678ecd2)"
.section ".note.GNU-stack","",@progbits
.addrsig
So this tail-calls the demo function.
clang trunk: https://gcc.godbolt.org/z/tpJsOm
It seems like this was a recent behaviour change as clang 10.0 doesn't show the
problem.
(In reply to Jeff Muizelaar from comment #3)
> It seems like this was a recent behaviour change as clang 10.0 doesn't show
> the problem.
It appears to have regressed with https://reviews.llvm.org/rG8903e61b6611
("[AsmPrinter][ELF] Define local aliases (.Lfoo$local) for GlobalObjects"), by
Fangrui Song.
I guess it is an unintended side effect? Whatever the cause, it would be nice
to have it fixed before 11 branches.
(In reply to Dimitry Andric from comment #4)
> It appears to have regressed with https://reviews.llvm.org/rG8903e61b6611
> ("[AsmPrinter][ELF] Define local aliases (.Lfoo$local) for GlobalObjects"),
> by Fangrui Song.
Eh sorry, this was a bisection mistake on my part. This commit introduces a
.Lfoo$local label, but does not yet eliminate the tail call. Looking further.
Looked again, and it is definitely regressed with
https://reviews.llvm.org/rG03727687766a ("[InstCombine] Simplify calls with
"returned" attribute") by Nikita Popov.
With clang llvmorg-11-init-06295-g9cf920e64d1:
========================================================================
.text
.file "pr46463.c"
.globl demo # -- Begin function demo
.p2align 4, 0x90
.type demo,@function
demo: # @demo
.Ldemo$local:
# %bb.0: # %entry
movq %rdi, %rax
retq
.Lfunc_end0:
.size demo, .Lfunc_end0-demo
# -- End function
.globl foo # -- Begin function foo
.p2align 4, 0x90
.type foo,@function
foo: # @foo
.Lfoo$local:
# %bb.0: # %entry
movl $.L.str, %edi
jmp .Ldemo$local # TAILCALL
.Lfunc_end1:
.size foo, .Lfunc_end1-foo
# -- End function
.type .L.str,@object # @.str
.section .rodata.str1.1,"aMS",@progbits,1
.L.str:
.asciz "input string"
.size .L.str, 13
.ident "clang version 11.0.0 (https://github.com/llvm/llvm-project.git
9cf920e64d18e9c64706c8c8baf71a4919dcbb42)"
.section ".note.GNU-stack","",@progbits
.addrsig
========================================================================
With clang llvmorg-11-init-06296-g03727687766:
========================================================================
.text
.file "pr46463.c"
.globl demo # -- Begin function demo
.p2align 4, 0x90
.type demo,@function
demo: # @demo
.Ldemo$local:
# %bb.0: # %entry
movq %rdi, %rax
retq
.Lfunc_end0:
.size demo, .Lfunc_end0-demo
# -- End function
.globl foo # -- Begin function foo
.p2align 4, 0x90
.type foo,@function
foo: # @foo
.Lfoo$local:
# %bb.0: # %entry
movl $.L.str, %eax
retq
.Lfunc_end1:
.size foo, .Lfunc_end1-foo
# -- End function
.type .L.str,@object # @.str
.section .rodata.str1.1,"aMS",@progbits,1
.L.str:
.asciz "input string"
.size .L.str, 13
.ident "clang version 11.0.0 (https://github.com/llvm/llvm-project.git
03727687766a72504712861bf038f0be962527d0)"
.section ".note.GNU-stack","",@progbits
.addrsig
========================================================================
Sounds like issues related to noinline were raised there: https://reviews.llvm.org/D75815#1941004
I'd argue there is n obug. This function is not inlined, noinline
is respected.
(In reply to Roman Lebedev from comment #1)
noinline doesn't mean "block all optimizations", it only says "don't move code out of this function"
+1
See https://reviews.llvm.org/D75815#1976892
Some examples that might be interesting:
__attribute((noinline))
char *demo(char *s) {
return s;
}
void foo() {
demo("input string");
}
would you argue there must be a call to demo in foo? I would say no.
__attribute((noinline))
char *demo(char *s) {
return NULL;
}
void foo() {
return demo("input string");
}
I would again not expect a call.
(Without testing it, I assume both of the above should have been optimized before the returned
argument patch.)
From an end-user perspective, appearing to inline the function while it is
marked as 'noinline' would be rather suprising. Is there any usefulness to the
noinline attribute in that case at all?
That said, looking at gcc's documentation here:
https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-noinline-function-attribute
they are saying:
> This function attribute prevents a function from being considered for
> inlining. If the function does not have side effects, there are
> optimizations other than inlining that cause function calls to be
> optimized away, although the function call is live.
Unfortunately they don't really say what those other optimizations are, but
they also specify a workaround to get the desired effect, which is (at least
from my point of view) to force the function to always be called, and never
inlined:
> To keep such calls from being optimized away, put
>
> asm ("");
>
> (see Extended Asm) in the called function, to serve as a special side effect.
(In reply to Dimitry Andric from comment #9)
> they also specify a workaround to get the desired effect, which is (at least
> from my point of view) to force the function to always be called, and never
> inlined
What I meant by this was that sometimes end-users can want this, for various
reasons. And instead of using the rather horrid method of inserting barriers,
an official attribute specifying this (reallydontinline?) would be nicer. Then
again, if you need to build software that compiles with both clang and gcc, you
will likely be forced to use the barriers anyway.
This kind of thing is going to get Clang banned from use in the Linux kernel:
https://godbolt.org/z/M2g4ZJ
We ***need*** the semantics of noinline to mean "explicit call" instruction.
Messing with that is going to cause all kinds of unexpected bugs.
For example, noinline is used extensively in the kernel to limit stack usage,
as the kernel only has 2 pages worth of stack to work with.
The linux kernel has 1696 uses of noinline today. I'm sure a vast majority of
them have very good reasons to do so.
Hm reading jdoerfert's comments on D75815 and the surrounding text, it seems that indeed the 'noipa' attribute is more likely covering this use case. But TBH I had not heard of this attribute before, maybe it is not that well-known?
At least gcc also supports it, but in their documentation they mention "This attribute is supported mainly for the purpose of testing the compiler". So no idea how production-ready it is.
We need the semantics of noinline to mean "explicit call" instruction. > Messing with that is going to cause all kinds of unexpected bugs.
The thing is, that was never the case: https://godbolt.org/z/rix62q If you want that, we need a different attribute (in IR).
I understand people want to be compatible with GCC or have certain expectations. However, we should not solve this by inconsistent descriptions/interpretations of the IR. As shown above, we always removed calls to noinline
functions even if the result was used. Not to mention if the result is unused and the function is const
.
That we now propagate arguments through the call is the natural evolution from this year long precedence.
In https://reviews.llvm.org/D75815#1976892 we talk about noipa
for the IR, similarly, you could do optnone
on the call instruction. Either way, if you want to prevent optimization across call edges, we need to provide a different/new attribute to represent that (or use optnone
for the function).
From an end-user perspective, appearing to inline the function while it is marked as 'noinline' would be rather suprising. Is there any usefulness to the noinline attribute in that case at all?
noinline
will, as described by GCC, prevent inlining. If we now say it also does "imply" X and Y, people that want only to prevent inlining might complain. It is also not defined that way so we play whack-a-mole with optimizations and analysis that interpret the meaning based on the definition we use (https://llvm.org/docs/LangRef.html#function-attributes).
That said, if we need an attribute to describe something similar to noinline
we should create one and use it.
Then again, if you need to build software that compiles with both clang and gcc, you will likely be forced to use the barriers anyway.
The (particular) asm barrier might or might not prevent us from doing the IPO. For now it would probably work.
(In reply to Dimitry Andric from comment #12)
> Hm reading jdoerfert's comments on D75815 and the surrounding text, it seems
> that indeed the 'noipa' attribute is more likely covering this use case. But
> TBH I had not heard of this attribute before, maybe it is not that
> well-known?
>
> At least gcc also supports it, but in their documentation they mention "This
> attribute is supported mainly for the purpose of testing the compiler". So
> no idea how production-ready it is.
As far as I know, LLVM doesn't have it (yet). We could/should introduce it
though.
(In reply to Nick Desaulniers from comment #11)
> This kind of thing is going to get Clang banned from use in the Linux
> kernel
Sorry, that was hyperbolic. I had lunch and a cup of coffee and now I less
hangry and fully caffeinated.
That said, none of our alarm bells have gone off about anything being broken
from this change. I would just hate for something to be found broken due to
this further down the road.
Similarly that there are cases where __attribute__((always_inline)) doesn't
mean "always inline," I guess we'll have cases to point people to where
__attribute__((noinline)) doesn't mean "no inline."
I much prefer people having function attributes that provide some actual
semantics rather than empty asm statements though.
(In reply to Nick Desaulniers from comment #15)
(In reply to Nick Desaulniers from comment #11)
This kind of thing is going to get Clang banned from use in the Linux kernel
Sorry, that was hyperbolic. I had lunch and a cup of coffee and now I less hangry and fully caffeinated.
No worries. I also do not want this to happen. The idea would have been that clang adds "something extra" if it compiles noinline
in "kernel mode". That is still an option FWIW.
That said, none of our alarm bells have gone off about anything being broken from this change. I would just hate for something to be found broken due to this further down the road.
Short of having a second attribute in the kernel source I strongly advocate for a "kernel mode/flag" that adds the second attribute in IR.
Similarly that there are cases where attribute((always_inline)) doesn't mean "always inline," I guess we'll have cases to point people to where attribute((noinline)) doesn't mean "no inline."
The first is true, the second is not. noinline
did and does mean "no inline".
I remember we discussed a new attribute to address the former, as they were again two camps, those who wanted it to mean always and those who wanted the current behavior. noinline
is the same. It cannot mean two things. It always did mean "no inline" but if people also want "no inter-procedural stuff" we need a new IR attribute and probably a source one too.
I much prefer people having function attributes that provide some actual semantics rather than empty asm statements though.
+2 yes, and that is why I emphasize we add new function attributes (on IR level) if there is a need for them. noipa
seems like a uncontroversial candidate to be added.
Nick flagged this as a potential release blocker a while ago. Are there still concerns here?
(In reply to Hans Wennborg from comment #17)
> Nick flagged this as a potential release blocker a while ago. Are there
> still concerns here?
Fixing this would require adding new noipa attribute.
I'm not sure this can be viewed as a blocker.
(In reply to Roman Lebedev from comment #18)
> (In reply to Hans Wennborg from comment #17)
> > Nick flagged this as a potential release blocker a while ago. Are there
> > still concerns here?
>
> Fixing this would require adding new noipa attribute.
> I'm not sure this can be viewed as a blocker.
Agreed. And the behavior did not conceptually change since 3.0
(https://godbolt.org/z/rix62q)
Okay, let's not block on it then.