llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.14k stars 12.02k forks source link

`__attribute((noinline))` not respected #45808

Open jrmuizel opened 4 years ago

jrmuizel commented 4 years ago
Bugzilla Link 46463
Version trunk
OS All
CC @DimitryAndric,@ramosian-glider,@zmodem,@MaskRay,@jdoerfert,@LebedevRI,@nickdesaulniers,@nikic,@smithp35,@stephenhines

Extended Description

__attribute((noinline))
char *demo(char *s) {
    return s;
}

char *foo()  {
    return demo("input string");
}

compiles to:

demo(char*):                              # @demo(char*)
        movq    %rdi, %rax
        retq
foo():                                # @foo()
        movl    $.L.str, %eax
        retq
.L.str:
        .asciz  "input string"
zmodem commented 4 years ago

Okay, let's not block on it then.

jdoerfert commented 4 years ago

Nick flagged this as a potential release blocker a while ago. Are there still concerns here?

Fixing this would require adding new noipa attribute. I'm not sure this can be viewed as a blocker.

Agreed. And the behavior did not conceptually change since 3.0 (https://godbolt.org/z/rix62q)

LebedevRI commented 4 years ago

Nick flagged this as a potential release blocker a while ago. Are there still concerns here?

Fixing this would require adding new noipa attribute. I'm not sure this can be viewed as a blocker.

zmodem commented 4 years ago

Nick flagged this as a potential release blocker a while ago. Are there still concerns here?

jdoerfert commented 4 years ago

This kind of thing is going to get Clang banned from use in the Linux kernel

Sorry, that was hyperbolic. I had lunch and a cup of coffee and now I less hangry and fully caffeinated.

No worries. I also do not want this to happen. The idea would have been that clang adds "something extra" if it compiles noinline in "kernel mode". That is still an option FWIW.

That said, none of our alarm bells have gone off about anything being broken from this change. I would just hate for something to be found broken due to this further down the road.

Short of having a second attribute in the kernel source I strongly advocate for a "kernel mode/flag" that adds the second attribute in IR.

Similarly that there are cases where attribute((always_inline)) doesn't mean "always inline," I guess we'll have cases to point people to where attribute((noinline)) doesn't mean "no inline."

The first is true, the second is not. noinline did and does mean "no inline". I remember we discussed a new attribute to address the former, as they were again two camps, those who wanted it to mean always and those who wanted the current behavior. noinline is the same. It cannot mean two things. It always did mean "no inline" but if people also want "no inter-procedural stuff" we need a new IR attribute and probably a source one too.

I much prefer people having function attributes that provide some actual semantics rather than empty asm statements though.

+2 yes, and that is why I emphasize we add new function attributes (on IR level) if there is a need for them. noipa seems like a uncontroversial candidate to be added.

nickdesaulniers commented 4 years ago

This kind of thing is going to get Clang banned from use in the Linux kernel

Sorry, that was hyperbolic. I had lunch and a cup of coffee and now I less hangry and fully caffeinated.

That said, none of our alarm bells have gone off about anything being broken from this change. I would just hate for something to be found broken due to this further down the road.

Similarly that there are cases where __attribute__((always_inline)) doesn't mean "always inline," I guess we'll have cases to point people to where __attribute__((noinline)) doesn't mean "no inline."

I much prefer people having function attributes that provide some actual semantics rather than empty asm statements though.

jdoerfert commented 4 years ago

Hm reading jdoerfert's comments on D75815 and the surrounding text, it seems that indeed the 'noipa' attribute is more likely covering this use case. But TBH I had not heard of this attribute before, maybe it is not that well-known?

At least gcc also supports it, but in their documentation they mention "This attribute is supported mainly for the purpose of testing the compiler". So no idea how production-ready it is.

As far as I know, LLVM doesn't have it (yet). We could/should introduce it though.

jdoerfert commented 4 years ago

We need the semantics of noinline to mean "explicit call" instruction. > Messing with that is going to cause all kinds of unexpected bugs.

The thing is, that was never the case: https://godbolt.org/z/rix62q If you want that, we need a different attribute (in IR).

I understand people want to be compatible with GCC or have certain expectations. However, we should not solve this by inconsistent descriptions/interpretations of the IR. As shown above, we always removed calls to noinline functions even if the result was used. Not to mention if the result is unused and the function is const. That we now propagate arguments through the call is the natural evolution from this year long precedence.

In https://reviews.llvm.org/D75815#1976892 we talk about noipa for the IR, similarly, you could do optnone on the call instruction. Either way, if you want to prevent optimization across call edges, we need to provide a different/new attribute to represent that (or use optnone for the function).

From an end-user perspective, appearing to inline the function while it is marked as 'noinline' would be rather suprising. Is there any usefulness to the noinline attribute in that case at all?

noinline will, as described by GCC, prevent inlining. If we now say it also does "imply" X and Y, people that want only to prevent inlining might complain. It is also not defined that way so we play whack-a-mole with optimizations and analysis that interpret the meaning based on the definition we use (https://llvm.org/docs/LangRef.html#function-attributes). That said, if we need an attribute to describe something similar to noinline we should create one and use it.

Then again, if you need to build software that compiles with both clang and gcc, you will likely be forced to use the barriers anyway.

The (particular) asm barrier might or might not prevent us from doing the IPO. For now it would probably work.

DimitryAndric commented 4 years ago

Hm reading jdoerfert's comments on D75815 and the surrounding text, it seems that indeed the 'noipa' attribute is more likely covering this use case. But TBH I had not heard of this attribute before, maybe it is not that well-known?

At least gcc also supports it, but in their documentation they mention "This attribute is supported mainly for the purpose of testing the compiler". So no idea how production-ready it is.

nickdesaulniers commented 4 years ago

This kind of thing is going to get Clang banned from use in the Linux kernel: https://godbolt.org/z/M2g4ZJ

We need the semantics of noinline to mean "explicit call" instruction. Messing with that is going to cause all kinds of unexpected bugs.

For example, noinline is used extensively in the kernel to limit stack usage, as the kernel only has 2 pages worth of stack to work with.

The linux kernel has 1696 uses of noinline today. I'm sure a vast majority of them have very good reasons to do so.

DimitryAndric commented 4 years ago

they also specify a workaround to get the desired effect, which is (at least from my point of view) to force the function to always be called, and never inlined

What I meant by this was that sometimes end-users can want this, for various reasons. And instead of using the rather horrid method of inserting barriers, an official attribute specifying this (reallydontinline?) would be nicer. Then again, if you need to build software that compiles with both clang and gcc, you will likely be forced to use the barriers anyway.

DimitryAndric commented 4 years ago

From an end-user perspective, appearing to inline the function while it is marked as 'noinline' would be rather suprising. Is there any usefulness to the noinline attribute in that case at all?

That said, looking at gcc's documentation here: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-noinline-function-attribute

they are saying:

This function attribute prevents a function from being considered for inlining. If the function does not have side effects, there are optimizations other than inlining that cause function calls to be optimized away, although the function call is live.

Unfortunately they don't really say what those other optimizations are, but they also specify a workaround to get the desired effect, which is (at least from my point of view) to force the function to always be called, and never inlined:

To keep such calls from being optimized away, put

asm ("");

(see Extended Asm) in the called function, to serve as a special side effect.

jdoerfert commented 4 years ago

I'd argue there is n obug. This function is not inlined, noinline is respected.

noinline doesn't mean "block all optimizations", it only says "don't move code out of this function"

+1

See https://reviews.llvm.org/D75815#1976892


Some examples that might be interesting:

__attribute((noinline))
char *demo(char *s) {
    return s;
}

void foo()  {
    demo("input string");
}

would you argue there must be a call to demo in foo? I would say no.

__attribute((noinline))
char *demo(char *s) {
    return NULL;
}

void foo()  {
    return demo("input string");
}

I would again not expect a call.

(Without testing it, I assume both of the above should have been optimized before the returned argument patch.)

nickdesaulniers commented 4 years ago

Sounds like issues related to noinline were raised there: https://reviews.llvm.org/D75815#1941004

DimitryAndric commented 4 years ago

Looked again, and it is definitely regressed with https://reviews.llvm.org/rG03727687766a ("[InstCombine] Simplify calls with "returned" attribute") by Nikita Popov.

With clang llvmorg-11-init-06295-g9cf920e64d1:

    .text
    .file   "pr46463.c"
    .globl  demo                    # -- Begin function demo
    .p2align    4, 0x90
    .type   demo,@function
demo:                                   # @demo
.Ldemo$local:
# %bb.0:                                # %entry
    movq    %rdi, %rax
    retq
.Lfunc_end0:
    .size   demo, .Lfunc_end0-demo
                                        # -- End function
    .globl  foo                     # -- Begin function foo
    .p2align    4, 0x90
    .type   foo,@function
foo:                                    # @foo
.Lfoo$local:
# %bb.0:                                # %entry
    movl    $.L.str, %edi
    jmp .Ldemo$local            # TAILCALL
.Lfunc_end1:
    .size   foo, .Lfunc_end1-foo
                                        # -- End function
    .type   .L.str,@object          # @.str
    .section    .rodata.str1.1,"aMS",@progbits,1
.L.str:
    .asciz  "input string"
    .size   .L.str, 13

    .ident  "clang version 11.0.0 (https://github.com/llvm/llvm-project.git 9cf920e64d18e9c64706c8c8baf71a4919dcbb42)"
    .section    ".note.GNU-stack","",@progbits
    .addrsig

With clang llvmorg-11-init-06296-g03727687766:

    .text
    .file   "pr46463.c"
    .globl  demo                    # -- Begin function demo
    .p2align    4, 0x90
    .type   demo,@function
demo:                                   # @demo
.Ldemo$local:
# %bb.0:                                # %entry
    movq    %rdi, %rax
    retq
.Lfunc_end0:
    .size   demo, .Lfunc_end0-demo
                                        # -- End function
    .globl  foo                     # -- Begin function foo
    .p2align    4, 0x90
    .type   foo,@function
foo:                                    # @foo
.Lfoo$local:
# %bb.0:                                # %entry
    movl    $.L.str, %eax
    retq
.Lfunc_end1:
    .size   foo, .Lfunc_end1-foo
                                        # -- End function
    .type   .L.str,@object          # @.str
    .section    .rodata.str1.1,"aMS",@progbits,1
.L.str:
    .asciz  "input string"
    .size   .L.str, 13

    .ident  "clang version 11.0.0 (https://github.com/llvm/llvm-project.git 03727687766a72504712861bf038f0be962527d0)"
    .section    ".note.GNU-stack","",@progbits
    .addrsig
DimitryAndric commented 4 years ago

It appears to have regressed with https://reviews.llvm.org/rG8903e61b6611 ("[AsmPrinter][ELF] Define local aliases (.Lfoo$local) for GlobalObjects"), by Fangrui Song.

Eh sorry, this was a bisection mistake on my part. This commit introduces a .Lfoo$local label, but does not yet eliminate the tail call. Looking further.

DimitryAndric commented 4 years ago

It seems like this was a recent behaviour change as clang 10.0 doesn't show the problem.

It appears to have regressed with https://reviews.llvm.org/rG8903e61b6611 ("[AsmPrinter][ELF] Define local aliases (.Lfoo$local) for GlobalObjects"), by Fangrui Song.

I guess it is an unintended side effect? Whatever the cause, it would be nice to have it fixed before 11 branches.

jrmuizel commented 4 years ago

clang trunk: https://gcc.godbolt.org/z/tpJsOm It seems like this was a recent behaviour change as clang 10.0 doesn't show the problem.

DimitryAndric commented 4 years ago

Which version of clang is this? With the latest 10.0.1 off the branch:

$ clang -fno-asynchronous-unwind-tables -fomit-frame-pointer -O2 -S pr46463.c -o -
        .text
        .file   "pr46463.c"
        .globl  demo                    # -- Begin function demo
        .p2align        4, 0x90
        .type   demo,@function
demo:                                   # @demo
# %bb.0:                                # %entry
        movq    %rdi, %rax
        retq
.Lfunc_end0:
        .size   demo, .Lfunc_end0-demo
                                        # -- End function
        .globl  foo                     # -- Begin function foo
        .p2align        4, 0x90
        .type   foo,@function
foo:                                    # @foo
# %bb.0:                                # %entry
        movl    $.L.str, %edi
        jmp     demo                    # TAILCALL
.Lfunc_end1:
        .size   foo, .Lfunc_end1-foo
                                        # -- End function
        .type   .L.str,@object          # @.str
        .section        .rodata.str1.1,"aMS",@progbits,1
.L.str:
        .asciz  "input string"
        .size   .L.str, 13

        .ident  "FreeBSD clang version 10.0.1 (git@github.com:llvm/llvm-project.git llvmorg-10.0.0-97-g6f71678ecd2)"
        .section        ".note.GNU-stack","",@progbits
        .addrsig

So this tail-calls the demo function.

LebedevRI commented 4 years ago

noinline doesn't mean "block all optimizations", it only says "don't move code out of this function"