llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.47k stars 11.77k forks source link

Too strong validity checks for [[clang::musttail]] #54964

Open davidbolvansky opened 2 years ago

davidbolvansky commented 2 years ago
int base(int);

int ok1() { return base(5); }
int ok2(int x, int y /* unused */) { return base(x); }
ok1():                                # @ok1()
        mov     edi, 5
        jmp     base(int)@PLT                    # TAILCALL
ok2(int, int):                               # @ok2(int, int)
        jmp     base(int)@PLT                    # TAILCALL

So clearly LLVM can apply tail call optimization for these cases. But Clang does not think so and emits errors for these cases:

int f1() { [[clang::musttail]] return base(5); }
int f2(int x, int y /* unused */) { [[clang::musttail]] return base(5); }

error: cannot perform a tail call to function 'base' because its signature is incompatible with the calling function int f1() { [[clang::musttail]] return base(5); } ^ error: cannot perform a tail call to function 'base' because its signature is incompatible with the calling function int f2(int x, int y / unused /) { [[clang::musttail]] return base(5); }

https://godbolt.org/z/7h6h1YjWT

davidbolvansky commented 2 years ago

cc @haberman cc @efriedma-quic

Maybe Clang should just drop signature check?

efriedma-quic commented 2 years ago

In general, we need to restrict the cases where we musttail: C calling convention rules limit the stack space available for tail calls, so we can't tail-call in the general case using the normal C calling convention. To ensure we have enough stack space, and that we can satisfy any other unusual conditions, we check that the signature strictly matches. This is why both clang and LLVM IR restrict musttail to those cases.

I guess there are a few things we could do:

  1. Expose tailcc in clang, and relax the [[musttail]] rules for calls to such functions.
  2. Try to come up with some rule that covers your specific case: allow tail-calling a function where the called function has fewer arguments, but the signature otherwise matches, or something like that. That probably works for most targets. There might be edge cases here I'm not thinking about.
  3. Add some sort of [[clang::nonportable_musttail]] that makes the check to do some target-specific computation.
haberman commented 2 years ago

Yes, in general the rules around musttail are intended to provide assurances that the tail call can be optimized on all targets, not just one.

When I implemented musttail for Clang, I followed the existing rules that were previously established for LLVM IR. The rules would need to be loosed at both the Clang and LLVM levels (or we could expose tailcc as suggested above).

Note that even the existing musttail rules are not stringent enough to support all backends: https://github.com/llvm/llvm-project/issues/50758

Loosening the rules would probably create more cases where we see these kinds of crashes. In other words, the existing attribute is already not fully portable.

davidbolvansky commented 2 years ago

musttail are intended to provide assurances that the tail call can be optimized on all targets

but there is no such ‘promise’ in https://clang.llvm.org/docs/AttributeReference.html and users may not need this additional assurance, which sadly, restricts this feature.

Maybe we could have musttail and portable_musttail or introduce optional argument to musttail like ´nonportable’.

I linked above link to CTRE project where they would like to use this feature but they cannot (and they consider current behaviour rather strange, and I agree).

davidbolvansky commented 2 years ago

In other words, the existing attribute is already not fully portable.

This is right - best effort feature, but no strict promises, so some obvious cases could be diagnosed by Clang and anything else by backend IMHO.

jacobsa commented 1 year ago

Here is another example where I suspect the check is too stringent in a similar way as the original example:

struct Foo {
  __attribute__((noinline)) int foo(int) { return 0; }
};

struct Bar {
  Foo foo;

  int bar(int);
};

int Bar::bar(const int x) {
  return foo.foo(x);
}

A tail call works just fine (compiler explorer) on the targets I've checked. For example, x86-64:

Bar::bar(int):                          # @Bar::bar(int)
        jmp     Foo::foo(int)                   # TAILCALL

But adding [[clang::musttail]] results in an error. As far as I know all of the discussion above applies to this too, so just noting it to use as a potential test case whenever an outcome is decided.

Personally I would like the semantics "give me an error if this can't be a tail call on this target", not the semantics "give me an error if this can't be a tail call on all targets clang supports". I suspect most people would be looking for the former, since most people care about nearly none of the supported targets by volume.

davidbolvansky commented 1 year ago

Personally I would like the semantics "give me an error if this can't be a tail call on this target", not the semantics "give me an error if this can't be a tail call on all targets clang supports".

Strong +1, I would love such behaviour as well! So probably we need to come up with new attribute which would implement such behaviour.

davidbolvansky commented 1 year ago

Proposal for [[clang::nonportable_musttail]] https://reviews.llvm.org/D147714

Alcaro commented 1 year ago

I'm just some random guy, so feel free to ignore me, but I would like to see some more tests for that attribute.

In particular, I want some tests that it properly errors out if asked for something impossible, and doesn't (for example) ICE, as in haberman's example https://bugs.llvm.org/show_bug.cgi?id=51416.

int a(int a1, int a2);
int b(int a1)
{
[[clang::nonportable_musttail]]
return a(1, a1); // legal, all args are in regs
}

int c(int a1, int a2, int a3, int a4, int a5, int a6, int a7, int a8, int a9, int a10);
int d(int a1, int a2, int a3, int a4, int a5, int a6, int a7, int a8, int a9)
{
[[clang::nonportable_musttail]]
return c(a1, a2, a3, a4, a5, a6, a7, a8, a9, 1); // expected-error {{can't push an extra argument onto a stack frame that's not ours}}
}

int e(int a1, int a2, int a3, int a4, int a5, int a6, int a7, int a8, int a9, int a10)
{
[[clang::nonportable_musttail]]
return c(1, a1, 3, a4, a5, a6, a7, a8, a9, a10); // legal, only register arguments are changed
}

int f(int a1, int a2, int a3, int a4, int a5, int a6, int a7, int a8, int a9, int a10)
{
[[clang::nonportable_musttail]]
return c(a1, a2, a3, a4, a5, a6, a7, 1, 2, 3); // legal, but Clang does currently not perform that optimization
}

int g(int a1, int a2, int a3, int a4, int a5, int a6, int a7, int a8, int a9, int a10)
{
[[clang::nonportable_musttail]]
return a(1, a2); // legal, x64 sysv abi is caller pop (if there are stack args at all)
}

int h(int a1, int a2, int a3, int a4, int a5, int a6, int a7, int a8, int a9, int a10, int a11)
{
[[clang::nonportable_musttail]]
return c(a1, a2, a3, a4, a5, a6, a7, a8, a9, a10); // legal, same reason as g
}

short i();
int j()
{
[[clang::nonportable_musttail]]
return i(); // expected-error {{upper bits of a int16 return value are garbage on x64 sysv abi, this return must sign extend}}
}

int k();
short l()
{
[[clang::nonportable_musttail]]
return k(); // legal, but Clang does currently not perform that optimization
}

The legal-but-unimplemented ones can be marked expected-error for now with a fixme comment, but I would like to see them in the tests.

davidbolvansky commented 1 year ago

Sounds like tests for LLVM backends.

AaronBallman commented 1 year ago

Sounds like tests for LLVM backends.

Not if it's going to be generating diagnostics, right?

nickdesaulniers commented 1 year ago

Personally I would like the semantics "give me an error if this can't be a tail call on this target", not the semantics "give me an error if this can't be a tail call on all targets clang supports". I suspect most people would be looking for the former, since most people care about nearly none of the supported targets by volume.

Right, perhaps we should change the behavior of the existing attribute to weaken it, and create a new attribute clang::musttail_all_targets or such that has stronger guarantees (and less surprising behavior).

Looking at ARMTargetLowering::IsEligibleForTailCallOptimization (#50758), I honestly don't see how the front end could ever possibly guarantee all of the specific backend requirements for musttail. The function signature check is very very weak IMO, and hinders simple obvious stuff like tail calling with a reduced parameter count. As such, I'd advocate changing the existing semantics of clang::musttail to weaken it, and allow the backend to tell you more precisely why it could not tail call for a given target. Then users would need to think about target portability only when they actually cared about target portability. WDYT?

efriedma-quic commented 1 year ago

Adding features that are target-specific in a non-obvious way is very likely to cause issues for anyone targeting an uncommon platform... writing portable C is hard enough without adding new pitfalls.

As noted at https://github.com/llvm/llvm-project/issues/54964#issuecomment-1101612886, there are solutions that allow portable code without restricting the legal signatures. Do we have some specific use-case in mind?

haberman commented 1 year ago

I want to reiterate that the existing diagnostics for clang::musttail were designed primarily to satisfy LLVM's documented requirements for the LLVM-level musttail attribute. If we weaken the checks in Clang, we will violate the LLVM invariants.

In other words, I think LLVM needs to change first. LLVM would need to relax its checks around musttail: https://github.com/llvm/llvm-project/blob/436758f7b040b77351056a69bb1b097cf7ad8834/llvm/lib/IR/Verifier.cpp#L3589-L3681

This can't be relaxed in Clang alone.

davidbolvansky commented 1 year ago

I am not sure if we can relax existing musttail, probably we would need a new call marker..

davidbolvansky commented 1 year ago

Try to come up with some rule that covers your specific case: allow tail-calling a function where the called function has fewer arguments, but the signature otherwise matches, or something like that.

I am not sure. This would be one rule, plus consider case by @jacobsa , plus X in the future. I believe people here wants best-effort musttail attribute and they are okay in diagnostics coming from individual backends.

jacobsa commented 1 year ago

Yes, to be clear what I want is "give a diagnostic unless you are able to turn this into a tail call on this platform". In other words, I want to be sure the function will not blow the stack, but I don't care (in this case) about portability. Or at least not about portability to all of the exotic platforms that clang supports.

dvmason commented 1 year ago

Coming from the function programming world, let me point out that tail-call elimination is applicable whenever the return type is compatible, and nothing needs to be done with the result other than return it. It shouldn't matter if there are more or fewer parameters or anything else.

That said, I agree with one of the comments above that many use cases would be addressed as long as there are no more parameters in the target function than in the calling function - or more accurately, if I understand the comment correctly, if no additional stack space needs to be allocated.

But right now (in Zig) I have to cast parameters in semantically-incorrect ways in order to get type signatures to match so that I can get tail-calls to work properly. That is ugly and unfortunate.