Quuxplusone / LLVMBugzillaTest

Add support for inlining through musttail thunks #22287

Open Quuxplusone opened 9 years ago

Quuxplusone commented 9 years ago
Bugzilla Link PR22288
Status NEW
Importance P normal
Reported by Reid Kleckner (rnk@google.com)
Reported on 2015-01-21 17:33:31 -0800
Last modified on 2018-04-04 17:40:31 -0700
Version trunk
Hardware PC Windows NT
CC codeman.consulting@gmail.com, david.majnemer@gmail.com, echristo@gmail.com, llvm-bugs@lists.llvm.org, peter@pcc.me.uk
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also

Missing optimization request extracted from
http://llvm.org/bugs/show_bug.cgi?id=20944#c10:

In the future, we should teach the inliner how to inline through thunks like
this:

define double @f(i8* %a, i32 %b, i32 %c) {
entry:
  %call = call x86_thiscallcc double bitcast (void (i8*, ...)* @thunk to double (i8*, i32, i32)*)(i8* %a, i32 %b, i32 %c)
  ret double %call
}
define linkonce_odr x86_thiscallcc void @thunk(i8* %this, ...) #0  {
entry:
  %0 = bitcast i8* %this to void (i8*, ...)**
  %1 = load void (i8*, ...)** %0
  musttail call x86_thiscallcc void (i8*, ...)* %1(i8* %this, ...)
  ret void
}
attributes #0 = { "thunk" }

--- New IR ---

define double @f(i8* %a, i32 %b, i32 %c) {
entry:
  %0 = bitcast i8* %a to void (i8*, ...)**
  %1 = load void (i8*, ...)** %0
  %2 = bitcast void (i8*, ...)* %1 to double (i8*, i32, i32)* ; new bitcast
  %3 = tail call x86_thiscallcc double %2(i8* %a, i32 %b, i32 %c) ; fill in missing parameters and use the result of the expected type; keep the thunk's calling convention
  ret double %3
}

Basically, push the bitcast on the function prototype through to the musttail
call sites. The musttail call site will always have a prototype matching the
thunk, so this shouldn't require instcombine-like bitcast insertion logic; it's
just a bitcast of the function prototype. Follow-on optimizations can clean up
the cast.

Quuxplusone commented 7 years ago

I'd like to take a shot at this one. It's been a while since I wrote any pass code, but I understand the construction methods for that type of structure well. Should I just assign myself the bug?

Quuxplusone commented 7 years ago

Go for it. :) If you need help, I'm on IRC and others may be able to help.

Quuxplusone commented 6 years ago

Apologies for being AFK: I had the nasty flu that's been going around, then the holidays, etc. I'll be looking into this more if it hasn't been resolved in the meantime.

Quuxplusone commented 6 years ago

There was some recent progress on inlining variadic functions that don't call va_start. It's almost what we need; we just need to treat "thunks" specially.
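
For anyone unfamiliar with the case mentioned above, here is a minimal C++ sketch of a variadic function that never calls va_start. The names first_plus_one and caller are hypothetical and exist only for illustration. Because the body never touches the variable arguments, nothing in it depends on how they were passed, which is what makes this kind of function a candidate for inlining. Whether a particular LLVM version actually inlines it is not verified here.

```cpp
#include <cassert>

// Hypothetical example: variadic in its signature, but the body never calls
// va_start, so the extra arguments are simply ignored.
static int first_plus_one(int first, ...) { return first + 1; }

// The call site passes an extra argument that the callee never reads.
static int caller(int x, int y) { return first_plus_one(x, y); }
```

A thunk differs from this in that it forwards the unnamed arguments to another function through a musttail call, and plain C++ has no direct way to express that forwarding.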

Quuxplusone commented 6 years ago

Here's an alternative way to phrase the missed optimization with less pointer-y
code:

define i32 @call_thunk(i32 %x, i32 %y) {
  %r = call i32 (i32, i32) bitcast (void (i32, ...)* @inc_first_arg_thunk to i32 (i32, i32)*)(i32 %x, i32 %y)
  ret i32 %r
}

define internal void @inc_first_arg_thunk(i32 %arg1, ...) #0 {
entry:
  %inc = add i32 1, %arg1
  musttail call void (i32, ...) bitcast (i32 (i32,i32)* @plus to void (i32, ...)*)(i32 %inc, ...)
  ret void
}

define internal i32 @plus(i32 %x, i32 %y) {
  %r = add i32 %x, %y
  ret i32 %r
}

attributes #0 = { "thunk" }

Inlining should produce:

define i32 @call_thunk(i32 %x, i32 %y) {
  %x1 = add i32 %x, 1
  %r = add i32 %x1, %y
  ret i32 %r
}
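
As a rough sanity check of the before/after pair above, the same shape can be modeled in C++. This is only an analogy and rests on some assumptions: a template parameter pack stands in for the IR's `...` forwarding (C++ cannot express a musttail varargs forward directly), plus_impl stands in for @plus, and call_thunk_inlined is a hypothetical name for the hand-written expected result. It says nothing about how the inliner itself would be implemented.

```cpp
#include <cassert>

// Stand-in for @plus.
static int plus_impl(int x, int y) { return x + y; }

// Stand-in for @inc_first_arg_thunk: adjust the first argument, then forward
// the remaining arguments untouched. The parameter pack is only an analogy
// for `musttail call ... (i32 %inc, ...)` in the IR.
template <typename... Rest>
static int inc_first_arg_thunk(int arg1, Rest... rest) {
  int inc = arg1 + 1;
  return plus_impl(inc, rest...);
}

// "Before": the caller goes through the thunk.
static int call_thunk(int x, int y) { return inc_first_arg_thunk(x, y); }

// "After": what inlining through the thunk and @plus should reduce to.
static int call_thunk_inlined(int x, int y) {
  int x1 = x + 1;
  return x1 + y;
}
```

Both versions compute (x + 1) + y, which is the equivalence the inliner would need to preserve when it pushes the prototype bitcast through the musttail call.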