dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
14.98k stars 4.66k forks source link

Inconsistent reg assignment for tailcall Stub Addr Arg #10690

Closed CarolEidt closed 4 years ago

CarolEidt commented 6 years ago

fgMorphTailCall() puts REG_R11 on the node that is created by Compiler::fgGetStubAddrArg(), but in Lowering the PUTARG_REG for this node puts the value in REG_R8. This results in the following in codegen:

Generating: N071 (  3, 10) [000028] ------------        t28 =    CNS_INT(h) long   0x7ffa8aa20020 ftn REG r11 $c2
IN0010:        mov      r11, 0x7FFA8AA20020
                                                             /--*  t28    long   
Generating: N073 (???,???) [000063] ------------        t63 = *  PUTARG_REG long   REG r8
IN0011:        mov      r8, r11

It is apparently the case that r11 is the correct register, but the argEntry gets r8, which is what Lowering uses. As it happens, r11 is still valid at the call, but there is no guarantee that will be the case, and the mov is apparently unnecessary.

CarolEidt commented 6 years ago

This is for the test case JIT\Regression\JitBlue\GitHub_17585.

AndyAyersMS commented 5 years ago

@CarolEidt seems like we should fix this? If you can sketch out what is involved I'll do it.

CarolEidt commented 5 years ago

@AndyAyersMS - I'm not sure exactly where the issue arises. I would even take with a grain of salt my statement above that r11 is the correct register. I would start by looking at why the argEntry for this node is getting r8. I think it's the case that fgMorphTailCall is done prior to fgInitArgInfo and that it is probably fgInitArgInfo where the issue lies, but I'm not certain.

Based on the code here: https://github.com/dotnet/coreclr/blob/master/src/jit/morph.cpp#L2873 I would presume that it is not falling into the IsTailCallViaHelper() case (otherwise it would have used the register assigned by fgMorphTailCall), and therefore it expects that it is a regular parameter, and thinks that r8 is the right register.

@jashook may be able to shed more light on this.

jashook commented 5 years ago

On first glance I do not have much to add other than this seems like it should not be punted out of 3.0. Will try to get a jit dump quickly and see if I can give any insight.

AndyAyersMS commented 5 years ago

Linking to the original issue dotnet/runtime#10162. This is an interface tail call via helper....

; Assembly listing for method Test:CallInterfaceMethod(ref):int
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; optimized code
; rbp based frame
; partially interruptible
; Final local variable assignments
;
;  V00 arg0         [V00,T00] (  4,  4   )     ref  ->  rsi         class-hnd
;  V01 OutArgs      [V01    ] (  1,  1   )  lclBlk (104) [rsp+0x00]   "OutgoingArgSpace"
;
; Lcl frame size = 104

G_M31664_IG01:
       55                   push     rbp
       56                   push     rsi
       4883EC68             sub      rsp, 104
       488D6C2470           lea      rbp, [rsp+70H]
       488BF1               mov      rsi, rcx

G_M31664_IG02:
       833D633F036000       cmp      dword ptr [(reloc)], 0
       7571                 jne      SHORT G_M31664_IG04

G_M31664_IG03:
       C744242001000000     mov      dword ptr [rsp+20H], 1
       C744242802000000     mov      dword ptr [rsp+28H], 2
       C744243003000000     mov      dword ptr [rsp+30H], 3
       C744243804000000     mov      dword ptr [rsp+38H], 4
       C744244005000000     mov      dword ptr [rsp+40H], 5
       C744244806000000     mov      dword ptr [rsp+48H], 6
       C744245007000000     mov      dword ptr [rsp+50H], 7
       C744245808000000     mov      dword ptr [rsp+58H], 8
       C744246009000000     mov      dword ptr [rsp+60H], 9
       448B0E               mov      r9d, dword ptr [rsi]
       4C8BCE               mov      r9, rsi
       48B9A0D09D5DFF7F0000 mov      rcx, 0x7FFF5D9DD0A0
       488B1502BFEEFF       mov      rdx, qword ptr [(reloc)]
       49BB28009E5DFF7F0000 mov      r11, 0x7FFF5D9E0028
       4D8BC3               mov      r8, r11
       E898DB3B5F           call     CORINFO_HELP_TAILCALL

G_M31664_IG04:
       E8C3143C5F           call     CORINFO_HELP_POLL_GC
       EB88                 jmp      SHORT G_M31664_IG03

; Total bytes of code 143, prolog size 11 for method Test:CallInterfaceMethod(ref):int
AndyAyersMS commented 5 years ago

When we tail call via a helper on x64, we have to shuffle the normal call arguments around, and pass the VSD arg as a normal arg. So the routing through R11 is unnecessary here.

That is, for x64, the arglists should be swizzled as follows:

;;; before fgMorphTailCall
(this, arg0, arg1, arg2, ..., argN )   VSD
  RCX,  RDX,  R8,    R9,  ...stack...   R11

;;; after fgMorphTailCall
(&copyFn, calltarget, VSD, this, arg0, arg1, ... argN)
  RCX,     RDX,        R8,  R9,   ... stack ...

I will look into removing the R11 on the stub arg in this case.

x86 handles all this differently. arm32 looks like it would have a similar issue as x64 but routing via R4. arm64 does not handle tail calls via helpers yet.