dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
14.99k stars 4.66k forks source link

Inliner: don't give up on virtual calls if we can clarify their exact targets after inlining #85209

Open EgorBo opened 1 year ago

EgorBo commented 1 year ago
class ClassA
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public virtual void SayHello() => Console.WriteLine("Hello");
}

sealed class ClassB : ClassA {}

// Signature is abstract, but it always returns an exact class
ClassA GetClassA() => new ClassB();

void Test()
{
    GetClassA().SayHello(); // will SayHello() be inlined here?
}

Current codegen for Test():

; Assembly listing for method Prog:Test():this
       sub      rsp, 40
       mov      rcx, 0x7FF8DBB2FD60      ; Prog+ClassB
       call     CORINFO_HELP_NEWSFAST
       mov      rcx, rax
       call     [Prog+ClassA:SayHello():this]
       nop      
       add      rsp, 40
       ret      
; Total bytes of code 34

SayHello is not inlined despite being devirtualized and marked as AggressiveInlining. This demonstrates a fundamental problem in the JIT's inliner - it does inlining in, roughly, two stages: 1) Traverse all calls and set inline candidates + spill candidates to be top-level statements. 2) Inline all candidates we previosly found. During the first stage we give up on SayHello because the target is abstract (ClassA) - we don't know yet that if we inline GetClassA we'll get the exact target (ClassB). What is funny that Dynamic PGO can help us here - it will try to inline it as ClassB under a guard (GDV) and then we'll fold the guard once we clarify the target (https://github.com/dotnet/runtime/pull/84661), so the same codegen when PGO is enabled looks like this:

; Assembly listing for method Prog:Test():this
       sub      rsp, 40
       mov      rcx, 0x23780209218      ; 'Hello'
       call     [System.Console:WriteLine(System.String)]
       nop      
       add      rsp, 40
       ret      
; Total bytes of code 26

No guards, call is inlined!

We should be able to do the same without PGO too (e.g. for NativeAOT). This limitation is the exact reason why APIs like Encoding.UTF8.X, Encoding.ASCII.X, ArrayPool.Shared.X etc are never inlined without PGO.

cc @AndyAyersMS (we discussed this case today in Discord)

ghost commented 1 year ago

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch See info in area-owners.md if you want to be subscribed.

Issue Details
```csharp class ClassA { [MethodImpl(MethodImplOptions.AggressiveInlining)] public virtual void SayHello() => Console.WriteLine("Hello"); } sealed class ClassB : ClassA {} // Signature is abstract, but it always returns an exact class ClassA GetClassA() => new ClassB(); void Test() { GetClassA().SayHello(); // will SayHello() be inlined here? } ``` Current codegen for `Test()`: ```asm ; Assembly listing for method Prog:Test():this sub rsp, 40 mov rcx, 0x7FF8DBB2FD60 ; Prog+ClassB call CORINFO_HELP_NEWSFAST mov rcx, rax call [Prog+ClassA:SayHello():this] nop add rsp, 40 ret ; Total bytes of code 34 ``` `SayHello` is not inlined despite being devirtualized and marked as `AggressiveInlining`. This demonstrates a fundamental problem in the JIT's inliner - it does inlining in, roughly, two stages: 1) Traverse all calls and set inline candidates + spill candidates to be top-level statements. 2) Inline all candidates we previosly recorded. During the first stage we give up on `SayHello` because the target is abstract (`ClassA`) - we don't know yet that if we inline `GetClassA` we'll get the exact target (`ClassB`). What is funny that Dynamic PGO can help us here - it will try to inline it as `ClassB` under a guard (GDV) and then we'll fold the guard once we clarify the target (https://github.com/dotnet/runtime/pull/84661), so the same codegen when PGO is enabled looks like this: ```asm ; Assembly listing for method Prog:Test():this sub rsp, 40 mov rcx, 0x23780209218 ; 'Hello' call [System.Console:WriteLine(System.String)] nop add rsp, 40 ret ; Total bytes of code 26 ``` No guards, call is inlined! We should be able to do the same without PGO too (e.g. for NativeAOT). This limitation is the exact reason why APIs like `Encoding.UTF8.X`, `Encoding.ASCII.X`, `ArrayPool.Shared.X` etc are never inlined without PGO. cc @AndyAyersMS
Author: EgorBo
Assignees: -
Labels: `area-CodeGen-coreclr`
Milestone: -
MichalPetryka commented 1 year ago

This also affects #85197.