dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Methods with struct parameters are not inlined #53783

Open atynagano opened 3 years ago

atynagano commented 3 years ago

Description

Observed on x64 in SharpLab:

https://sharplab.io/#v2:EYLgxg9gTgpgtADwGwBYA0AXEBDAzgWwB8ABABgAJiBGAOgCUBXAOwwEt8YaBhCfAB1YAbGFADKIgG6swMXAG4AsAChlxAEzku5AN7Ly5PfsopyAMSoAKAGaCI2DOSsBKcgF4AfGbXWnipUeMva1t7RxcPMwBmHz8A4hNTaJs7B2c3TwBZGIDDfVzyAG0MmAwACwgAEwBJfkELYrLKmr5BAHk+NggmXBoAOQgqpkFWJhGAcycAXXz48izk0LTtcgBfZRWgA=

The following:

using System.Runtime.CompilerServices;

class C {

    void F1(float f) => F2(f);
    void F2(float f) => F3(f);
    void F3(float f) => M(f);    

    [MethodImpl(MethodImplOptions.NoInlining)]
    void M(float f) { }
}

is compiled as:

; Core CLR v5.0.621.22011 on amd64

C..ctor()
    L0000: ret

C.F1(Single)
    L0000: vzeroupper
    L0003: jmp C.M(Single)

C.F2(Single)
    L0000: vzeroupper
    L0003: jmp C.M(Single)

C.F3(Single)
    L0000: vzeroupper
    L0003: jmp C.M(Single)

C.M(Single)
    L0000: ret

With a primitive parameter, the calls are inlined, and F1, F2, and F3 all compile to the same code.

https://sharplab.io/#v2:EYLgxg9gTgpgtADwGwBYA0AXEBDAzgWwB8ABABgAJiBGAOgCUBXAOwwEt8YaBhCfAB1YAbGFADKIgG6swMXAG4AsAChlxAEzku5AN7Ly+8noO4MUBmAzkAYoIjYM2ygGZyAM1v3yANWyCGMOXIAXyN9UMoUayoAChs7S1cASnIAXgA+azVopMUlAwjM2I8E5PTrJ2zE3PziSKsKuM8k1IyAWUrAg3DwgG1WmAwACwgAEwBJfkFo/qHRib5BAHk+NggmXBoAOQgxpkFWJgOAc0SAXXDa8nbGkp1g5SCgA

But this sample:

using System.Runtime.CompilerServices;

class C {

    struct Float{ public float Value; }

    void F1(Float f) => F2(f);
    void F2(Float f) => F3(f);
    void F3(Float f) => M(f);    

    [MethodImpl(MethodImplOptions.NoInlining)]
    void M(Float f) { }
}

is compiled as:

; Core CLR v5.0.621.22011 on amd64

C..ctor()
    L0000: ret

C.F1(Float)
    L0000: push rax
    L0001: mov [rsp+0x18], rdx
    L0006: mov edx, [rsp+0x18]
    L000a: mov [rsp], edx
    L000d: mov edx, [rsp]
    L0010: add rsp, 8
    L0014: jmp C.M(Float)

C.F2(Float)
    L0000: push rax
    L0001: mov [rsp+0x18], rdx
    L0006: mov edx, [rsp+0x18]
    L000a: mov [rsp], edx
    L000d: mov edx, [rsp]
    L0010: add rsp, 8
    L0014: jmp C.M(Float)

C.F3(Float)
    L0000: mov [rsp+0x10], rdx
    L0005: mov edx, [rsp+0x10]
    L0009: jmp C.M(Float)

C.M(Float)
    L0000: ret

F1 and F2 are longer than F3 and appear to use stack space unnecessarily. All three methods should behave identically, so why do they compile differently?

category:cq theme:inlining skill-level:intermediate cost:medium impact:medium

EgorBo commented 3 years ago

Should be inlined with https://github.com/dotnet/runtime/pull/52708

EgorBo commented 3 years ago

Oops, I misinterpreted the issue. It's not inlining-related, since the calls in F1 and F2 are inlined just fine. So it's struct-related; my guess is that structs with float fields are not promotable in the JIT yet.
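
One way to probe that guess is to put the issue's single-float struct next to a control struct of the same size that holds an int, and compare the generated code for both call chains. The sketch below is only an illustration under stated assumptions: `PromotionProbe`, `FloatBox`, `IntBox`, `I1`-`I3`, `N`, and `Run` are hypothetical names that do not appear in the issue, and `M` and `N` return the field (unlike the issue's empty `M`) so that the program has an observable result. It makes no claim about what the JIT will actually emit.

```csharp
using System;
using System.Runtime.CompilerServices;

public class PromotionProbe
{
    // Same shape as the Float struct from the issue: a single float field.
    public struct FloatBox { public float Value; }

    // Hypothetical control case: same size, but an integer field.
    public struct IntBox { public int Value; }

    // Call chain from the issue, taking the float-wrapping struct.
    float F1(FloatBox f) => F2(f);
    float F2(FloatBox f) => F3(f);
    float F3(FloatBox f) => M(f);

    // NoInlining keeps the final call visible, as in the original repro.
    [MethodImpl(MethodImplOptions.NoInlining)]
    float M(FloatBox f) => f.Value;

    // Identical chain, taking the int-wrapping struct for comparison.
    int I1(IntBox i) => I2(i);
    int I2(IntBox i) => I3(i);
    int I3(IntBox i) => N(i);

    [MethodImpl(MethodImplOptions.NoInlining)]
    int N(IntBox i) => i.Value;

    // Runs both chains once so each method is actually called and JIT-compiled.
    public static (float, int) Run()
    {
        var probe = new PromotionProbe();
        return (probe.F1(new FloatBox { Value = 1.5f }),
                probe.I1(new IntBox { Value = 42 }));
    }

    public static void Main()
    {
        var (f, i) = Run();
        Console.WriteLine($"{f} {i}");
    }
}
```

Pasting the class into SharpLab, as the issue did, and comparing the assembly for `F1`/`F2` against `I1`/`I2` would show whether the extra stack traffic depends on the field being a float or appears for any single-field struct.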

hez2010 commented 3 years ago

I think it's https://github.com/dotnet/runtime/issues/43867