dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
14.94k stars 4.64k forks source link

Redundant moves with small values in some cases #102824

Open am11 opened 3 months ago

am11 commented 3 months ago

For < 32-bit values (byte and ushort), there seems to be redundant moves in codegen:

    [MethodImpl(MethodImplOptions.NoInlining)]
    public bool M1(byte first, byte second)
        => first < second;

    [MethodImpl(MethodImplOptions.NoInlining)]
    public bool M2(ushort first, ushort second)
        => first < second;

    [MethodImpl(MethodImplOptions.NoInlining)]
    public bool M3(uint first, uint second)
        => first < second;

    [MethodImpl(MethodImplOptions.NoInlining)]
    public bool M4(ulong first, ulong second)
        => first < second;

gives:

M1
         movzx    rax, sil
         movzx    rcx, dl
         cmp      eax, ecx
         setl     al
         movzx    rax, al

M2
         movzx    rax, si
         movzx    rcx, dx
         cmp      eax, ecx
         setl     al
         movzx    rax, al

M3
         cmp      esi, edx
         setb     al
         movzx    rax, al

M4
         cmp      rsi, rdx
         setb     al
         movzx    rax, al

Expected

M1
         cmp      dl, r8b
         setb     al
         movzx    eax, al

M2
         cmp      dx, r8w
         setb     al
         movzx    eax, al
dotnet-policy-service[bot] commented 3 months ago

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch See info in area-owners.md if you want to be subscribed.

h3xds1nz commented 3 months ago

The codegens with zero extensions are common, remember C# doesn't have the operators defined for less than Int32, there's always been binary promotion to 32bit number, hence why you get setl on the promoted numbers.

On that note, however, what you really want is actually instead of the setb al -> movzx rax, al (which really should be movzx eax, al at least - and I've just checked, JIT does generate properly movzx eax, al for me in those cases? + on SharpLab)

xor eax, eax
<compare>
setb al

That means you'd have to use r8d f.e. for extension in this case, which are two REX prefixes sure but it would work out for the better in the case of no inlining — but that's very specific.