dotnet / runtime

Suboptimal codegen for DateTime.IsLeapYear #13094

Open EgorBo opened 5 years ago

EgorBo commented 5 years ago

I was looking for opportunities to improve the performance of DateTime.UtcNow and noticed that the JIT doesn't apply magic-number tricks or peephole optimizations to improve the codegen for this:

    bool IsLeapYear(int year)
    {
        if (year > 0 && year < 9999)
            return year % 4 == 0 && (year % 100 != 0 || year % 400 == 0);

        return false; // exception
    }

Currently generates (tier1):

; Method Program:IsLeapYear(int):bool:this

G_M62763_IG01:
       mov      ecx, edx

G_M62763_IG02:
       test     ecx, ecx
       jle      SHORT G_M62763_IG08
       cmp      ecx, 0x270F
       jge      SHORT G_M62763_IG08
       mov      eax, ecx
       sar      eax, 31
       and      eax, 3
       add      eax, ecx
       and      eax, -4
       mov      edx, ecx
       sub      edx, eax
       jne      SHORT G_M62763_IG06
       mov      edx, 0xD1FFAB1E
       mov      eax, edx
       imul     edx:eax, ecx
       mov      eax, edx
       shr      eax, 31
       sar      edx, 5
       add      eax, edx
       imul     eax, eax, 100
       mov      edx, ecx
       sub      edx, eax
       jne      SHORT G_M62763_IG04
       mov      edx, 0xD1FFAB1E
       mov      eax, edx
       imul     edx:eax, ecx
       mov      eax, edx
       shr      eax, 31
       sar      edx, 7
       add      eax, edx
       imul     eax, eax, 400
       sub      ecx, eax
       sete     al
       movzx    rax, al

G_M62763_IG03:
       ret      

G_M62763_IG04:
       mov      eax, 1

G_M62763_IG05:
       ret      

G_M62763_IG06:
       xor      eax, eax

G_M62763_IG07:
       ret      

G_M62763_IG08:
       xor      eax, eax

G_M62763_IG09:
       ret      
; Total bytes of code: 107

And here is the difference between what GCC and RyuJIT generate: [image]

See godbolt.

category:cq theme:basic-cq skill-level:expert cost:medium impact:medium

EgorBo commented 5 years ago

Btw, RyuJIT is able to optimize X % C == 0 to (X & (C-1)) == 0 (for a power-of-two C) only for unsigned X; it should work for signed types too.
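
To illustrate why the mask form should also be valid for signed operands: in two's complement, a value is a multiple of a power of two exactly when its low bits are zero, regardless of sign. A minimal sketch follows; the class and method names are made up for illustration and are not part of the runtime.

```csharp
using System;

// Hypothetical names, for illustration only.
static class PowerOfTwoModSketch
{
    // Reference form: signed remainder compared against zero.
    public static bool IsMultipleOf4ViaRemainder(int x) => x % 4 == 0;

    // Mask form: only the two low bits are inspected. In two's complement
    // they are zero exactly when x is a multiple of 4, for negative x as well.
    public static bool IsMultipleOf4ViaMask(int x) => (x & 3) == 0;
}
```

The two forms differ only when the remainder's value is used (for example, -1 % 4 is -1 while -1 & 3 is 3); for a comparison against zero they agree.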

RussKeldorph commented 5 years ago

@dotnet/jit-contrib

EgorBo commented 5 years ago

Partly addressed in https://github.com/dotnet/coreclr/pull/25744

mikedn commented 5 years ago

Interesting, I don't remember seeing the x % 100 == 0 transform before. I guess it's a variation on magic division.

EgorBo commented 4 years ago

> Interesting, I don't remember seeing the x % 100 == 0 transform before. I guess it's a variation on magic division.

I have a demo for JIT: https://github.com/EgorBo/runtime-1/commit/705d3cb1d9bf2c5d2ead7487c1a6104e1a4d6bc8

It's a straight port of https://reviews.llvm.org/D65366
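
For readers unfamiliar with the x % 100 == 0 transform discussed above, here is a rough sketch of the general multiply-and-compare divisibility test for the unsigned case. It is not taken from the linked commit or the LLVM patch and may differ from them in detail; the class, method, and constant names are hypothetical. The idea: write 100 as 25 * 4, multiply by the modular inverse of 25 (mod 2^32), rotate right by 2 to account for the factor of 4, and compare against floor((2^32 - 1) / 100).

```csharp
using System;

// Hypothetical names, for illustration only.
static class DivisibilitySketch
{
    // Multiplicative inverse of 25 modulo 2^32: 25 * 0xC28F5C29 == 1 (mod 2^32).
    private const uint Inverse25 = 0xC28F5C29u;

    // Largest quotient a 32-bit multiple of 100 can have.
    private const uint MaxQuotient = uint.MaxValue / 100;

    public static bool IsMultipleOf100(uint x)
    {
        // Multiplying by the inverse "divides" by 25 exactly when x is a
        // multiple of 25; other inputs map to large values.
        uint product = unchecked(x * Inverse25);

        // Rotate right by 2: if the two low bits are not zero (x is not a
        // multiple of 4), they move to the top and make the result large.
        uint rotated = (product >> 2) | (product << 30);

        // Multiples of 100 land in [0, MaxQuotient]; everything else is above.
        return rotated <= MaxQuotient;
    }
}
```

Since the year in IsLeapYear is already range-checked to be positive, casting it to uint before such a test would be safe; signed inputs in general need extra handling that this sketch does not cover.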