dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.37k stars 4.75k forks source link

Codegen for some bitflag checks could be optimized #13743

Open GrabYourPitchforks opened 5 years ago

GrabYourPitchforks commented 5 years ago

More info: https://github.com/dotnet/coreclr/pull/27603#discussion_r341567417

Sample code:

public static bool HasHighBitSet(uint value) {
    return (value & 0x8000_0000) != 0;
}

Actual codegen:

    L0000: test ecx, 0x80000000
    L0006: setnz al
    L0009: movzx eax, al
    L000c: ret

Replacing the first two instructions with test ecx, ecx and setl al would save 4 bytes of codegen. It's a minor savings but might help if this pattern occurs within a hot loop.

category:cq theme:basic-cq skill-level:intermediate cost:small

mikedn commented 5 years ago

Replacing the first two instructions with test ecx, ecx and setl al would save 4 bytes of codegen.

Actually this specific case should generate:

mov eax, ecx
shr eax, 31
ret

The test approach is only useful if such code is used as branch condition, then it would generate test and js/jns.

omariom commented 5 years ago

if such code is used as branch condition

The codegen of the inlined method looks good

mikedn commented 5 years ago

The codegen of the inlined method looks good

No, the point is that this could generate test reg, reg instead of test reg, imm to avoid the 32 bit immediate.

GrabYourPitchforks commented 5 years ago

The test approach is only useful if such code is used as branch condition

Yeah, this suggestion was for branch conditions specifically. I thought there was already a tracking issue for more general branchless bit manipulation of Boolean conditions?

mikedn commented 5 years ago

There definitely are other related issues - dotnet/runtime#6815 and dotnet/runtime#7088 come to mind and there may be others. I also have some code written that does some of this, gotta check what state it's in.

EgorBo commented 4 years ago

Possible fix: https://github.com/EgorBo/runtime-1/commit/6aaf47a7b8b37ecfddab04e181206e141e5a15a9 (with help of https://github.com/dotnet/runtime/pull/35627)

public static bool HasHighBitSet(uint value)
{
    return (value & 0x8000_0000) != 0;
}

new asm:

; Method CC:HasHighBitSet(int):bool
       8BC1                 mov      eax, ecx
       C1E81F               shr      eax, 31
       C3                   ret      

pmi jit-diff:

Analyzing CodeSize diffs...
Found 33 files with textual diffs.
PMI CodeSize Diffs for System.Private.CoreLib.dll, framework assemblies for  default jit
Summary of Code Size diffs:
(Lower is better)
Total bytes of diff: -5184 (-0.01% of base)
    diff is an improvement.
Top file improvements (bytes):
       -1101 : System.Private.CoreLib.dasm (-0.02% of base)
        -976 : System.Numerics.Tensors.dasm (-0.40% of base)
        -846 : System.Security.Cryptography.Pkcs.dasm (-0.22% of base)
        -489 : System.Text.Json.dasm (-0.08% of base)
        -400 : System.Memory.dasm (-0.16% of base)
        -270 : System.Security.Cryptography.Algorithms.dasm (-0.08% of base)
        -137 : System.Reflection.Metadata.dasm (-0.03% of base)
        -129 : System.Security.Cryptography.Cng.dasm (-0.07% of base)
        -110 : System.Net.Http.dasm (-0.02% of base)
        -108 : System.Formats.Asn1.dasm (-0.11% of base)
         -99 : System.Formats.Cbor.dasm (-0.21% of base)
         -99 : System.Security.Cryptography.X509Certificates.dasm (-0.06% of base)
         -49 : System.Reflection.MetadataLoadContext.dasm (-0.03% of base)
         -47 : Microsoft.CodeAnalysis.dasm (-0.00% of base)
         -36 : System.Net.WebSockets.dasm (-0.09% of base)
         -33 : System.IO.Pipelines.dasm (-0.05% of base)
         -30 : System.Net.WebSockets.WebSocketProtocol.dasm (-0.09% of base)
         -28 : System.Private.Xml.dasm (-0.00% of base)
         -24 : ILCompiler.Reflection.ReadyToRun.dasm (-0.02% of base)
         -21 : Microsoft.Extensions.Logging.Console.dasm (-0.05% of base)
33 total files with Code Size differences (33 improved, 0 regressed), 235 unchanged.
Top method regressions (bytes):
           9 ( 0.44% of base) : System.Security.Cryptography.Pkcs.dasm - ManagedPkcsPal:Decode(ReadOnlySpan`1,byref,byref,byref,byref,byref):DecryptorPal:this
Top method improvements (bytes):
        -234 (-1.62% of base) : System.Numerics.Tensors.dasm - CompressedSparseTensor`1:EnsureCapacity(int,int):this (7 methods)
         -99 (-1.75% of base) : System.Numerics.Tensors.dasm - CompressedSparseTensor`1:RemoveAt(int,int):this (7 methods)
         -98 (-2.97% of base) : System.Private.CoreLib.dasm - Bucket:Rent():ref:this (7 methods)
         -98 (-3.45% of base) : System.Private.CoreLib.dasm - Bucket:Return(ref):this (7 methods)
         -64 (-0.81% of base) : System.Numerics.Tensors.dasm - CompressedSparseTensor`1:Reshape(ReadOnlySpan`1):Tensor`1:this (7 methods)
         -63 (-1.35% of base) : System.Numerics.Tensors.dasm - CompressedSparseTensor`1:ToDenseTensor():DenseTensor`1:this (7 methods)
         -63 (-0.66% of base) : System.Security.Cryptography.Algorithms.dasm - KeyFormatHelper:ReadEncryptedPkcs8(ref,ReadOnlyMemory`1,ReadOnlySpan`1,ReadOnlySpan`1,KeyReader`1,byref,byref) (7 methods)
         -63 (-0.66% of base) : System.Security.Cryptography.Pkcs.dasm - KeyFormatHelper:ReadEncryptedPkcs8(ref,ReadOnlyMemory`1,ReadOnlySpan`1,ReadOnlySpan`1,KeyReader`1,byref,byref) (7 methods)
         -60 (-0.90% of base) : System.Numerics.Tensors.dasm - CompressedSparseTensor`1:Clone():Tensor`1:this (7 methods)
         -60 (-0.88% of base) : System.Numerics.Tensors.dasm - CompressedSparseTensor`1:ToCompressedSparseTensor():CompressedSparseTensor`1:this (7 methods)
         -60 (-1.22% of base) : System.Numerics.Tensors.dasm - CompressedSparseTensor`1:ToSparseTensor():SparseTensor`1:this (7 methods)
         -46 (-2.35% of base) : System.Private.CoreLib.dasm - Memory`1:CopyTo(Memory`1):this (7 methods)
         -45 (-0.70% of base) : System.Text.Json.dasm - JsonDocument:TryGetValue(int,byref):bool:this (15 methods)
         -43 (-2.24% of base) : System.Private.CoreLib.dasm - ReadOnlyMemory`1:CopyTo(Memory`1):this (7 methods)
         -42 (-1.58% of base) : System.Numerics.Tensors.dasm - CompressedSparseTensor`1:TryFindIndex(int,int,byref):bool:this (7 methods)
         -42 (-2.03% of base) : System.Private.CoreLib.dasm - Memory`1:TryCopyTo(Memory`1):bool:this (7 methods)
         -39 (-1.92% of base) : System.Private.CoreLib.dasm - ReadOnlyMemory`1:TryCopyTo(Memory`1):bool:this (7 methods)
         -23 (-1.24% of base) : System.Reflection.MetadataLoadContext.dasm - EcmaAssembly:ComputeAssemblyReferences():ref:this
         -21 (-0.19% of base) : Microsoft.Extensions.Logging.Console.dasm - JsonConsoleFormatter:Write(byref,IExternalScopeProvider,TextWriter):this (7 methods)
         -21 (-1.75% of base) : System.Numerics.Tensors.dasm - CompressedSparseTensor`1:InsertAt(int,ubyte,int,int):this
Top method regressions (percentages):
           9 ( 0.44% of base) : System.Security.Cryptography.Pkcs.dasm - ManagedPkcsPal:Decode(ReadOnlySpan`1,byref,byref,byref,byref,byref):DecryptorPal:this
Top method improvements (percentages):
          -8 (-53.33% of base) : System.Private.CoreLib.dasm - ManualResetEventSlim:get_IsSet():bool:this
          -4 (-28.57% of base) : System.Reflection.Metadata.dasm - MetadataTokens:GetToken(EntityHandle):int
          -4 (-23.53% of base) : System.Reflection.Metadata.dasm - MetadataTokens:GetHeapOffset(BlobHandle):int
          -3 (-23.08% of base) : System.Private.CoreLib.dasm - MethodTable:get_HasComponentSize():bool:this
          -3 (-23.08% of base) : System.Reflection.Metadata.dasm - EntityHandle:get_IsVirtual():bool:this
          -3 (-23.08% of base) : System.Reflection.Metadata.dasm - AssemblyReferenceHandle:get_IsVirtual():bool:this
          -3 (-23.08% of base) : System.Reflection.Metadata.dasm - StringHandle:get_IsVirtual():bool:this
          -3 (-23.08% of base) : System.Reflection.Metadata.dasm - NamespaceDefinitionHandle:get_IsVirtual():bool:this
          -3 (-23.08% of base) : System.Reflection.Metadata.dasm - NamespaceDefinitionHandle:get_HasFullName():bool:this
          -3 (-23.08% of base) : System.Reflection.Metadata.dasm - BlobHandle:get_IsVirtual():bool:this
          -3 (-21.43% of base) : System.Private.Xml.dasm - XsdDuration:get_IsNegative():bool:this
          -3 (-21.43% of base) : System.Reflection.Metadata.dasm - AssemblyReference:get_IsVirtual():bool:this
          -3 (-18.75% of base) : System.Private.CoreLib.dasm - RuntimeHelpers:ObjectHasComponentSize(Object):bool
          -4 (-18.18% of base) : System.Reflection.Metadata.dasm - MetadataTokens:GetRowNumber(EntityHandle):int
          -4 (-18.18% of base) : System.Reflection.Metadata.dasm - MetadataTokens:GetHeapOffset(StringHandle):int
          -4 (-14.29% of base) : System.Reflection.Metadata.dasm - DocumentNameBlobHandle:op_Explicit(BlobHandle):DocumentNameBlobHandle
          -6 (-13.64% of base) : System.Private.CoreLib.dasm - SpinLock:Exit(bool):this
          -4 (-10.81% of base) : System.Reflection.Metadata.dasm - StringHeap:GetString(StringHandle,MetadataStringDecoder):String:this
          -4 (-9.09% of base) : System.Reflection.Metadata.dasm - AssemblyReference:get_PublicKeyOrToken():BlobHandle:this
          -3 (-8.33% of base) : System.Private.CoreLib.dasm - RuntimeHelpers:GetRawObjectDataSize(Object):long
828 total methods with Code Size differences (827 improved, 1 regressed), 258028 unchanged.
Completed analysis in 26.51s
PS C:\prj>