dotnet / runtime


RyuJIT/x86: REGMASK_BITS should be 16 for x86, not 32 #7575

Open BruceForstall opened 7 years ago

BruceForstall commented 7 years ago

regMaskSmall can be 16 bits, not 32, for RyuJIT/x86.

category:throughput theme:ir skill-level:intermediate cost:small

marcusturewicz commented 4 years ago

In which file should this be handled? https://github.com/dotnet/runtime/search?q=regMaskSmall&unscoped_q=regMaskSmall

BruceForstall commented 4 years ago

https://github.com/dotnet/runtime/blob/52832184d8ef7fe4398350783afa6520cb13310e/src/coreclr/src/jit/target.h#L33

Of course, making this change would require verifying that it indeed leads to memory usage and throughput improvements.

marcusturewicz commented 4 years ago

Ok cool. Is there a doc/guide I can follow for verifying improvements? Should I use BenchmarkDotNet, or does this require something lower level?

BruceForstall commented 4 years ago

We don't have something well defined, so the testing would be ad-hoc, e.g.:

  1. For memory, build a Release version of the JIT with MEASURE_MEM_ALLOC defined, and use COMPlus_JitMemStats=1 to view memory allocation. Compare memory allocation before and after the change.
  2. For compile-time, time crossgen before and after the change.

For both, a crossgen of the libraries in a generated Core_Root directory (the "framework libraries") would be an appropriate set of tests.

marcusturewicz commented 4 years ago

Ok I think I have done part 1a; I enabled MEASURE_MEM_ALLOC and turned on COMPlus_JitMemStats and built JIT with:

./build.cmd -subset clr -configuration release -os Windows_NT -arch x86

How can I see the memory allocation?

BruceForstall commented 4 years ago

It should be automatically output to stdout when you run a program (I haven't done this in a while...)

marcusturewicz commented 4 years ago

Ok, I've figured the memory part out. I used the compiled output of the above command and copied it into the published directory of a hello world app, as described in using your build.

 All allocations:
 For        23 methods:
   count:              20915 (avg     909 per method)
-  alloc size :      1172030 (avg   50957 per method)
+  alloc size :      1171206 (avg   50922 per method)
   max alloc  :         6144

   allocateMemory   :      2162688 (avg   94029 per method)
-  nraUsed    :      1212100 (avg   52700 per method)
+  nraUsed    :      1211276 (avg   52664 per method)

 Alloc'd bytes by kind:
                   kind |       size |     pct
   ---------------------+------------+--------
-         AssertionProp |      96504 |   8.23%
-               ASTNode |     285188 |  24.33%
-              InstDesc |      38204 |   3.26%
+         AssertionProp |      96504 |   8.24%
+               ASTNode |     285188 |  24.35%
+              InstDesc |      37536 |   3.20%
               ImpStack |       4464 |   0.38%
-            BasicBlock |      78308 |   6.68%
+            BasicBlock |      78308 |   6.69%
              fgArgInfo |       2772 |   0.24%
        fgArgInfoPtrArr |        984 |   0.08%
               FlowList |       7896 |   0.67%
@@ -25,9 +25,9 @@ Alloc'd bytes by kind:
          LSRA_Interval |      27060 |   2.31%
       LSRA_RefPosition |      81648 |   6.97%
           Reachability |        184 |   0.02%
-                   SSA |      26888 |   2.29%
-           ValueNumber |     190074 |  16.22%
-              LvaTable |      57596 |   4.91%
+                   SSA |      26888 |   2.30%
+           ValueNumber |     190074 |  16.23%
+              LvaTable |      57596 |   4.92%
             UnwindInfo |          0 |   0.00%
                 hashBv |       3652 |   0.31%
                 bitset |      27428 |   2.34%
@@ -39,14 +39,14 @@ Alloc'd bytes by kind:
           ArrayInfoMap |        484 |   0.04%
           MemoryPhiArg |        168 |   0.01%
                    CSE |      23404 |   2.00%
-                    GC |       1216 |   0.10%
+                    GC |       1060 |   0.09%
                 CorSig |       8684 |   0.74%
-              Inlining |      61100 |   5.21%
+              Inlining |      61100 |   5.22%
             ArrayStack |       1024 |   0.09%
              DebugInfo |       7312 |   0.62%
              DebugOnly |          0 |   0.00%
                Codegen |      12696 |   1.08%
-               LoopOpt |       8960 |   0.76%
+               LoopOpt |       8960 |   0.77%
              LoopHoist |       1208 |   0.10%
                Unknown |       8641 |   0.74%
             RangeCheck |        408 |   0.03%
@@ -60,33 +60,33 @@ Alloc'd bytes by kind:

 Largest method:
-count:       3732, size:     200677, max =       6144
-allocateMemory:     262144, nraUsed:     203340
+count:       3732, size:     200517, max =       6144
+allocateMemory:     262144, nraUsed:     203180

 Alloc'd bytes by kind:
                   kind |       size |     pct
   ---------------------+------------+--------
-         AssertionProp |       9636 |   4.80%
-               ASTNode |      54756 |  27.29%
-              InstDesc |       3824 |   1.91%
+         AssertionProp |       9636 |   4.81%
+               ASTNode |      54756 |  27.31%
+              InstDesc |       3736 |   1.86%
               ImpStack |        192 |   0.10%
-            BasicBlock |      23084 |  11.50%
+            BasicBlock |      23084 |  11.51%
              fgArgInfo |          0 |   0.00%
        fgArgInfoPtrArr |          0 |   0.00%
               FlowList |       2820 |   1.41%
      TreeStatementList |        288 |   0.14%
                SiScope |       1364 |   0.68%
        DominatorMemory |       1728 |   0.86%
-                  LSRA |       4784 |   2.38%
+                  LSRA |       4784 |   2.39%
          LSRA_Interval |       4136 |   2.06%
       LSRA_RefPosition |      11592 |   5.78%
           Reachability |         16 |   0.01%
                    SSA |       5268 |   2.63%
-           ValueNumber |      23304 |  11.61%
+           ValueNumber |      23304 |  11.62%
               LvaTable |       8852 |   4.41%
             UnwindInfo |          0 |   0.00%
                 hashBv |        764 |   0.38%
-                bitset |       5164 |   2.57%
+                bitset |       5164 |   2.58%
           FixedBitVect |        468 |   0.23%
                Generic |       7076 |   3.53%
    LocalAddressVisitor |          0 |   0.00%
@@ -95,7 +95,7 @@ Alloc'd bytes by kind:
           ArrayInfoMap |          0 |   0.00%
           MemoryPhiArg |          0 |   0.00%
                    CSE |       4428 |   2.21%
-                    GC |        400 |   0.20%
+                    GC |        328 |   0.16%
                 CorSig |       1924 |   0.96%
               Inlining |      16208 |   8.08%
             ArrayStack |        320 |   0.16%

marcusturewicz commented 4 years ago

For part 2 (crossgen tests), do you mean that I should time how long the following command takes?

.\src\coreclr\build-test.cmd crossgen

https://github.com/dotnet/runtime/blob/master/docs/workflow/testing/coreclr/windows-test-instructions.md

marcusturewicz commented 4 years ago

Ok, I think I've mostly figured out the crossgen part. I create the artifacts\tests\coreclr\Windows_NT.x86.Release\Tests\Core_Root directory with:

.\build-test.cmd release x86 generatelayoutonly

Then I copy System.Private.CoreLib.dll from artifacts\tests\coreclr\Windows_NT.x86.Release\Tests\Core_Root\IL to artifacts\tests\coreclr\Windows_NT.x86.Release\Tests\Core_Root.

Then I run the following program to crossgen each System.*.dll library and time it:

using System;
using System.Diagnostics;
using System.IO;
using System.Linq;

namespace CrossGenTiming
{
    class Program
    {
        static void Main(string[] args)
        {
            // List the framework DLLs, skipping native images (*.ni.dll) from earlier runs
            var coreRootPath = args[0];
            var libs = Directory.GetFiles(coreRootPath, "System.*.dll", SearchOption.TopDirectoryOnly).ToList();
            libs.RemoveAll(x => x.Contains(".ni."));
            Console.WriteLine($"Found {libs.Count} libraries");

            // Setup stopwatch
            var sw = new Stopwatch();
            long time = 0;

            // CrossGen each library
            foreach (string lib in libs)
            {
                using var process = new Process();
                process.StartInfo.FileName = Path.Combine(coreRootPath, "crossgen.exe");
                // Quote the path so a directory containing spaces is passed as one argument
                process.StartInfo.Arguments = $"\"{lib}\"";
                process.StartInfo.WindowStyle = ProcessWindowStyle.Hidden;
                sw.Start();
                process.Start();
                process.WaitForExit();
                sw.Stop();
                time += sw.ElapsedMilliseconds;
                sw.Reset();
            }

            Console.WriteLine($"-------------------------------- CrossGen completed ---------------------------------");
            Console.WriteLine($"CrossGen took {time} ms");
        }
    }
}

This produces the output below. CrossGen works for some libraries but not for others. There seem to be a lot of missing libraries...

Found 195 libraries
C:\Source\github\dotnet\runtime\artifacts\tests\coreclr\Windows_NT.x86.Release\Tests\Core_Root\System.AppContext.dll
Microsoft (R) CoreCLR Native Image Generator - Version 4.5.30319.0
Copyright (c) Microsoft Corporation.  All rights reserved.

Native image C:\Source\github\dotnet\runtime\artifacts\tests\coreclr\Windows_NT.x86.Release\Tests\Core_Root\System.AppContext.ni.dll generated successfully.
C:\Source\github\dotnet\runtime\artifacts\tests\coreclr\Windows_NT.x86.Release\Tests\Core_Root\System.Buffers.dll
Microsoft (R) CoreCLR Native Image Generator - Version 4.5.30319.0
Copyright (c) Microsoft Corporation.  All rights reserved.

Native image C:\Source\github\dotnet\runtime\artifacts\tests\coreclr\Windows_NT.x86.Release\Tests\Core_Root\System.Buffers.ni.dll generated successfully.
C:\Source\github\dotnet\runtime\artifacts\tests\coreclr\Windows_NT.x86.Release\Tests\Core_Root\System.CodeDom.dll
Microsoft (R) CoreCLR Native Image Generator - Version 4.5.30319.0
Copyright (c) Microsoft Corporation.  All rights reserved.

Error: Could not load file or assembly 'netstandard, Version=2.0.0.0, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51'. The system cannot find the file specified. (0x80070002)
Error compiling C:\Source\github\dotnet\runtime\artifacts\tests\coreclr\Windows_NT.x86.Release\Tests\Core_Root\System.CodeDom.dll: Could not find or load a specific file. (0x80131621)
Error: compilation failed for "C:\Source\github\dotnet\runtime\artifacts\tests\coreclr\Windows_NT.x86.Release\Tests\Core_Root\System.CodeDom.dll" (0x80131621)
C:\Source\github\dotnet\runtime\artifacts\tests\coreclr\Windows_NT.x86.Release\Tests\Core_Root\System.Collections.Concurrent.dll
Microsoft (R) CoreCLR Native Image Generator - Version 4.5.30319.0
Copyright (c) Microsoft Corporation.  All rights reserved.

Native image C:\Source\github\dotnet\runtime\artifacts\tests\coreclr\Windows_NT.x86.Release\Tests\Core_Root\System.Collections.Concurrent.ni.dll generated successfully.
C:\Source\github\dotnet\runtime\artifacts\tests\coreclr\Windows_NT.x86.Release\Tests\Core_Root\System.Collections.dll
Microsoft (R) CoreCLR Native Image Generator - Version 4.5.30319.0
Copyright (c) Microsoft Corporation.  All rights reserved.

Native image C:\Source\github\dotnet\runtime\artifacts\tests\coreclr\Windows_NT.x86.Release\Tests\Core_Root\System.Collections.ni.dll generated successfully.
C:\Source\github\dotnet\runtime\artifacts\tests\coreclr\Windows_NT.x86.Release\Tests\Core_Root\System.Collections.Immutable.dll
Microsoft (R) CoreCLR Native Image Generator - Version 4.5.30319.0
Copyright (c) Microsoft Corporation.  All rights reserved.

Error: Could not load file or assembly 'System.Runtime, Version=5.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a'. The system cannot find the file specified. (0x80070002)
Error compiling C:\Source\github\dotnet\runtime\artifacts\tests\coreclr\Windows_NT.x86.Release\Tests\Core_Root\System.Collections.Immutable.dll: Could not find or load a specific file. (0x80131621)
Error: compilation failed for "C:\Source\github\dotnet\runtime\artifacts\tests\coreclr\Windows_NT.x86.Release\Tests\Core_Root\System.Collections.Immutable.dll" (0x80131621)
C:\Source\github\dotnet\runtime\artifacts\tests\coreclr\Windows_NT.x86.Release\Tests\Core_Root\System.Collections.NonGeneric.dll
Microsoft (R) CoreCLR Native Image Generator - Version 4.5.30319.0
Copyright (c) Microsoft Corporation.  All rights reserved.

<Many lines removed for brevity>
-------------------------------- CrossGen completed ---------------------------------
CrossGen took 26173 ms

Where can I get the missing libraries from?

TIHan commented 10 months ago

Keeping this as Future, since x86 isn't a priority, but we'd be happy if anyone wants to tackle it.