EgorBot / runtime-utils

MIT License

EgorBot for EgorBo in #109570 #147

Open EgorBot opened 2 weeks ago

EgorBot commented 2 weeks ago

Processing https://github.com/dotnet/runtime/pull/109570#issuecomment-2458281137 command:

Command

-intel -amd --envvars DOTNET_JitDisasm:Zero256_Align64

```cs
using System.Runtime.InteropServices;
using BenchmarkDotNet.Attributes;
using System.Runtime.CompilerServices;
using BenchmarkDotNet.Running;

BenchmarkSwitcher.FromAssembly(typeof(Bench).Assembly).Run(args);

public unsafe class Bench
{
    static byte* _srcAlign64;
    static byte* _srcAlign8;

    [GlobalSetup]
    public void Setup()
    {
        _srcAlign64 = (byte*)NativeMemory.AlignedAlloc(1024 * 1024, 64);
        _srcAlign8 = _srcAlign64 + 8;
    }

    [GlobalCleanup]
    public void Cleanup() => NativeMemory.AlignedFree(_srcAlign64);

    [Benchmark] public void Zero256_Align64() => Unsafe.InitBlockUnaligned(_srcAlign64, 0, 256);
    [Benchmark] public void Zero128_Align64() => Unsafe.InitBlockUnaligned(_srcAlign64, 0, 128);
    [Benchmark] public void Zero64_Align64() => Unsafe.InitBlockUnaligned(_srcAlign64, 0, 64);
    [Benchmark] public void Zero20_Align64() => Unsafe.InitBlockUnaligned(_srcAlign64, 0, 20);

    [Benchmark] public void Zero256_Align8() => Unsafe.InitBlockUnaligned(_srcAlign8, 0, 256);
    [Benchmark] public void Zero128_Align8() => Unsafe.InitBlockUnaligned(_srcAlign8, 0, 128);
    [Benchmark] public void Zero64_Align8() => Unsafe.InitBlockUnaligned(_srcAlign8, 0, 64);
    [Benchmark] public void Zero20_Align8() => Unsafe.InitBlockUnaligned(_srcAlign8, 0, 20);
}
```

(EgorBot will reply in this issue)

EgorBot commented 2 weeks ago

Benchmark results on linux-genoa

BenchmarkDotNet v0.14.0, Ubuntu 24.04 LTS (Noble Numbat)
AMD EPYC 9R14, 1 CPU, 16 logical and 16 physical cores
  Job-JETMWX : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-ZSTWZB : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
EnvironmentVariables=DOTNET_JitDisasm=Zero256_Align64

| Method | Toolchain | Mean | Error | Ratio |
|---|---|---:|---:|---:|
| Zero256_Align64 | Main | 1.0433 ns | 0.0007 ns | 1.00 |
| Zero256_Align64 | PR | 8.4146 ns | 0.0026 ns | 8.07 |
| Zero128_Align64 | Main | 0.0363 ns | 0.0130 ns | 1.12 |
| Zero128_Align64 | PR | 7.8806 ns | 0.0014 ns | 242.57 |
| Zero64_Align64 | Main | 0.0002 ns | 0.0002 ns | ? |
| Zero64_Align64 | PR | 1.4106 ns | 0.0059 ns | ? |
| Zero20_Align64 | Main | 0.2728 ns | 0.0006 ns | 1.00 |
| Zero20_Align64 | PR | 1.3129 ns | 0.0020 ns | 4.81 |
| Zero256_Align8 | Main | 1.3508 ns | 0.0031 ns | 1.00 |
| Zero256_Align8 | PR | 10.8843 ns | 0.0133 ns | 8.06 |
| Zero128_Align8 | Main | 0.4998 ns | 0.0023 ns | 1.00 |
| Zero128_Align8 | PR | 10.3437 ns | 0.0051 ns | 20.69 |
| Zero64_Align8 | Main | 0.0002 ns | 0.0002 ns | ? |
| Zero64_Align8 | PR | 1.4034 ns | 0.0028 ns | ? |
| Zero20_Align8 | Main | 0.2734 ns | 0.0012 ns | 1.00 |
| Zero20_Align8 | PR | 1.3146 ns | 0.0007 ns | 4.81 |

BDN_Artifacts.zip

EgorBot commented 2 weeks ago

cc @EgorBo (logs)

EgorBot commented 2 weeks ago

Benchmark results on linux-sapphirelake

BenchmarkDotNet v0.14.0, Ubuntu 24.04 LTS (Noble Numbat)
Intel Xeon Platinum 8488C, 1 CPU, 16 logical and 8 physical cores
  Job-SUFLFE : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-OIKMMJ : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
EnvironmentVariables=DOTNET_JitDisasm=Zero256_Align64

| Method | Toolchain | Mean | Error | Ratio |
|---|---|---:|---:|---:|
| Zero256_Align64 | Main | 0.0000 ns | 0.0000 ns | ? |
| Zero256_Align64 | PR | 7.6155 ns | 0.0270 ns | ? |
| Zero128_Align64 | Main | 0.0055 ns | 0.0073 ns | ? |
| Zero128_Align64 | PR | 0.9472 ns | 0.0199 ns | ? |
| Zero64_Align64 | Main | 0.0031 ns | 0.0040 ns | ? |
| Zero64_Align64 | PR | 0.9536 ns | 0.0229 ns | ? |
| Zero20_Align64 | Main | 0.2898 ns | 0.0088 ns | 1.00 |
| Zero20_Align64 | PR | 0.8497 ns | 0.0190 ns | 2.93 |
| Zero256_Align8 | Main | 0.4823 ns | 0.0070 ns | 1.00 |
| Zero256_Align8 | PR | 7.7413 ns | 0.1844 ns | 16.05 |
| Zero128_Align8 | Main | 0.0045 ns | 0.0034 ns | ? |
| Zero128_Align8 | PR | 1.0201 ns | 0.0573 ns | ? |
| Zero64_Align8 | Main | 0.0046 ns | 0.0068 ns | ? |
| Zero64_Align8 | PR | 0.9594 ns | 0.0176 ns | ? |
| Zero20_Align8 | Main | 0.2864 ns | 0.0087 ns | 1.00 |
| Zero20_Align8 | PR | 0.8476 ns | 0.0202 ns | 2.96 |

BDN_Artifacts.zip

EgorBot commented 2 weeks ago

cc @EgorBo (logs)