EgorBot / runtime-utils

MIT License

EgorBot for EgorBo in #107304 #77

Open EgorBot opened 2 months ago

EgorBot commented 2 months ago

Processing https://github.com/dotnet/runtime/issues/107304#issuecomment-2331870986 command:

Command

-amd -awsamd -commit 72f9ee0d26c23b3c58ec5af8aeea316095761646 vs previous -profiler --envvars DOTNET_JitDisasm:TrailingZeroCount_ulong

```cs
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System.Numerics;
using System.Runtime.CompilerServices;

public class Perf_BitOperations
{
    static T[] Array<T>(int count, int? seed = null)
    {
        var result = new T[count];
        var random = new Random(42);

        if (typeof(T) == typeof(byte) || typeof(T) == typeof(sbyte))
            random.NextBytes(Unsafe.As<byte[]>(result));
        else
            for (int i = 0; i < result.Length; i++)
                result[i] = GenerateValue<T>(random);

        return result;
    }

    static T GenerateValue<T>(Random random)
    {
        if (typeof(T) == typeof(uint))
            return (T)(object)(uint)random.Next();
        if (typeof(T) == typeof(ulong))
            return (T)(object)(ulong)random.Next();
        throw new NotImplementedException();
    }

    static uint[] input_uint = Array<uint>(1000);
    static ulong[] input_ulong = Array<ulong>(1000);

    [Benchmark]
    public int TrailingZeroCount_ulong()
    {
        int sum = 0;
        ulong[] input = input_ulong;
        for (int i = 0; i < input.Length; i++)
        {
            sum += BitOperations.TrailingZeroCount(input[i]);
        }
        return sum;
    }
}
```

(EgorBot will reply in this issue)

EgorBot commented 2 months ago

Benchmark results on c7a_4xlarge

BenchmarkDotNet v0.14.0, Ubuntu 24.04 LTS (Noble Numbat)
AMD EPYC 9R14, 1 CPU, 16 logical and 16 physical cores
  Job-CJHJBZ : .NET 9.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-HJWPTH : .NET 9.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
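EnvironmentVariables=DOTNET_JitDisasm=TrailingZeroCount_ulong

| Method                  | Toolchain | Mean     | Error   | Ratio |
|-------------------------|-----------|---------:|--------:|------:|
| TrailingZeroCount_ulong | Before    | 279.7 ns | 0.11 ns |  1.00 |
| TrailingZeroCount_ulong | After     | 279.8 ns | 0.08 ns |  1.00 |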
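
As a rough reading of the results above, and assuming the Ratio column is approximately the After mean divided by the Before (baseline) mean, the arithmetic works out as:

$$
\text{Ratio} \approx \frac{\text{Mean}_{\text{After}}}{\text{Mean}_{\text{Before}}} = \frac{279.8\ \text{ns}}{279.7\ \text{ns}} \approx 1.0004 \approx 1.00
$$

The 0.1 ns gap between the two means is about the same size as the reported errors (0.11 ns and 0.08 ns), so this run does not show a measurable difference between the two toolchains.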

BDN_Artifacts.zip

Flame graphs: Main vs PR 🔥

Hot asm: Main vs PR

Hot functions: Main vs PR

Counters: Main vs PR

For clean perf results, make sure you have just one [Benchmark] in your app.

EgorBot commented 2 months ago

cc @EgorBo (logs)

EgorBot commented 2 months ago

Benchmark results on Amd

BenchmarkDotNet v0.14.0, Ubuntu 22.04.4 LTS (Jammy Jellyfish)
AMD EPYC 7763, 1 CPU, 16 logical and 8 physical cores
  Job-WVOZHL : .NET 9.0.0 (42.42.42.42424), X64 RyuJIT AVX2
  Job-PYVXXX : .NET 9.0.0 (42.42.42.42424), X64 RyuJIT AVX2
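EnvironmentVariables=DOTNET_JitDisasm=TrailingZeroCount_ulong

| Method                  | Toolchain | Mean     | Error   | Ratio |
|-------------------------|-----------|---------:|--------:|------:|
| TrailingZeroCount_ulong | Before    | 624.8 ns | 0.66 ns |  1.00 |
| TrailingZeroCount_ulong | After     | 624.6 ns | 0.11 ns |  1.00 |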

BDN_Artifacts.zip

Flame graphs: Main vs PR 🔥

Hot asm: Main vs PR

Hot functions: Main vs PR

Counters: Main vs PR

For clean perf results, make sure you have just one [Benchmark] in your app.

EgorBot commented 2 months ago

cc @EgorBo (logs)