aerospike / aerospike-client-csharp

Aerospike C# Client Library

perf: change Ripemd160 from class to struct to avoid allocations. #121

Closed by RokasBalevicius 6 months ago

RokasBalevicius commented 6 months ago

Main change: the Ripemd160 hash calculator can be a struct, because it is used in only one place and each instance is short-lived. There is no need to allocate it on the heap.
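As a rough illustration of the class-versus-struct idea, here is a minimal sketch. `ClassHasher`, `StructHasher`, and `AllocationDemo` are hypothetical stand-ins, not the library's Ripemd160 types, and they use a tiny FNV-1a accumulator rather than RIPEMD-160 so the example stays short. It compares the managed bytes allocated when the short-lived hasher is a class and when it is a struct.

```csharp
using System;

// Hypothetical stand-in for a class-based hash calculator (not the library's type).
public sealed class ClassHasher
{
    private uint _state = 2166136261u; // FNV-1a offset basis

    public void Add(byte b)
    {
        unchecked { _state = (_state ^ b) * 16777619u; } // FNV-1a prime
    }

    public uint Finish() => _state;
}

// The same calculator as a struct: a local instance needs no heap object.
public struct StructHasher
{
    private uint _state;

    public StructHasher(uint seed) { _state = seed; }

    public void Add(byte b)
    {
        unchecked { _state = (_state ^ b) * 16777619u; }
    }

    public readonly uint Finish() => _state;
}

public static class AllocationDemo
{
    public static uint HashWithClass(byte[] data)
    {
        var h = new ClassHasher(); // one object per call
        foreach (var b in data) h.Add(b);
        return h.Finish();
    }

    public static uint HashWithStruct(byte[] data)
    {
        var h = new StructHasher(2166136261u); // plain local value
        foreach (var b in data) h.Add(b);
        return h.Finish();
    }

    // Managed bytes allocated on the current thread while calling `hash` repeatedly.
    public static long AllocatedBytes(Func<byte[], uint> hash, byte[] data, int iterations)
    {
        hash(data); // warm-up call so one-time costs are not counted
        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++) hash(data);
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    public static void Main()
    {
        var data = new byte[36]; // same length as a GUID string, chosen arbitrarily
        Console.WriteLine($"class:  {AllocatedBytes(HashWithClass, data, 10_000)} bytes");
        Console.WriteLine($"struct: {AllocatedBytes(HashWithStruct, data, 10_000)} bytes");
    }
}
```

The exact numbers depend on the runtime version and on JIT optimizations, so treat the output as indicative only. Note that a struct only avoids the allocation of the calculator object itself; any arrays it holds as fields are still heap-allocated.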

Benchmark results:

```
BenchmarkDotNet v0.13.12, macOS Sonoma 14.4.1 (23E224) [Darwin 23.4.0]
Intel Core i7-1068NG7 CPU 2.30GHz, 1 CPU, 8 logical and 4 physical cores
.NET SDK 8.0.100
  [Host]     : .NET 8.0.0 (8.0.23.53103), X64 RyuJIT AVX2
  DefaultJob : .NET 8.0.0 (8.0.23.53103), X64 RyuJIT AVX2
```

| Method                     | Mean     | Error     | StdDev    | Gen0     | Gen1     | Allocated |
|--------------------------- |---------:|----------:|----------:|---------:|---------:|----------:|
| DefaultKeyCreation         | 7.182 ms | 0.1436 ms | 0.4025 ms | 984.3750 | 468.7500 |   3.97 MB |
| KeyCreationWithValueRipemd | 6.294 ms | 0.1123 ms | 0.1499 ms | 757.8125 | 375.0000 |   3.28 MB |

Benchmark code:

```csharp
using System;
using System.Collections.Generic;
using Aerospike.Client;
using BenchmarkDotNet.Attributes;

[MemoryDiagnoser]
public class AerospikeKeysBench
{
    private readonly string[] _ids;

    public AerospikeKeysBench()
    {
        // Pre-generate the user keys so the benchmarks measure only key construction.
        var count = 10000;

        _ids = new string[count];

        for (int i = 0; i < count; i++)
        {
            _ids[i] = Guid.NewGuid().ToString();
        }
    }

    // Baseline: the existing Key class, which uses the class-based Ripemd160.
    [Benchmark]
    public List<Key> DefaultKeyCreation()
    {
        var result = new List<Key>(_ids.Length);
        for (int i = 0; i < _ids.Length; i++)
        {
            var key = new Key("ns", "setName", _ids[i]);
            result.Add(key);
        }

        return result;
    }

    // Candidate: Key2, a copy of Key that computes the digest with the struct ValueRipemd160.
    [Benchmark]
    public List<Key2> KeyCreationWithValueRipemd()
    {
        var result = new List<Key2>(_ids.Length);
        for (int i = 0; i < _ids.Length; i++)
        {
            var key = new Key2("ns", "setName", _ids[i]);
            result.Add(key);
        }

        return result;
    }
}
```

I used 10k keys because we have use cases where we need to prepare thousands of keys in a single request, so I simply reused benchmarks I already had. A single-key benchmark should show a similar relative improvement in allocations (a byte saved is a byte earned).
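To put the table into per-key terms, the sketch below does the arithmetic on the reported figures. It assumes that BenchmarkDotNet's "MB" means 1024 × 1024 bytes, and `BenchmarkDelta` is a hypothetical helper name. Under that assumption, the 0.69 MB difference works out to roughly 72 bytes saved per key, about a 17% reduction in allocated memory and about a 12% reduction in mean time.

```csharp
using System;

// Hypothetical helper that derives per-key figures from the results table above.
public static class BenchmarkDelta
{
    // Values copied from the benchmark table.
    private const double DefaultMb = 3.97;
    private const double StructMb = 3.28;
    private const double DefaultMs = 7.182;
    private const double StructMs = 6.294;
    private const int Keys = 10_000;

    // Assumption: 1 MB = 1024 * 1024 bytes.
    public static double SavedBytesPerKey() =>
        (DefaultMb - StructMb) * 1024 * 1024 / Keys;

    public static double AllocationReductionPercent() =>
        (DefaultMb - StructMb) / DefaultMb * 100;

    public static double TimeReductionPercent() =>
        (DefaultMs - StructMs) / DefaultMs * 100;

    public static void Main()
    {
        Console.WriteLine($"Saved per key:        {SavedBytesPerKey():F1} bytes");
        Console.WriteLine($"Allocation reduction: {AllocationReductionPercent():F1} %");
        Console.WriteLine($"Mean time reduction:  {TimeReductionPercent():F1} %");
    }
}
```

Because the table values are rounded, the per-key figure is only an estimate.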

Key2 is a copy of the existing Key class that uses the struct ValueRipemd160, instead of the class-based implementation, to calculate the digest.

Secondary changes: I work on a Mac with Rider, and I noticed that this setup is not readily supported by the project. I made some changes to make life easier for non-Windows developers: I added Rider-related entries to .gitignore and added build configurations that do not build the Windows-specific projects.

I put all the changes into a single PR because it seemed quicker.

RokasBalevicius commented 6 months ago

@shannonklaus is there any timeline for comments and/or merge?

shannonklaus commented 6 months ago

@RokasBalevicius Yes, I am looking over this PR this week.

shannonklaus commented 6 months ago

@RokasBalevicius Thank you for your pull request. It will be included in my next release, which is planned for mid-June.