aws / aws-sdk-net

The official AWS SDK for .NET. For more information on the AWS SDK for .NET, see our web site:
http://aws.amazon.com/sdkfornet/
Apache License 2.0
2.07k stars 860 forks source link

[v4] `CryptoUtil` refactoring opportunities. #3386

Open teo-tsirpanis opened 4 months ago

teo-tsirpanis commented 4 months ago

Describe the feature

I read the code in the CryptoUtil.cs file and noticed that there are some things that could be improved.

Use Case

Cleaning-up the codebase and improving maintainability.

Proposed Solution

Other Information

No response

Acknowledgements

AWS .NET SDK and/or Package version used

v4

Targeted .NET Platform

All

Operating System and version

All

normj commented 3 months ago

Definitely are areas if performance improvements now that we are moving up the minimum .NET version. Do you know if the System.IO.Hashing is any faster then the one the SDK uses?

MD5 is challenging for .NET Framework in a FIPS environment. Like you said we added it 9 years ago because you can't instantiate the MD5 objects in .NET Framework in a FIPS environment. We are not abandoning .NET Framework as part of V4 so don't see us switching away.

teo-tsirpanis commented 1 month ago

I ran some benchmarks for the CRC algorithms. For CRC32 the System.IO.Hashing implementation is significantly faster than aws-c-checksums on all tested sizes (16 bytes, 1KiB, 1Mib, 64Mib). System.IO.Hashing does not provide a CRC32C implementation but I wrote one with the help of the BitOperations.Crc32C method (introduced in .NET 8, will have to be polyfilled for earlier frameworks) and it was faster on small buffers and up to 6% slower on larger ones.

Results

``` BenchmarkDotNet v0.14.0, Windows 10 (10.0.19045.4842/22H2/2022Update) Intel Core i7-7700HQ CPU 2.80GHz (Kaby Lake), 1 CPU, 8 logical and 4 physical cores .NET SDK 8.0.400 [Host] : .NET 8.0.8 (8.0.824.36612), X64 RyuJIT AVX2 DefaultJob : .NET 8.0.8 (8.0.824.36612), X64 RyuJIT AVX2 ``` | Method | Size | Mean | Error | StdDev | Median | Ratio | RatioSD | |--------------------- |--------- |------------------:|----------------:|----------------:|------------------:|------:|--------:| | **Crc32SystemIoHashing** | **16** | **5.690 ns** | **0.0648 ns** | **0.0575 ns** | **5.705 ns** | **0.04** | **0.00** | | Crc32AwsCrt | 16 | 143.815 ns | 2.9040 ns | 5.5252 ns | 141.474 ns | 1.00 | 0.05 | | | | | | | | | | | **Crc32SystemIoHashing** | **1024** | **52.922 ns** | **0.3838 ns** | **0.3205 ns** | **52.952 ns** | **0.08** | **0.00** | | Crc32AwsCrt | 1024 | 633.793 ns | 8.2753 ns | 7.3358 ns | 634.278 ns | 1.00 | 0.02 | | | | | | | | | | | **Crc32SystemIoHashing** | **1048576** | **55,368.987 ns** | **501.3493 ns** | **444.4331 ns** | **55,121.002 ns** | **0.11** | **0.00** | | Crc32AwsCrt | 1048576 | 515,646.456 ns | 3,622.6510 ns | 3,211.3858 ns | 516,361.914 ns | 1.00 | 0.01 | | | | | | | | | | | **Crc32SystemIoHashing** | **67108864** | **6,711,132.812 ns** | **130,469.2022 ns** | **169,646.8303 ns** | **6,697,514.844 ns** | **0.19** | **0.01** | | Crc32AwsCrt | 67108864 | 34,787,717.778 ns | 364,982.2117 ns | 341,404.5874 ns | 34,662,206.667 ns | 1.00 | 0.01 | | | | | | | | | | | **Crc32cBitOperations** | **16** | **2.303 ns** | **0.0748 ns** | **0.1096 ns** | **2.265 ns** | **0.02** | **0.00** | | Crc32cAwsCrt | 16 | 130.816 ns | 1.3143 ns | 1.2294 ns | 131.086 ns | 1.00 | 0.01 | | | | | | | | | | | **Crc32cBitOperations** | **1024** | **152.348 ns** | **1.4545 ns** | **1.2893 ns** | **152.631 ns** | **0.53** | **0.01** | | Crc32cAwsCrt | 1024 | 286.162 ns | 3.1830 ns | 2.8217 ns | 287.207 ns | 1.00 | 0.01 | | | | | | | | | | | **Crc32cBitOperations** | **1048576** | **169,818.129 ns** | **1,980.2265 ns** | **1,755.4192 ns** | **170,636.597 ns** | **1.03** | **0.02** | | Crc32cAwsCrt | 1048576 | 165,378.067 ns | 3,055.0144 ns | 3,000.4326 ns | 165,917.371 ns | 1.00 | 0.02 | | | | | | | | | | | **Crc32cBitOperations** | **67108864** | **12,268,516.198 ns** | **168,921.5765 ns** | **158,009.3476 ns** | **12,311,294.531 ns** | **1.06** | **0.02** | | Crc32cAwsCrt | 67108864 | 11,553,061.667 ns | 228,143.0248 ns | 213,405.1270 ns | 11,627,168.750 ns | 1.00 | 0.03 |

Code

```csharp using System.IO.Hashing; using System.Numerics; using System.Runtime.CompilerServices; using System.Runtime.InteropServices; using Aws.Crt.Checksums; using BenchmarkDotNet.Attributes; using BenchmarkDotNet.Configs; using BenchmarkDotNet.Running; BenchmarkRunner.Run(); [GroupBenchmarksBy(BenchmarkLogicalGroupRule.ByCategory)] public class Crc32Benchmark { [Params(16, 1024, 1 << 20, 1 << 26)] public int Size { get; set; } private byte[] _data = null!; [GlobalSetup] public void Setup() { _data = new byte[Size]; Random.Shared.NextBytes(_data); } [Benchmark] [BenchmarkCategory("Crc32")] public uint Crc32SystemIoHashing() { return Crc32.HashToUInt32(_data); } [Benchmark(Baseline = true)] [BenchmarkCategory("Crc32")] public uint Crc32AwsCrt() { return Crc.crc32(_data); } [Benchmark] [BenchmarkCategory("Crc32c")] public uint Crc32cBitOperations() { return Crc32cBitOperations(_data); } [Benchmark(Baseline = true)] [BenchmarkCategory("Crc32c")] public uint Crc32cAwsCrt() { return Crc.crc32c(_data); } private static uint Crc32cBitOperations(ReadOnlySpan buffer) { ref byte start = ref MemoryMarshal.GetReference(buffer); ref byte end = ref Unsafe.Add(ref start, buffer.Length); uint crc = 0; while (Unsafe.ByteOffset(ref start, ref end) >= sizeof(ulong)) { crc = BitOperations.Crc32C(crc, Unsafe.ReadUnaligned(ref start)); start = ref Unsafe.Add(ref start, sizeof(ulong)); } while (Unsafe.ByteOffset(ref start, ref end) >= sizeof(uint)) { crc = BitOperations.Crc32C(crc, Unsafe.ReadUnaligned(ref start)); start = ref Unsafe.Add(ref start, sizeof(uint)); } while (Unsafe.IsAddressLessThan(ref start, ref end)) { crc = BitOperations.Crc32C(crc, start); start = ref Unsafe.Add(ref start, 1); } return ~crc; } } ```
teo-tsirpanis commented 1 month ago

The system's MD5 implementation becomes increasingly faster on larger buffer sizes. Considering the FIPS implications, we could try using it first, and add fall back to the managed implementation if it throws.


BenchmarkDotNet v0.14.0, Windows 10 (10.0.19045.4842/22H2/2022Update)
Intel Core i7-7700HQ CPU 2.80GHz (Kaby Lake), 1 CPU, 8 logical and 4 physical cores
.NET SDK 8.0.400
  [Host]     : .NET 8.0.8 (8.0.824.36612), X64 RyuJIT AVX2
  DefaultJob : .NET 8.0.8 (8.0.824.36612), X64 RyuJIT AVX2
Method Size Mean Error StdDev Median Ratio RatioSD
Md5System 16 425.6 ns 8.59 ns 17.74 ns 418.2 ns 0.98 0.04
Md5Managed 16 433.6 ns 5.85 ns 5.48 ns 432.6 ns 1.00 0.02
Md5System 1024 2,923.9 ns 58.19 ns 67.02 ns 2,935.5 ns 0.62 0.02
Md5Managed 1024 4,697.4 ns 91.26 ns 127.93 ns 4,677.5 ns 1.00 0.04
Md5System 1048576 2,546,400.4 ns 49,584.99 ns 50,920.13 ns 2,531,915.2 ns 0.61 0.02
Md5Managed 1048576 4,152,450.3 ns 78,730.29 ns 73,644.36 ns 4,173,361.3 ns 1.00 0.02
Md5System 67108864 163,596,564.7 ns 3,253,729.37 ns 3,341,340.51 ns 162,646,375.0 ns 0.58 0.01
Md5Managed 67108864 284,266,263.3 ns 5,055,510.86 ns 4,728,928.00 ns 284,251,250.0 ns 1.00 0.02