confluentinc / confluent-kafka-dotnet

Confluent's Apache Kafka .NET client
https://github.com/confluentinc/confluent-kafka-dotnet/wiki
Apache License 2.0
2.78k stars 847 forks

Use built-in "CreateReadOnlySpanFromNullTerminated" on NET6-specific path #2207

Open stevenaw opened 2 months ago

stevenaw commented 2 months ago

Adds a .NET 6-specific implementation of PtrToStringUTF8 that uses the built-in MemoryMarshal.CreateReadOnlySpanFromNullTerminated() method, addressing the following existing inline comment in the source:

// TODO: Is there a built in / vectorized / better way to implement this?

This API was added in .NET 6 for exactly this case, where an interop call returns only a pointer and the length of the underlying null-terminated string is unknown: https://github.com/dotnet/runtime/issues/40202. Internally it uses strlen(), which delegates to SpanHelpers.IndexOfNullByte, a routine that is heavily optimized with vectorization and other special cases.
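
To make the equivalence concrete, here is a minimal sketch that decodes the same null-terminated UTF-8 buffer with a manual scan and with the built-in method, then checks that both agree. The names NullTerminatedUtf8Demo, ViaManualScan, ViaBuiltIn, and BothPathsAgree are hypothetical and are not the library's actual code; the sketch assumes .NET 6 or later and a project with AllowUnsafeBlocks enabled.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical demo type, not part of confluent-kafka-dotnet.
public static class NullTerminatedUtf8Demo
{
    // Manual scan for the terminating 0 byte, mirroring the pre-.NET 6 fallback.
    public static unsafe string ViaManualScan(IntPtr ptr)
    {
        byte* start = (byte*)ptr;
        byte* p = start;
        while (*p != 0)
        {
            p++;
        }
        return Encoding.UTF8.GetString(start, (int)(p - start));
    }

    // Built-in path (.NET 6+): the runtime locates the terminator and returns
    // a span over the bytes before it, excluding the terminator itself.
    public static unsafe string ViaBuiltIn(IntPtr ptr)
    {
        ReadOnlySpan<byte> bytes =
            MemoryMarshal.CreateReadOnlySpanFromNullTerminated((byte*)ptr);
        return Encoding.UTF8.GetString(bytes);
    }

    // Copies the text into unmanaged memory as null-terminated UTF-8, decodes it
    // both ways, and reports whether both results match the original text.
    public static bool BothPathsAgree(string text)
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(text);
        IntPtr ptr = Marshal.AllocHGlobal(utf8.Length + 1);
        try
        {
            Marshal.Copy(utf8, 0, ptr, utf8.Length);
            Marshal.WriteByte(ptr, utf8.Length, 0); // explicit terminator
            return ViaManualScan(ptr) == text && ViaBuiltIn(ptr) == text;
        }
        finally
        {
            Marshal.FreeHGlobal(ptr);
        }
    }
}
```

Calling NullTerminatedUtf8Demo.BothPathsAgree with ASCII, multi-byte, and empty inputs is a quick way to confirm that the two paths behave the same for well-formed text without embedded null characters.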

Performance is comparable. I have run the unit tests locally to validate the change but am unable to run the integration tests.

Benchmarks (using .NET8 runtime)

| Method   | Mean     | Error    | StdDev   | Ratio | Gen0   | Gen1   | Allocated | Alloc Ratio |
|--------- |---------:|---------:|---------:|------:|-------:|-------:|----------:|------------:|
| Original | 12.05 ns | 0.110 ns | 0.103 ns |  1.00 | 0.0064 | 0.0000 |      40 B |        1.00 |
| Updated  | 11.70 ns | 0.042 ns | 0.040 ns |  0.97 | 0.0064 |      - |      40 B |        1.00 |

```csharp
[MemoryDiagnoser]
public class UnsafeIntPtrToStr
{
    GCHandle handle;
    IntPtr strPtr;

    [GlobalSetup]
    public void Setup()
    {
        var strBytes = Encoding.UTF8.GetBytes("TestData");
        byte[] strBytesNulTerminated = new byte[strBytes.Length + 1]; // initialized to all 0's.
        Array.Copy(strBytes, strBytesNulTerminated, strBytes.Length);

        handle = GCHandle.Alloc(strBytesNulTerminated, GCHandleType.Pinned);
        strPtr = handle.AddrOfPinnedObject();
    }

    [Benchmark(Baseline = true)]
    public unsafe string Original()
    {
#if NET6_0_OR_GREATER
        var bytes = MemoryMarshal.CreateReadOnlySpanFromNullTerminated((byte*)strPtr.ToPointer());
        return Encoding.UTF8.GetString(bytes);
#else
        // TODO: Is there a built in / vectorized / better way to implement this?
        byte* pTraverse = (byte*)strPtr;
        while (*pTraverse != 0) { pTraverse += 1; }
        var length = (int)(pTraverse - (byte*)strPtr);
        return Encoding.UTF8.GetString((byte*)strPtr.ToPointer(), length);
#endif
    }

    [Benchmark]
    public unsafe string Updated()
    {
#if NET6_0_OR_GREATER
        var bytes = MemoryMarshal.CreateReadOnlySpanFromNullTerminated((byte*)strPtr.ToPointer());
        return Encoding.UTF8.GetString(bytes);
#else
        // TODO: Is there a built in / vectorized / better way to implement this?
        byte* pTraverse = (byte*)strPtr;
        while (*pTraverse != 0) { pTraverse += 1; }
        var length = (int)(pTraverse - (byte*)strPtr);
        return Encoding.UTF8.GetString((byte*)strPtr.ToPointer(), length);
#endif
    }

    [GlobalCleanup]
    public void Cleanup()
    {
        handle.Free();
        strPtr = IntPtr.Zero;
    }
}
```

cla-assistant[bot] commented 2 months ago

CLA assistant check
All committers have signed the CLA.
