haproxytech / haproxy-spoa-dotnet

HAProxy Stream Processing Offload Agent (SPOA) library for .NET Core.
GNU General Public License v2.0

Explore implementation of System.IO.Pipelines #12

Open NickMRamirez opened 2 years ago

NickMRamirez commented 2 years ago

Starting a thread so that we can investigate how to improve the performance of this library by using the System.IO.Pipelines library, if appropriate.

NickMRamirez commented 2 years ago

Moving this comment here from another thread: https://github.com/haproxytech/haproxy-spoa-dotnet/issues/7#issuecomment-881920684

I'm interested in the PR. A year ago, I also tried using IDuplexPipe as a way to improve performance.

At the time (and I don't think it has changed much), spoa-dotnet was synchronous in the way it handled frames and messages. It reads one frame, decodes it, and processes it. If one or more messages were enqueued, it then writes frames in a loop (possibly more than one, depending on the number and size of the messages produced during processing). While it does that, reading the next frame is blocked and no parallel processing can happen. For me this was a major perf issue, because my processing step called an external API, so the I/O path was far from optimal.

But from my perf tests, using a Stream on top of IDuplexPipe did not bring much of a gain, and it was not the main bottleneck. The fact is that frames are encoded and decoded using recursive ToArray calls, which leads to a lot of object creation and a lot of GC pressure under heavy load.

I think a more efficient way would be to use Pipes and ReadOnlyBuffers/Spans.
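To make the allocation point concrete, here is a minimal sketch of writing a frame straight into an `IBufferWriter<byte>` instead of building it from nested `ToArray` results. It is not code from spoa-dotnet: `FrameWriterSketch` and `WriteFrame` are hypothetical names, and the frame layout is deliberately simplified to a 4-byte big-endian length, one type byte, and an opaque payload (the real SPOP frame also carries flags and stream/frame ids, which are left out here).

```csharp
using System;
using System.Buffers;
using System.Buffers.Binary;

public static class FrameWriterSketch
{
    // Hypothetical helper: writes [4-byte big-endian length][type byte][payload]
    // directly into the writer's buffer, with no intermediate arrays.
    // The layout is a simplification assumed for this example.
    public static void WriteFrame(IBufferWriter<byte> output, byte frameType, ReadOnlySpan<byte> payload)
    {
        int bodyLength = 1 + payload.Length;              // type byte + payload
        Span<byte> span = output.GetSpan(4 + bodyLength); // at least this many bytes
        BinaryPrimitives.WriteInt32BigEndian(span, bodyLength);
        span[4] = frameType;
        payload.CopyTo(span.Slice(5));
        output.Advance(4 + bodyLength);                   // commit what was written
    }
}
```

Since `PipeWriter` implements `IBufferWriter<byte>`, the same method could write into the output side of a connection, and an `ArrayBufferWriter<byte>` can stand in for it in tests.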

The way Kestrel works with IDuplexPipe for HTTP requests is very thread-oriented. Transposed to SPOA, IDuplexPipe would let one thread handle a connection and read the input pipe as fast as it can (with automatic flow control). When a complete frame is readable, it can either be processed on the same thread (handshake/disconnect) or its payload can be written to another pipe (notify/unset). That pipe acts as a buffer for a fragmented message (one or many fragments). Another thread reads this pipe asynchronously and is notified only once the message payload has been completely written to it. It can then decode the list of messages and process them asynchronously. Output frames can either be written directly to the output side of the connection's pipe, or written by another thread (using a Pipe or a message buffer).
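The "pipe as a buffer for a fragmented message" idea can be sketched as follows. This is only an illustration under assumptions: `FragmentBufferSketch`, `AppendFragment`, and `ReassembleAsync` are hypothetical names, and writer and reader run sequentially in one method for brevity, whereas in the design described above they would live on different threads. It relies on the fact that bytes committed with `Advance` only become visible to the reader after `FlushAsync`.

```csharp
using System;
using System.Buffers;
using System.IO.Pipelines;
using System.Threading.Tasks;

public static class FragmentBufferSketch
{
    // Copies one fragment into the pipe's buffer. Nothing is visible to the
    // reader yet, because the writer has not flushed.
    public static void AppendFragment(PipeWriter writer, ReadOnlySpan<byte> fragment)
    {
        Span<byte> target = writer.GetSpan(fragment.Length);
        fragment.CopyTo(target);
        writer.Advance(fragment.Length);
    }

    // Hypothetical demo: buffers all fragments, flushes once, then reads the
    // whole message back from the reader side.
    public static async Task<byte[]> ReassembleAsync(byte[][] fragments)
    {
        var pipe = new Pipe();

        foreach (byte[] fragment in fragments)
            AppendFragment(pipe.Writer, fragment);

        // Flushing once the last fragment is in is what wakes up the reader.
        await pipe.Writer.FlushAsync();

        ReadResult result = await pipe.Reader.ReadAsync();
        byte[] message = result.Buffer.ToArray();   // copied only to return it from the demo
        pipe.Reader.AdvanceTo(result.Buffer.End);   // mark everything as consumed
        return message;
    }
}
```

A real consumer would decode messages directly from `result.Buffer` rather than copying it, and would need to keep buffered messages below the pipe's pause threshold or handle the resulting back-pressure.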

The real benefit of IDuplexPipe is being able to read directly from the pipe, but that means being able to read and write frames from ReadOnlySequence/ReadOnlySpan, to avoid object allocation as much as possible.
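As a rough sketch of what reading frames directly from the pipe could look like, the loop below slices length-prefixed frames out of the `ReadOnlySequence<byte>` that `PipeReader` hands back, without copying them. The names `FrameReaderSketch`, `TryReadFrame`, and `ReadFramesAsync` are hypothetical, and the only framing assumed is a 4-byte big-endian length prefix; decoding what is inside each frame is left out.

```csharp
using System;
using System.Buffers;
using System.IO;
using System.IO.Pipelines;
using System.Threading.Tasks;

public static class FrameReaderSketch
{
    // Tries to slice one length-prefixed frame out of the buffer.
    // On success, 'frame' is the frame body and 'buffer' is advanced past it.
    public static bool TryReadFrame(ref ReadOnlySequence<byte> buffer, out ReadOnlySequence<byte> frame)
    {
        frame = default;
        var reader = new SequenceReader<byte>(buffer);
        if (!reader.TryReadBigEndian(out int length))
            return false;                                   // length prefix not complete yet
        if (length < 0)
            throw new InvalidDataException("Negative frame length.");
        if (reader.Remaining < length)
            return false;                                   // frame body not complete yet

        frame = buffer.Slice(reader.Position, length);      // a view, not a copy
        buffer = buffer.Slice(frame.End);
        return true;
    }

    // Reads until the pipe completes and returns the number of complete frames seen.
    // 'onFrame' must not keep the sequence: it is only valid until AdvanceTo is called.
    public static async Task<int> ReadFramesAsync(PipeReader input, Action<ReadOnlySequence<byte>> onFrame)
    {
        int count = 0;
        while (true)
        {
            ReadResult result = await input.ReadAsync();
            ReadOnlySequence<byte> buffer = result.Buffer;

            while (TryReadFrame(ref buffer, out ReadOnlySequence<byte> frame))
            {
                onFrame(frame);
                count++;
            }

            // Consumed up to the first incomplete frame; examined everything received.
            input.AdvanceTo(buffer.Start, buffer.End);
            if (result.IsCompleted)
                break;
        }
        await input.CompleteAsync();
        return count;
    }
}
```

Passing `buffer.End` as the examined position tells the pipe not to wake the loop again until more data arrives, which is what keeps a partially received frame from causing a busy loop.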

Reference project that uses IDuplexPipe: https://github.com/inulogic/HAProxy.StreamProcessingOffload.AgentFramework

NickMRamirez commented 2 years ago

Discussion points: