jamescourtney / FlatSharp

Fast, idiomatic C# implementation of Flatbuffers
Apache License 2.0
497 stars 50 forks source link

Implementation of IInputBuffer for ReadOnlySequence<byte> #380

Closed goldsam closed 1 year ago

goldsam commented 1 year ago

Does it make sense to provided an implementation of IInputBuffer for ReadOnlySequence<byte>? Have you done any experimentation with this? Would the performance be too poor for smaller objects as compared to copying to a temporary contiguous array for smaller objects?

I was led down this path through investigating using System.IO.Pipelines for high performance consumption of data from a pipe or socket on Linux.

jamescourtney commented 1 year ago

Hi there!

I was recently looking at ReadOnlySequence<byte> for 64 bit support scenarios. It gets a little awkward since data can be spread across a Span<byte> boundary. I didn't make it as far as a full prototype (since I was investigating this as part of 64 bit support instead of on its own merits), but it does sound interesting. I suspect that performance will not be great since you'll need quite a bit of logic for dealing with Span boundaries, but I don't see why it wouldn't be possible. The only awkward thing might be when FlatSharp wants to return a ReadOnlyMemory<byte> for a [ ubyte ] vector. That would likely require a copy, which would be unexpected for Lazy/Progressive vectors. The other option is to throw an exception, which I think is my preference in the interest of explicitness.

jamescourtney commented 1 year ago

If you're interested in taking a shot at this, I'll be happy to do the PR. Otherwise, I might try it in the next week or so.

jamescourtney commented 1 year ago

I've taken a look at this. Unfortunately, it will require an interface change and some changes to the codegen. The reason is that FlatSharp sometimes asks the IInputBuffer for the entire Span<byte>, which of course ReadOnlySequence<byte> can't provide. We'd need to add a .GetSpan(int offset, int length) API to IInputBuffer, which would constitute a breaking change.

It might be a bit more than you're comfortable biting off, however.

jamescourtney commented 1 year ago

Hi again -- quick update. I've prototyped this, and the performance isn't good. The issue boils down to reading data out of the ReadOnlySequence<byte>. This structure isn't ideal for random access. My implementation relies on BufferExtensions.CopyTo for the heavy lifting:

// IInputBuffer.ReadInt
public int ReadInt(int offset)
{

    ReadOnlySequence<byte> seq = this.sequence;
    // Reasonable overhead
    Span<byte> temp = stackalloc byte[4];

    // This method call results in a 257 byte method as emitted by the JIT. This won't be fast.
    seq.Slice(offset, sizeof(int)).CopyTo(temp);

    // Fast
    return BinaryPrimitives.ReadInt32LittleEndian(temp);
}

You can see this here

ReadOnlySequence<byte> is really intended to be read left-to-right, not randomly. However, reading a FlatBuffer results in non-sequential jumps throughout the buffer, both forward and back. While I can build an IInputBuffer based on ReadOnlySequence, I'm leaning against including this in the library because I want the API to steer people to do the right thing.

In this case, it will almost certainly be faster to copy data from the sequence into a pooled array, and then hand that array to FlatSharp.

Do you have any thoughts here?

goldsam commented 1 year ago

I've been continuing to tinker with ReadOnlySequence<byte> (outside the context of FlatSharp) and came to the same conclusion. Your suggested solution is fast and idiomatic.

I do really appreciate you taking the time to investigate this. Also, thank you for this awesome library. 😊

jamescourtney commented 1 year ago

Also, thank you for this awesome library

Hey, thanks for the kind words!