CoreyKaylor / Lightning.NET

.NET library for LMDB key-value store
Other
398 stars 82 forks source link

Re-evaluate public API for possibly using Span overloads #116

Closed CoreyKaylor closed 4 years ago

CoreyKaylor commented 4 years ago

Where appropriate, Span overloads can provide more flexibility and potentially improve performance over forcing byte[].

AlgorithmsAreCool commented 4 years ago

I can probably help here, i've worked alot with Span in the last year

CoreyKaylor commented 4 years ago

You got it.

AlgorithmsAreCool commented 4 years ago

So giving this some thought,

I think it makes obvious sense to pass keys as ReadOnlySpan<byte> and also to pass Put values as ReadOnlySpan<byte>.

This allows the use of stackalloc (which doesn't require unsafe anymore 😃) or unmanaged buffers for interop puroposes.

There is another big oppurtunity however, Reads.

Today an array is allocated when returning data since the buffer LMDB provides is only valid during the scope of the transaction.

I think this is actually the easiest way to use the API, but for high performance scenarios it is very problematic.

Here are a few thoughts of how this might be improved. We can pick some of them or none if you don't like them.

Option 1. Preallocated Buffer

public enum GetResult
{
  Done,
  DestinationTooSmall,
  KeyNotFound
}

public class Transaction
{
  //probably needs a way to know how big the value is on failure  
  public GetResult TryGet(LightningDatabase db, ReadOnlySpan<byte> key, Span<byte> destination);
}

Option 2. Dynamic Allocator

public delegate Span<T> Allocator<T>(int sizeHint);
public class Transaction
{
  public bool TryGet(LightningDatabase db, ReadOnlySpan<byte> key, Allocator<T> allocator, out Span<T> value);
}

Option 3. Inspector function

public class Transaction
{
  public bool TryGet(LightningDatabase db, ReadOnlySpan<byte> key, TArg arg, SpanAction<byte, TArg> inspector);
}

4. Memory Pools

public class Transaction
{
  public class RentedBuffer : IDisposable
  {
    private int _length;
    private readonly byte[] _buffer;

    public ReadOnlyMemory<byte> Memory => _buffer.AsMemory(0, _length);
    public ReadOnlySpan<byte> Span => _buffer.AsSpan(0, _length);

    public void Dispose() => ...
  }

    public bool TryGet(LightningDatabase db, ReadOnlySpan<byte> key, out RentedBuffer value);
}
CoreyKaylor commented 4 years ago

I would lean towards option 1 myself until the added complexity is justified down the road. This library has always for me had a sweet spot of keeping the api simple and straightforward. Happy to hear disagreements otherwise though.

AlgorithmsAreCool commented 4 years ago

I have also enjoyed how easy this library is to use.

Looking at option 1 what do you think of this API shape? It deals with the problem of not knowing how big to make your buffer after failure

(although the names don't feel very good to me 😕)

public enum GetStatus //names are hard...
{
  Done,
  DestinationTooSmall,
  KeyNotFound
}

public readonly struct GetResult {

  public GetStatus Status { get; }

  // Meaning of Length depends on Status value  
  // Done                 -> The number of bytes writtern to destination span
  // DestinationTooSmall  -> The size of the value in the database
  // KeyNotFound          -> Set to 0 and has no meaning   
  public int Length { get; } //Name could also be 'ValueLength'
}

public class Transaction
{
  public GetResult TryGet(LightningDatabase db, ReadOnlySpan<byte> key, Span<byte> destination);
}

As an aside, I 'feel' like option 3 would be really cool, but there would have to be a big perf win for me to push for it, since it is rather exotic.

CoreyKaylor commented 4 years ago

I think the API changes look good personally. I'm game if we keep 3 separate from the other changes. Also, unless I'm thinking about this naively (quite possible)... 3 can be an extensions of 1, right?

AlgorithmsAreCool commented 4 years ago

Yeah, I don't intend to bring option 3 into this. I'll make it a separate issue that can be decided later.

I'll work on a PR for option 1 tonight or tomorrow.

AlgorithmsAreCool commented 4 years ago

Todos:

AlgorithmsAreCool commented 4 years ago

Fyi, I'm taking a couple of extra days to ensure I'm handling pinning correctly, I don't want to introduce a subtle bug.

CoreyKaylor commented 4 years ago

@AlgorithmsAreCool how are things coming?

AlgorithmsAreCool commented 4 years ago

Sorry, this week got hectic at work. Expect a PR by tomorrow

AlgorithmsAreCool commented 4 years ago

I didn't specify time-zone btw, 😅. But down to just docs now.

AlgorithmsAreCool commented 4 years ago

Remaining open questions