Spreads / Spreads.LMDB

Low-level zero-overhead and the fastest LMDB .NET wrapper with some additional native methods useful for Spreads
http://docs.dataspreads.io/spreads/libs/lmdb/api/README.html
Mozilla Public License 2.0

Why no DirectBuffer overloads for PutAsync() ? #41

Closed BernMcCarty closed 4 years ago

BernMcCarty commented 4 years ago

And a follow-up question is: If I use Put() (not PutAsync) and I am using disableAsync: false in my environment, will the Put() go through the blocking queue as with PutAsync()?

I am using async/await and also using DirectBuffer and Put() and I get random crashes. All of the Spans under my DirectBuffers are on the stack so they are unaffected by GC. Version="2020.0.114"

buybackoff commented 4 years ago

I noticed that PutAsync is just Task.Run over Put, and it should not be that way. Obviously this doesn't use the blocking queue. This method should not exist.

Do you really need background writes? I was thinking of removing this altogether because the performance is not great and it's very hard to avoid allocations in that case. Otherwise it will require a quite complex rewrite, as mentioned in #31.

> I am using async/await and also using DirectBuffer and Put() and I get random crashes.

If a write transaction starts on one thread and then jumps to another thread due to async/await, that is not supported by LMDB.
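To illustrate the thread jump described here, the following is a minimal C# sketch that does not touch LMDB or Spreads.LMDB at all. The class and method names (`AwaitThreadHop`, `HopAsync`, `Observe`) are invented for this illustration. It starts an async method on a dedicated thread with no synchronization context and records the managed thread id before and after an `await`. In this setup the continuation is queued to the thread pool, so the two ids differ.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class AwaitThreadHop
{
    // Records the managed thread id before and after a single await.
    private static async Task<(int Before, int After)> HopAsync()
    {
        int before = Environment.CurrentManagedThreadId;

        // With no SynchronizationContext, Task.Yield queues the rest of
        // this method to the thread pool instead of resuming on the caller.
        await Task.Yield();

        int after = Environment.CurrentManagedThreadId;
        return (before, after);
    }

    // Starts HopAsync on a dedicated (non-pool) thread and blocks that
    // thread until the method completes, then returns both ids.
    public static (int Before, int After) Observe()
    {
        (int Before, int After) result = default;
        var thread = new Thread(() => result = HopAsync().GetAwaiter().GetResult());
        thread.Start();
        thread.Join();
        return result;
    }
}
```

If a write transaction were opened before the `await` and used after it, the second half would run on a different thread than the one that began the transaction, which is the unsupported case.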

> All of the Spans under my DirectBuffers are on the stack so they are unaffected by GC.

Do you use stackalloc so that your memory is on the stack, or are only the DirectBuffers on the stack?
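The distinction in this question can be sketched in plain C#. A `byte[]` lives on the managed heap and may be moved by the GC unless it is pinned, so a raw pointer to it can go stale even when the struct holding that pointer sits on the stack. Memory obtained with `stackalloc` is part of the current stack frame and is not moved. The helper below is hypothetical and only shows an Int64 key being serialized into stack memory; wrapping the span in a DirectBuffer is library-specific and deliberately left out.

```csharp
using System;
using System.Buffers.Binary;

public static class StackKeyExample
{
    // Writes an Int64 key into stack-allocated bytes and reads it back.
    // A pointer-based wrapper would be built over keyBytes where the
    // comment indicates; that step is omitted here.
    public static long RoundTripKey(long key)
    {
        // Stack memory: valid until this method returns, never relocated by the GC.
        Span<byte> keyBytes = stackalloc byte[sizeof(long)];
        BinaryPrimitives.WriteInt64LittleEndian(keyBytes, key);

        // ... wrap keyBytes and pass it to the store here (not shown) ...

        return BinaryPrimitives.ReadInt64LittleEndian(keyBytes);
    }
}
```

By contrast, `BitConverter.GetBytes(key)` returns a new heap array, which is the kind of buffer that needs pinning before its address is handed to native code.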

BernMcCarty commented 4 years ago

I found my problem. I was so focused on my values that I forgot about my Int64 keys. Once I placed the key bytes on the stack too, everything started working. Now everything is really on the stack, as I had claimed. I am using async/await only so that I can fetch the next batch of data from MongoDB while I am putting the current batch into LMDB. I am doing everything sync on the LMDB part, with one transaction per batch, and it is working fine. I would like to understand your support for async better though (if you are going to keep going with it).
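The batching pattern described in this comment can be sketched as follows. This is an assumption-laden outline, not code from the poster: `BatchPipeline`, `fetchNextBatchAsync` and `writeBatch` are hypothetical names, the fetch delegate is assumed to return an empty list when the source is exhausted, and `writeBatch` is assumed to open, fill and commit one write transaction synchronously.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class BatchPipeline
{
    // Overlaps fetching batch N+1 with the synchronous write of batch N.
    // Returns the total number of items passed to writeBatch.
    public static async Task<int> RunAsync<T>(
        Func<Task<IReadOnlyList<T>>> fetchNextBatchAsync, // hypothetical: empty list means "done"
        Action<IReadOnlyList<T>> writeBatch)               // hypothetical: one sync write transaction
    {
        int written = 0;
        Task<IReadOnlyList<T>> pending = fetchNextBatchAsync();

        while (true)
        {
            IReadOnlyList<T> current = await pending;
            if (current.Count == 0) break;

            // Start the next fetch first so it runs while we write.
            pending = fetchNextBatchAsync();

            // No await between the start and end of this call, so the
            // whole transaction stays on whichever thread runs it.
            writeBatch(current);
            written += current.Count;
        }

        return written;
    }
}
```

Different batches may be written on different threads, but each individual transaction begins and commits on a single thread because there is no `await` inside `writeBatch`.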

buybackoff commented 4 years ago

> but I am doing everything sync on the LMDB part

Then you do not need a background thread with the queue or async support from this library. That async support is needed only if you use C# async/await inside a write transaction and the code inside such a transaction could jump threads. I added that support "because I could", but in reality it's a very bad practice to keep LMDB transactions open longer than needed, because this doesn't allow LMDB to reuse pages. And for a single writer it is also useless.

So the crucial thing is to always do LMDB write transactions in one thread. In that case you are fine and should not care on which particular thread it's happening. Just do not jump threads inside transactions. Other than async/await this also applies to any blocking calls such as WaitHandles or locks.
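One way to catch accidental thread jumps during development is a small guard object. The sketch below is a hypothetical debugging aid, not part of Spreads.LMDB or LMDB: `ThreadAffinityGuard` remembers the managed thread id at construction and throws if `Check()` is later called from a different thread. The idea would be to create it right after beginning a write transaction and call `Check()` before each put and before commit.

```csharp
using System;

// Hypothetical helper: remembers the creating thread and verifies that
// later calls arrive on that same thread.
public sealed class ThreadAffinityGuard
{
    private readonly int _ownerThreadId = Environment.CurrentManagedThreadId;

    // Throws if the caller is not on the thread that created the guard.
    public void Check()
    {
        int current = Environment.CurrentManagedThreadId;
        if (current != _ownerThreadId)
        {
            throw new InvalidOperationException(
                $"Write transaction started on thread {_ownerThreadId} " +
                $"but is being used on thread {current}.");
        }
    }
}
```

A failing check turns a hard-to-diagnose native crash into a managed exception that points at the offending call site.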

Read transactions could span threads because MDB_NOTLS is forced in this (Spreads) library, as is mentioned in the readme.