bmatsuo / lmdb-go

Bindings for the LMDB C library
BSD 3-Clause "New" or "Revised" License

lmdb: Txn.Reset clears the Txn finalizer when it should not #104

Closed bmatsuo closed 7 years ago

bmatsuo commented 7 years ago

The docs for Txn.Reset do not mention that the method clears the finalizer on the Txn object. They do, however, indicate that everything the finalizer is meant to prevent is still possible after a Reset. So retaining the finalizer until Abort is called does seem like the correct behavior.

I probably did this for the sake of a benchmark without realizing its impact, so the performance cost of keeping the finalizer should be quantified.

bmatsuo commented 7 years ago

It seems like this was done because the finalizer could cause a segfault or panic when a Txn was used with the sync.Pool type. This is a little counterproductive, though, because any Txn objects freed by a sync.Pool without a finalizer will leak resources and never release their reader slots, causing the database to grow excessively.

However, I now believe the root cause of the problem I had originally observed is that the finalizer would fire after the Env had been closed.

This indicates that, with a finalizer enabled, naive use of a Txn with sync.Pool is not advised. At a minimum, when a sync.Pool is used it must be drained before the Env is closed. For example:

// Drain the pool, aborting any remaining transactions, before closing the Env.
for {
    txn, ok := txnPool.Get().(*lmdb.Txn)
    if !ok {
        break
    }
    txn.Abort()
}

err = env.Close()
//...

The most significant issue with the use of sync.Pool is that, because readers do not release their slots, the database will grow excessively on updates if a 'hot reader' ends up living for a very long time. It seems like it may be possible to avoid this by tracking Txn.ID() for every update and calling Txn.Abort() on any transactions whose ID differs from it by some amount (say, two or more); a sketch of this idea follows the quoted documentation below. AFAIK the Txn.ID() only changes when an update transaction is committed. Following is the documentation for mdb_txn_id.

Return the transaction's ID.

This returns the identifier associated with this transaction. For a read-only transaction, this corresponds to the snapshot being read; concurrent readers will frequently have the same transaction ID.
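
For concreteness, a minimal sketch of that staleness check might look like the following. It assumes readers are put back into the pool live (without being reset), so Txn.ID() still reports the snapshot they pin, and lastUpdateID is a hypothetical value recorded after each committed update; neither is part of the lmdb package API.

import (
    "sync"

    "github.com/bmatsuo/lmdb-go/lmdb"
)

// getReader returns a usable read-only transaction. A pooled reader whose
// snapshot lags the last committed update by two or more IDs is aborted so
// the pages it pins can be reclaimed.
func getReader(pool *sync.Pool, env *lmdb.Env, lastUpdateID uintptr) (*lmdb.Txn, error) {
    if txn, ok := pool.Get().(*lmdb.Txn); ok {
        // Reader IDs never exceed the last committed update ID, so this
        // subtraction measures how far behind the pooled snapshot is.
        if lastUpdateID-txn.ID() < 2 {
            return txn, nil
        }
        txn.Abort()
    }
    return env.BeginTxn(nil, lmdb.Readonly)
}

After the read, the caller would put the transaction back into the pool rather than aborting it.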

Probably the most minor, yet still annoying, issue arising from the combination of sync.Pool and the Txn finalizer is that the finalizer logs a warning whenever it has to clean up a Txn, even though finalizer cleanup is expected behavior for objects dropped from a sync.Pool.

bmatsuo commented 7 years ago

If you want to be really technical about it, modifying the finalizer behavior is not backwards compatible. But the package's stance is that finalizers on the types it defines are entirely its domain and not part of the "public API". If a user clears or changes the finalizer on a value allocated by the lmdb package, the package does not guarantee them a way to restore an appropriate finalizer if necessary.

So this is just a bug that could result in leaked resources and excessive volume utilization, and it will be fixed in the v1.9.0 release (it is not a patch release because this behavior is actually quite old).

lmb commented 7 years ago

FWIW, I think lmdb Txn should never be put into a sync.Pool.

From the doc of sync.Pool:

Pool's purpose is to cache allocated but unused items for later reuse, relieving pressure on the garbage collector. That is, it makes it easy to build efficient, thread-safe free lists. However, it is not suitable for all free lists. (emphasis mine)

I don't think it makes sense to tie reclaiming Txn to GC cycles. The application has to decide when / if this happens.

bmatsuo commented 7 years ago

@lmb, Thank you for your comment and concern regarding this issue.

I have read this documentation. While it is true that garbage collection pressure is not the issue, since the pooled objects are allocated in C, my benchmarking and my experience trying to increase the performance of my raft-mdb fork have shown me that pooling Txn objects can lead to drastic performance benefits. My current understanding is that the gain from pooling transactions comes not from reduced memory allocation but from reduced contention on the reader lock table.

I have recently updated the lmdb benchmarks to include a reset/renew benchmark that accurately demonstrates what I have stated above. You may also look at my fork of raft-mdb, which has such benchmarks as well.
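
For reference, the shape of that comparison is roughly the following. This is only a sketch, not the package's actual benchmark code, and it assumes env is a package-level *lmdb.Env opened elsewhere (e.g. in TestMain).

import (
    "testing"

    "github.com/bmatsuo/lmdb-go/lmdb"
)

var env *lmdb.Env // assumed to be opened in TestMain

// BenchmarkBeginTxn measures the cost of acquiring a fresh reader slot for
// every read-only transaction.
func BenchmarkBeginTxn(b *testing.B) {
    for i := 0; i < b.N; i++ {
        txn, err := env.BeginTxn(nil, lmdb.Readonly)
        if err != nil {
            b.Fatal(err)
        }
        txn.Abort()
    }
}

// BenchmarkRenew measures the cost of reusing a single reader slot by
// resetting and renewing one transaction.
func BenchmarkRenew(b *testing.B) {
    txn, err := env.BeginTxn(nil, lmdb.Readonly)
    if err != nil {
        b.Fatal(err)
    }
    defer txn.Abort()
    txn.Reset()
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        if err := txn.Renew(); err != nil {
            b.Fatal(err)
        }
        txn.Reset()
    }
}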

Currently, AFAICT, the largest potential downside of transaction pooling (provided that the finalizer is working) is that you may have idle readers in the pool when an update is committed, which leads to the accumulation of dirty pages until the transaction is either collected from the pool and finalized or removed from the pool and renewed. The dirty pages cause the database to grow, but the growth is not indefinite.

Further, my current implementation of lmdbpool.TxnPool allows the user to control how stale and outstanding readers are handled when updates are committed. This gives applications some level of tuning with regard to bounding the accumulation of dirty pages.

I don't intend to force transaction pooling onto users. Indeed, if it gets integrated into the core lmdb package it probably will not even be the default mode. Though I do think that integrating it into the core package would produce the cleanest result, at the moment.

I do agree that, in general, completely deferring to the GC to abort all transactions is a very bad idea. That is why I have the finalizer emit a serious warning when it aborts a non-pooled transaction.

My belief is that if you carefully read the C documentation and the lmdbpool code you can become convinced that the implementation is not unsound (at least for some configurations and application use cases). If it does indeed turn out that a sync.Pool is not adequate then another custom pooling implementation could be experimented with. lmdbpool currently uses sync.Pool, yes. But the pool is an internal structure and could be swapped out for a different kind of pooling structure.

One of my driving concerns is that other users will try to implement pooling themselves and make really poor implementation choices. So I think that there should be an official/reference pooling implementation which does things as correctly as possible. Please let me know your thoughts and we can continue to work through this. But you may consider making a new issue to bring up new topics, given that I closed this issue and it is even somewhat off-topic here.

This issue was about the lmdb package clearing the finalizer when it absolutely should not (regardless of the use of sync.Pool).

lmb commented 7 years ago

I think we're misunderstanding each other.

  1. Yes, free lists can be beneficial for lmdb Txns. I'm not saying never put a Txn into a free list; I'm saying never put a Txn into a sync.Pool. sync.Pool makes no guarantees whatsoever about deallocation, and no guarantee that a finalizer is called when a Pool is GCed. This means you will leak LMDB reader slots over time, rendering the DB inaccessible.

  2. You probably know this, but your benchmark shows such a marked difference between Renew and Begin because you call SetMaxDBs with a very high number. I doubt that many applications will use that many DBIs. E.g. I set a max of 5 DBIs, and BeginTxn takes 2000ns while Renew takes 700ns.

bmatsuo commented 7 years ago

and no guarantees that a finalizer is called when a Pool is GCed.

I'm not sure what use case you are arguing for here. Why would you create many pools and allow them to be GCed over the course of your application's lifetime? It seems to me that the common scenario is that a pool exists for the entire lifetime of an application.

If a pool is only GCed when the program terminates, then a call to Env.ReaderCheck when the program starts again would clear any reader slots that failed to be cleaned up from a pool on the previous execution. A real program should be calling Env.ReaderCheck anyway, so this is not really a big deal.
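
For what it's worth, that startup check is tiny. A sketch, assuming env is an open *lmdb.Env and using the standard log package for reporting:

// ReaderCheck clears reader table slots left behind by processes that
// exited without releasing them, and reports how many were cleared.
stale, err := env.ReaderCheck()
if err != nil {
    log.Fatalf("reader check failed: %v", err)
}
if stale > 0 {
    log.Printf("cleared %d stale reader slot(s)", stale)
}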

I do have a method, TxnPool.Close, which does its best to flush the pool of any remaining transactions, with the expectation that this method is called before Env.Close. Now, I understand the documentation for Pool.Get does not guarantee that a pool can be flushed. My argument is that in practice it does actually work pretty well, and Env.ReaderCheck is there to cover you anyway. Maybe it won't work for every application. Unfortunately, those applications can't use the TxnPool.

Now, it would be very bad if an idle txn could sit indefinitely inside a sync.Pool. I understand this very well. Write-heavy applications which only infrequently access the pool are most at risk. But read-heavy applications should actually cycle the pool effectively in practice.

In the end, I may implement a stricter pool that behaves more like a bounded queue/stack. sync.Pool was the easy first step. Ideally, I wouldn't mention sync.Pool at all in the lmdbpool documentation, were it not for the fact that it does have peculiarities.

you call SetMaxDBs with a very high number

I actually didn't remember setting MaxDBs to 64k. That was a strange decision I made years ago, and I should address it. But the performance benefits are not seen solely in these benchmarks. As I said, the raft-mdb benchmarks also show the relevance of the change, and those benchmarks have always called Env.SetMaxDBs with the value 2. Basically the whole txnpool implementation came out of trying to beat boltdb's read benchmarks for raft.

Also you are still talking about more than a 2x speedup with only 5 DBs. That is not insignificant.

lmb commented 7 years ago

Behind the scenes, a sync.Pool keeps a weak list of "things" per runtime thread, using a bit of black magic. Any time the GC runs (which you don't have control over), that weak list can be and is emptied. No destructors are called, and finalizers aren't guaranteed to be called anyway. Have a look at the implementation; that might make things clearer.

That's why the doc stresses that sync.Pool is about reducing pressure on the GC. It's not meant to be a generic free list.

bmatsuo commented 7 years ago

My understanding is that finalizers aren't guaranteed to be called in general because of program termination. When the program terminates no attempt is made to GC, and thus no attempt is made to finalize any outstanding objects. As I stated, readers left open when a program terminates may not be ideal but they are not too harmful overall because Env.ReaderCheck handles them when the application starts.

The worst way to leak a transaction would be to do it in the middle of a long-running program execution, as the leak cannot be detected by Env.ReaderCheck until after the program has exited. If this is truly possible then indeed the use of sync.Pool would not be feasible.

However, my reading of the mailing lists has indicated that finalizers are supposed to be called for items in the pool, and in practice I have seen pooled items have their finalizer called. I am looking at pool.go in the sync package now, and I don't see what about the implementation makes the references to pooled objects 'weak'. It seems like there is simply a runtime hook that releases pooled objects at the beginning of a GC (by assigning all internal references to nil). So it's not clear to me that this results in finalizers not being called.

https://golang.org/src/sync/pool.go#L211

If I do not see something in the source which you think is obvious, please show me.
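
A standalone experiment along these lines could look like the following. It is only a sketch of my reading of pool.go, not anything guaranteed by the sync documentation; whether and when the finalizer fires is up to the runtime and may vary between Go versions.

package main

import (
    "fmt"
    "runtime"
    "sync"
    "time"
)

type resource struct{ id int }

func main() {
    pool := &sync.Pool{}

    r := &resource{id: 1}
    runtime.SetFinalizer(r, func(r *resource) {
        fmt.Println("finalizer ran for pooled resource", r.id)
    })
    pool.Put(r)
    r = nil

    // Each GC gives the pool's cleanup hook a chance to drop its internal
    // reference; once the object is unreachable the runtime queues the
    // finalizer. The sleeps give the finalizer goroutine time to run.
    for i := 0; i < 5; i++ {
        runtime.GC()
        time.Sleep(10 * time.Millisecond)
    }
}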

lmb commented 7 years ago

My understanding is that finalizers aren't guaranteed to be called in general because of program termination.

You're right, I had misunderstood the notes on SetFinalizer, sorry!

I don't see what about the implementation is making the references to pooled objects 'weak'

The poolCleanup function does. It's a hack basically.

With that said, I still wouldn't use sync.Pool. Why do you want to tie the lifetime of the reserved transactions to the GC cycle?

bmatsuo commented 7 years ago

Why do you want to tie the lifetime of the reserved transactions to the GC cycle?

This is a valid question. I have not implemented a custom free list because it can be tricky to determine how many items to hold onto before additional items are thrown away (aborted, in the case of Txns).

For an lmdb.Env the maximum size of the pool is the value of MaxReaders (126 by default), which must be configured before Env.Open is called. So does it make sense that the free list be allowed to grow to MaxReaders size? Possibly. But maybe not. Can I issue an Env.Copy with MaxReaders in the free list? I doubt it.

I can implement a bounded free list so that the application developer is required to specify the maximum number of items it may hold. But I believe Go's general design philosophy is that application developers cannot determine a proper bound much better than I can as the package author, and they are prone to choose a bad value.
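
For concreteness, the simplest form of such a bounded free list might look like this sketch, using a buffered channel as the bound. The names are hypothetical and this is not the lmdbpool implementation.

import "github.com/bmatsuo/lmdb-go/lmdb"

// boundedTxnPool is a hypothetical bounded free list for read-only
// transactions; the channel capacity fixes the maximum number of idle readers.
type boundedTxnPool struct {
    env  *lmdb.Env
    free chan *lmdb.Txn
}

func newBoundedTxnPool(env *lmdb.Env, size int) *boundedTxnPool {
    return &boundedTxnPool{env: env, free: make(chan *lmdb.Txn, size)}
}

// Get reuses an idle reader if one is available, renewing its snapshot,
// and otherwise begins a new read-only transaction.
func (p *boundedTxnPool) Get() (*lmdb.Txn, error) {
    select {
    case txn := <-p.free:
        if err := txn.Renew(); err != nil {
            txn.Abort()
            return nil, err
        }
        return txn, nil
    default:
        return p.env.BeginTxn(nil, lmdb.Readonly)
    }
}

// Put resets the reader and stores it if there is room; otherwise the
// reader slot is released back to LMDB.
func (p *boundedTxnPool) Put(txn *lmdb.Txn) {
    txn.Reset()
    select {
    case p.free <- txn:
    default:
        txn.Abort()
    }
}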

Again, I'm not against the idea of implementing a custom free list. It's merely non-trivial.

I made #116 regarding this issue; please respond to this comment in that thread.