Fix transactions freed by deallocation in child process

callumwalker commented 5 months ago

Hi,

These patches prevent a child process removing a reader that was created in the parent process from the reader table on deallocation. This doesn't prevent any deliberate misuse of the transaction within a child process just unavoidable issues caused by deallocation e.g. when the child process exits.

Fixes #346

jnwatson commented 5 months ago

Thank you for your contribution. Could I bother you for a unit test?

jnwatson commented 5 months ago

Thank you for the unit test. That will be handy.

This raises a larger issue that I don't think your patch fixes. Having a subprocess use a transaction created by it parent in general is a bad idea. This is probably the source of multiprocessing-related hangs and crashes (e.g. #350).

The path forward I think is to hook the fork (e.g. via os.register_at_fork) and close the transaction pre-fork.

callumwalker commented 5 months ago

Hi,

Thanks for looking at my changes. I haven't addressed the issue of actually using a transaction post-fork as I wanted to keep the changes minimal.

The path forward I think is to hook the fork (e.g. via os.register_at_fork) and close the transaction pre-fork.

Closing the transaction pre-fork would mean that any long running transactions in the parent would need to be re-opened after the fork possibly giving a new view on the database.

I believe that it is safe for an LMDB transaction to exist across a fork as long as the transaction isn't then used in the child process, so can I suggest that instead of closing the transaction pre-fork we add a post-fork hook in the child that adds a flag to the environment. We check that flag at the start of each method on the env/trans/cursor object and raise an exception if its set.

I'm happy to make the changes if you're open to them; it would mean that we could remove a python wrapper that achieves the same thing from our code.

jnwatson commented 5 months ago

I've been noodling this a bit. I appreciate your patience in resolving this.

I agree with your statement "I believe that it is safe for an LMDB transaction to exist across a fork as long as the transaction isn't then used in the child process". I wasn't considering this use case.

I'll note that upstream doesn't support using the forked LMDB environment by the child at all, so we're already in uncharted territory.

I'm in general concerned that we're not preserving the underlying mutex semantics. I was also concerned about leaking lock handles, though it seems that upstream thinks that's inevitable.

We have 3 use cases to consider for the child after a fork:

The LMDB environment has no active transactions. I'd really like to see this allow the child not to leak resource if the environment is closed. The child should be able to use a cached read transaction (but not the same one as from the parent).
The LMDB environment has active transactions and the child makes new transactions.
The LMDB environment has active transactions and the child uses an existing transaction. This should return an error.

In order to support all uses cases, I think we have to do a mix of your suggestions and mine. To support case 1, we should hook the fork and close the cached transaction in the parent only if it is not in use. This would prevent the transaction in the child from having to leak.

I'm less clear about what you mean by marking the environment. We still want to allow the child to operate on the environment object. That means it can't use the cached transaction if it was created by the parent. Perhaps the easiest is on child post-fork-hook, we leak the cached transaction (i.e. set env->spare_txn to NULL). To catch invalid uses, we'd have to add a PID field to every LMDB object: cursor, transaction, etc., and compare it on every use to the current pid. I'm unsure if this is worth it except for perhaps in a debug mode.

I have a potentially simpler alternative I'd like to get your opinion on. The primary problem at least in your instance is use of the cached transaction. If we add an option of "multiprocess" at environment creation that simply disables caching of the transaction object, all this could be avoided.

callumwalker commented 5 months ago

I'm less clear about what you mean by marking the environment. We still want to allow the child to operate on the environment object.

To catch invalid uses, we'd have to add a PID field to every LMDB object: cursor, transaction, etc., and compare it on every use to the current pid. I'm unsure if this is worth it except for perhaps in a debug mode.

I thought that using the environment wasn't allowed post-fork which is why I suggested marking the environment as invalid. If it is valid we could add a post-fork handler to mark the transactions/cursors invalid so we only have to check that flag on each request but I'm not fussed if that is done. E.g. (as python pseudo code)

# Assigned to in `trans_new` and `cursor_new`
_transactions = weakref.WeakSet()
_cursors = weakref.WeakSet()
def _invalidate_on_fork():
    for _trans in _transactions:
        _trans.valid = 0
    for _cursor in _cursors:
        _cursor.valid = 0

os.register_at_fork(_invalidate_on_fork)

I have a potentially simpler alternative I'd like to get your opinion on. The primary problem at least in your instance is use of the cached transaction. If we add an option of "multiprocess" at environment creation that simply disables caching of the transaction object, all this could be avoided.

We'd still need to avoid calling trans_clear(self); when calling trans_dealloc in a subprocess but that does remove most of the other changes I've made and I'm happy if we go with that approach.

jnwatson / py-lmdb

Fix transactions freed by deallocation in child process #363