matthchr opened this issue 5 years ago
Is the problem reproducible on the result of pickling and then unpickling your tree? If this is the case, you could attach the pickle of your tree to this issue and we would be able to analyse it. Otherwise, there is a way to mostly repair your tree with the approach described below. Of course, it is possible that your tree can no longer be pickled.
Note that you should also tell us which ZODB version you are using.
A BTree consists of two parts: a tree-shaped access structure and a linked list of buckets. Your observation indicates that the access structure is damaged. Fortunately, all the relevant information is contained in the linked bucket list. You can rebuild the tree from this list: create a new tree, iterate over the bucket list and put its contents into the new tree. Thus, you will be able to recover. Nevertheless, it might be interesting to learn in what way the structure is damaged.
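For illustration, a minimal sketch of this rebuild approach, assuming the bucket chain itself is intact (the helper name rebuild_from_buckets and the use of OOBTree are assumptions for the example, not part of the original report):

from BTrees.OOBTree import OOBTree

def rebuild_from_buckets(broken):
    # Rebuild a tree whose access structure is damaged but whose bucket
    # chain is intact, by walking the linked bucket list.
    repaired = OOBTree()
    state = broken.__getstate__()
    if state is None:                 # empty tree: nothing to copy
        return repaired
    if len(state) < 2:                # tiny tree with a single embedded bucket
        repaired.update(broken)       # plain iteration suffices here
        return repaired
    bucket = state[1]                 # second state element: the first bucket
    while bucket is not None:
        items, *rest = bucket.__getstate__()   # ((k1, v1, k2, v2, ...), [next bucket])
        for i in range(0, len(items), 2):
            repaired[items[i]] = items[i + 1]
        bucket = rest[0] if rest else None     # follow the linked list
    return repaired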
@d-maurer thanks for the quick response. We're using ZODB 5.5.1, BTrees 4.5.1
Yes, the problem persists through pickling -- even modifying the broken BTree in a variety of ways (such as removing other items and then re-persisting it) doesn't make the issue go away.
I've attached the BTree in question, saved into a basic ZODB with the same version mentioned above, in matthchr.zip -- unzip it and it contains matthchr.db.
You can crack it open like so:
import ZODB

zodb = ZODB.DB('matthchr.db')
conn = zodb.open()
repro = conn.root()['a']
I'm not super interested in repairing this (there's not any critical data in the BTree which I am afraid of losing) -- more in understanding either what we did wrong to cause this, or if there is a bug in ZODB/BTrees which needs to be fixed.
Matthew Christopher wrote at 2019-3-15 14:59 -0700:
... We're using ZODB 5.5.1, BTrees 4.5.1
... I've attached the BTree in question ...
... I'm not super interested in repairing this ... more in understanding either what we did wrong to cause this, or if there is a bug in ZODB/BTrees which needs to be fixed.
I have found what is wrong with your tree.
From the documentation (--> "Development.txt" in the "BTrees" package), the buckets of a non-empty tree must themselves never be empty.
In your tree, this condition is violated: the last bucket is empty.
You can verify this by looking at its state, repro.__getstate__(). This is a pair whose first element represents the first level structure of the tree and whose second element is the reference to the first bucket. The structure is represented by a tuple of the form "b_0, s_1, b_1, ..., b_n-1, s_n, b_n" where the "b_i" are buckets and the "s_i" separators.
"repro.__getstate__()[0][-1]" gives the empty "BTrees.OOBTree.OOBucket([])".
That you see such an inconsistency likely indicates a bug in the "BTrees" code. The static state alone does not allow us to determine with certainty what this bug could be, but as conflict resolution is particularly difficult (and "BTrees" tries to resolve conflicts aggressively), a conflict resolution problem is a plausible candidate.
Is your tree used in an environment with high conflict risk (i.e. many concurrent writes)?
Matthew Christopher wrote at 2019-3-15 14:59 -0700:
... We're using ZODB 5.5.1, BTrees 4.5.1
... and python 3.
The instance where we saw this bug may have had concurrent writes. We actually do our best to avoid them via locking, but after I saw this I went and looked and realized there's at least one place where we weren't locking properly and could've allowed a concurrent write (an add with a simultaneous delete).
I don't think "many" would be accurate though. At most, 2-3 concurrent writes would've happened, certainly not more than that.
I've seen 5-6 additional repros of this issue since I started looking for it. I'll make some changes to improve our locking and see if that resolves the problem, as a concurrency issue was what it felt like to me as well...
Matthew Christopher wrote at 2019-3-15 17:15 -0700:
... I've seen 5-6 additional repro's of this issue since I started looking for it, I'll make some changes to improve our locking and see if that resolves the problem, as a concurrency issue was what it felt like to me as well...
I have tried to detect weaknesses in the conflict resolution which might lead to an empty bucket -- without success. The current "BTrees" version seems to detect these cases reliably and refuses to resolve the conflict in case of a risk.
Have your broken trees been created with a different "BTrees" version or did they go through a Python 2 --> Python 3 migration?
Matthew Christopher wrote at 2019-3-15 17:15 -0700:
... I've seen 5-6 additional repro's of this issue since I started looking for it, I'll make some changes to improve our locking and see if that resolves the problem, as a concurrency issue was what it felt like to me as well...
The inconsistency of your tree is detected by BTrees.check.check:

from BTrees.check import check
tree = c.root()["a"]
check(tree)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/dieter/tmp/xmlsec3/python/lib/python3.6/site-packages/BTrees/check.py", line 424, in check
    Checker(btree).check()
  File "/home/dieter/tmp/xmlsec3/python/lib/python3.6/site-packages/BTrees/check.py", line 338, in check
    raise AssertionError(s)
AssertionError: Errors found in OOBTree (0xb718d70c oid=b'0xd1'):
key 'job-0000000386' >= upper bound 'job-0000000371' at index 1, in OOBucket (0xb71b341c oid=b'0xd8'), path from root 6
This indicates that the empty bucket might not be the primary problem but might instead have been caused by the key "job-0000000386" being inserted into the wrong bucket.
I have used dm.historical to determine the history of this broken bucket:

from dm.historical import getHistory
from pprint import pprint as pp
ts = tree.__getstate__()
t1 = ts[0]
bb = t1[-3]
bb
BTrees.OOBTree.OOBucket([('job-0000000285', []), ('job-0000000386', [])])
pp(getHistory(bb))
[{'description': '',
  'obj': BTrees.OOBTree.OOBucket([('job-0000000285', []), ('job-0000000386', [])]),
  'size': 88,
  'tid': b'\x03\xce`\x1c\xaf/\xcb3',
  'time': DateTime('2019/03/15 22:48:41.059381 GMT+1'),
  'user_name': ''},
 {'description': '',
  'obj': BTrees.OOBTree.OOBucket([('job-0000000285', []), ('job-0000000386', <persistent broken agent.task.data.JobPersistent instance b'\x00\x00\x00\x00\x00\x00\x00\xda'>)]),
  'size': 134,
  'tid': b'\x03\xce\xe3\xf2\xfa;\xaa',
  'time': DateTime('2019/03/15 21:51:56.947845 GMT+1'),
  'user_name': ''},
 {'description': '',
  'obj': BTrees.OOBTree.OOBucket([('job-0000000281', []), ('job-0000000285', []), ('job-0000000386', <persistent broken agent.task.data.JobPersistent instance b'\x00\x00\x00\x00\x00\x00\x00\xda'>)]),
  'size': 158,
  'tid': b'\x03\xce\xb73y\xc3\xbb',
  'time': DateTime('2019/03/15 21:07:12.064604 GMT+1'),
  'user_name': ''}]
Unfortunately, this means that the bucket had already been corrupted when the latest packing happened ("getHistory" can reconstruct an object's history only back to that point).
To find out more, you could periodically run check over your trees in order to detect inconsistencies within a short delay (at least, before you pack the storage); a sketch of such a check follows below. You could then use dm.historical to determine the history of the broken bucket. This way, you should be able to determine the transaction which damaged the bucket. You should also be able to determine the previous state(s) of this bucket and the containing tree. Comparing the damaged state with its predecessor(s), we might be able to guess which tree operation(s) caused the damage.
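A minimal sketch of such a periodic consistency check (the helper name check_trees and the mapping of trees to watch are assumptions for the example):

import logging
from BTrees.check import check

def check_trees(trees):
    # Run BTrees.check.check over a mapping of name -> tree and log any
    # AssertionError, so a damaged tree can be inspected (e.g. with
    # dm.historical) before the storage is packed.
    for name, tree in trees.items():
        try:
            check(tree)
        except AssertionError:
            logging.exception("BTree consistency check failed for %r", name)

# Example call from a periodic job, with the tree loaded through the
# job's own ZODB connection (the key name 'a' is from the earlier snippet):
# check_trees({'a': conn.root()['a']})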
The "description" field in the history above is always empty. This indicates, that you are not using "transaction metadata" (--> "Transaction.note"). You could use "Transaction.note" to provide information about what the transaction has done on a more global level.
All that information might make it possible to reproduce the damage and then understand why it happens.
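For illustration, a small sketch of recording such metadata with "Transaction.note" (the note text is of course just an example):

import transaction

# The note becomes the transaction's "description" in the history/undo log.
txn = transaction.get()
txn.note("agent: add job X, remove job Y")
# ... modify the tree here ...
transaction.commit()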
Matthew Christopher wrote at 2019-3-15 17:15 -0700:
... I've seen 5-6 additional repro's of this issue since I started looking for it, I'll make some changes to improve our locking and see if that resolves the problem, as a concurrency issue was what it felt like to me as well...
The state of your corrupted tree could have been caused by two concurrent inserts where one insert splits a bucket after the other insert has decided to insert into this bucket.
Usually, the ZODB prevents concurrency problems by allowing each thread to use its own private copy of a persistent object and checking for potential concurrency problems during commit. In case of a potential concurrency problem, it raises a ConflictError and the potentially conflicting transaction is usually retried.
This approach will not work when the application passes persistent objects (more precisely, their copies destined for the local thread) across threads. In that case, multiple threads can modify the same copy concurrently and easily create inconsistencies. Are you sure that your application does not do this? If multiple threads need access to the same persistent object, each one must get its own local copy by loading it (directly or indirectly) from a local (i.e. thread-specific) ZODB connection.
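A minimal sketch of this per-thread pattern (the worker function and the inserted keys are illustrative, not taken from the original report):

import threading
import transaction
import ZODB

db = ZODB.DB('matthchr.db')   # one DB object, shared by all threads

def worker(key):
    # Each thread opens its own connection and loads its own copy of the
    # persistent objects; objects are never handed from thread to thread.
    conn = db.open()
    try:
        tree = conn.root()['a']
        tree[key] = []
        transaction.commit()   # a real application would retry on ConflictError
    except Exception:
        transaction.abort()
        raise
    finally:
        conn.close()

threads = [threading.Thread(target=worker, args=('job-%d' % i,)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()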
The ZODB allows objects to resolve concurrency problems locally (called "conflict resolution") -- and the BTrees buckets make intensive use of this possibility.
Conflict resolution is intrinsically difficult because it has only local knowledge: it sees only a single conflicting object and does not know whether there are conflicts in other objects; it does not know in which context the object is used (e.g. whether the bucket forms part of a larger tree); it knows the object's children only partially (for persistent children it knows only that they are persistent and their oid, not their value); and the resolved state should be correct for any of the conflicting transactions.
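For context, conflict resolution is implemented in ZODB through a class-level _p_resolveConflict hook that receives only the object's old, committed and newly attempted states. A toy sketch (a simple counter, deliberately much simpler than the buckets' own resolution):

import persistent

class Counter(persistent.Persistent):
    # Toy conflict resolution: concurrent increments are merged.

    def __init__(self):
        self.value = 0

    def change(self, delta):
        self.value += delta

    def _p_resolveConflict(self, old_state, saved_state, new_state):
        # The hook sees only three state dicts (as produced by __getstate__):
        # the common ancestor, the committed state and the state we tried
        # to commit.  Apply both deltas to the ancestor.
        resolved = dict(new_state)
        resolved['value'] = (saved_state['value'] + new_state['value']
                             - old_state['value'])
        return resolved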
Old BTrees versions had a conflict resolution which gave wrong results for some important applications. As far as I know, there are no such problems known for newer versions.
I have triggered "insert/split" conflicts, and the conflict resolution of the current BTrees version (the one you are currently using) has correctly detected that it cannot handle this case. Thus, apparently, your broken trees should not be the result of a conflict resolution bug.
I have a BTree which I am persisting via ZODB. In this BTree I add and remove items over time. Somehow, I've managed to get a BTree which has a key for a particular item, but raises a KeyError if I try to actually access that item. Whatever happened to this BTree, it's been persisted to my ZODB and even closing/reloading from the DB maintains the same behavior.
In what follows, repro is the BTree in question loaded from ZODB.
Accessing the item raises a KeyError on job-0000000386. I stepped through a bit in a debugger and it seems that this item is the last item in the last bucket, and interestingly, I can get the item from the bucket directly -- that access works fine and returns the object t2.
Digging a bit more, it seems that calling repro.maxKey() triggers an access violation: -1073741819 (0xC0000005).
If you need more details about the structure of my BTree which repros the problem, I can share them with you (unfortunately I am not able to construct a new one which reproduces this problem deterministically, but I have a copy of one from a ZODB).
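For reference, a sketch of the kind of checks described above, assuming repro is the loaded OOBTree (the exact expressions are illustrative and use the bucket position found in the analysis earlier in the thread):

# The key shows up when the bucket chain is walked ...
print('job-0000000386' in list(repro.keys()))      # True on the broken tree

# ... but a lookup through the damaged access structure fails.
try:
    repro['job-0000000386']
except KeyError:
    print("KeyError although the key is listed")

# Reading the bucket that actually holds the key works, because the
# buckets themselves are intact (position -3 of the first-level state,
# as seen in the dm.historical analysis above).
bucket = repro.__getstate__()[0][-3]
print(bucket['job-0000000386'])

# repro.maxKey() is deliberately not called here: on the broken tree it
# crashed the process with an access violation, consistent with the
# empty last bucket.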