mgedmin / zodbbrowser

ZODB browser
http://launchpad.net/zodbbrowser
Other
19 stars 6 forks source link

BTree assertion failure in OOBTreeState.__init__ #33

Closed mgedmin closed 4 years ago

mgedmin commented 4 years ago

I have a Data.fs where trying to view http://localhost:8070/@@zodbbrowser?oid=0xc46&tid=0x3c71c61ba982f77&nohist results in

python2.7: BTrees/BTreeTemplate.c:1752: BTree_rangeSearch: Assertion `highbucket != NULL' failed. Nutraukta (aborted) (core dumped)

faulthandler shows this is happening in https://github.com/mgedmin/zodbbrowser/blob/45161aa3beba73183d295a1594172b269b8b4a2a/src/zodbbrowser/btreesupport.py#L114

The assertion itself is https://github.com/zopefoundation/BTrees/blob/4ae5aa00f152bd6bda76b8a68e84f1ae049c9001/BTrees/BTreeTemplate.c#L1752 (BTrees 4.6.0)

Looking at the latest revision of the tree does not fail, so something about my method of trying to rewind history is bogus.

mgedmin commented 4 years ago

Now I'm just speculating baselessly because I don't remember the data structures, but I think what I'm doing is restoring the state of all the BTree leaves, while keeping inner nodes untouched.

mgedmin commented 4 years ago

So, an OOBTree is a B+-Tree with internal nodes being other OOBTrees and all leaf nodes being OOBBuckets. The buckets are linked into a chain, in sorted key order. All keys and values live in buckets. Internal nodes always contain children of the same type (either other internal nodes or leaf buckets). Keys in internal nodes are used to guide binary searches and may or may not be actually present in the tree (e.g. if you remove an item, the key might stay in internal nodes, if no buckets were rebalanced).

So it should be enough to restore the linked list of the leaf buckets to make iteration possible. I'm not sure what the assertion is about.

mgedmin commented 4 years ago

Ok, it makes sense now.

The way iteration works is, a BTree first defines the iteration range as a pair (lowbucket, lowindex), (highbucket, highindex), and then follows the chain of buckets, stopping when it reaches highbucket.

You can iterate by specifying min/max bounds (which needs the binary tree search to find lowbucket/highbucket) or the entire BTree (the case we're interested in).

Finding lowbucket is easy: BTree has a reference to firstbucket.

There are two ways of finding highbucket: you can follow the nextbucket chain from firstbucket until you hit NULL (O(N)), or you can walk internal tree nodes by picking the rightmost child in each (O(log N)), which is what the code does.

If zodbrowser doesn't restore internal tree nodes, it can't call BTree iteration methods!

mgedmin commented 4 years ago

Here's a BTree I drew:

btree-diagram

You can see OOBTree objects (rectangles), OOBucket objects (circles), child pointers (down), nextbucket pointers (right from each bucket), firstbucket pointers (left from each internal node).