kohler / masstree-beta

Beta release of Masstree.
Other
430 stars 115 forks source link

left-most leaf nodes not be freed after delete all keys #43

Open jonahgao opened 4 years ago

jonahgao commented 4 years ago

Steps to reproduce the behavior

  1. insert some keys
  2. delete all keys
  3. many nodes(leftmost leaf) remain in the tree, lead to memory consumption

Example Tree structure(After delete all keys):
green border: leaf node
coral color: layer root
tree

I wonder if we can do some checks and issue a gc_layer during remove_leaf. I try to fix it like this: https://github.com/jimdb-org/masstree-beta/commit/eabe31105a283a956ca84e309b9221add8f71575 , and the problem goes away.

idanlevy1234 commented 4 years ago

So the RC is deleting the second most left leaf while it has no next leaf and the most left leaf is already empty. If this is the only case which causing this issue, sounds like a good idea.

However, only the leaf's lock is being held at this point, which expose the code for at least 1 corner case where the gc_layer_rcu_callback is not called:

Consider the following case (A is the most left leaf): null<-A<->B<->C->null

A is empty, B and C have single key each. If B and C delete the last key in parallel, they both might reach your new code together (after unlink), and both will decide not to run gc_layer_rcu_callback.

Another case that needs deeper investigation: Consider the following case (A is the most left leaf): null<-A<->B->null

Same case as before (both delete the last key in parallel). A does not call gc_layer_rcu_callback as B still exists, while B reaches your new code. As no memory fence exists after removing the last key from A's permutation, what guarantee do we have that B will read the updated permutation of A ?

jonahgao commented 4 years ago

@idanlevy1234 :+1:
Yes, it only helps in most cases. Still need to work on a perfect fix.