caesar0301 / treelib

An efficient implementation of tree data structure in python 2/3.
http://treelib.readthedocs.io/en/latest/
Other
801 stars 185 forks source link

Shallow pastes to multiple trees breaks level/depth #88

Closed neverfox closed 4 years ago

neverfox commented 6 years ago
sent1 = Tree()
S1 = sent1.create_node('S')

sent2 = Tree()
S2 = sent2.create_node('S')

left = Tree()
NP = left.create_node('NP')

right = ParseTree()
VP = right.create_node('VP')

sent1.paste(S1.identifier, left)
sent1.paste(S1.identifier, right)
sent2.paste(S2.identifier, left)
sent2.paste(S2.identifier, right)

sent1.level(NP.identifier)
KeyError                                  Traceback (most recent call last)
~/.pyenv/versions/ling571/lib/python3.4/site-packages/treelib/tree.py in __getitem__(self, key)
    101         try:
--> 102             return self._nodes[key]
    103         except KeyError:

KeyError: '8dae1426-ff23-11e7-9421-34363bd0ba28'

During handling of the above exception, another exception occurred:

NodeIDAbsentError                         Traceback (most recent call last)
<ipython-input-416-169f4a8c57e3> in <module>()
----> 1 sent1.level(NP.identifier)

~/.pyenv/versions/ling571/lib/python3.4/site-packages/treelib/tree.py in level(self, nid, filter)
    424         exclusive nodes.
    425         """
--> 426         return len([n for n in self.rsearch(nid, filter)])-1
    427 
    428     def link_past_node(self, nid):

~/.pyenv/versions/ling571/lib/python3.4/site-packages/treelib/tree.py in <listcomp>(.0)
    424         exclusive nodes.
    425         """
--> 426         return len([n for n in self.rsearch(nid, filter)])-1
    427 
    428     def link_past_node(self, nid):

~/.pyenv/versions/ling571/lib/python3.4/site-packages/treelib/tree.py in rsearch(self, nid, filter)
    627         current = nid
    628         while current is not None:
--> 629             if filter(self[current]):
    630                 yield current
    631             # subtree() hasn't update the bpointer

~/.pyenv/versions/ling571/lib/python3.4/site-packages/treelib/tree.py in __getitem__(self, key)
    102             return self._nodes[key]
    103         except KeyError:
--> 104             raise NodeIDAbsentError("Node '%s' is not in the tree" % key)
    105 
    106     def __len__(self):

NodeIDAbsentError: Node '8dae1426-ff23-11e7-9421-34363bd0ba28' is not in the tree

During rsearch it appears to follow a back-pointer into the other copy. This makes sense, I suppose, given that the bpointer is set during pasting (and the last paste dominates). I'm not sure what the solution might be without storing multiple bpointers and then filtering them against nodes in the tree.

neverfox commented 6 years ago

Another possibility would be to give the user an option: if the shallow paste operation finds that a bpointer already exists, either error or auto-switch to deep copy (you might get away with only deep copying the root of the pasted tree). The user could indicate which option they prefer in an argument.

leonardbinet commented 4 years ago

This has been fixed with https://github.com/caesar0301/treelib/pull/129/

The issue was that nodes pointers were mixed up between trees.