jurismarches / luqum

A lucene query parser generating ElasticSearch queries and more !
Other
188 stars 40 forks source link

Avoid a endless loop on LuceneTreeTransformer #38

Closed AlfonsoMolina closed 4 years ago

AlfonsoMolina commented 5 years ago

In the case that the returned node is the initial one plus something (like going from "foo" to "foo OR oof") the class do not stop visiting the new nodes, and since the first one is not removed it keeps triggering the transformation.

A solution could be to analyze only the initial query and not the new ones that have been added.

alexgarel commented 5 years ago

hello @AlfonsoMolina,

Our current implementation is not good indeed. It tries to transform in place and its not good practice, it should construct a new tree instead… I have some ideas on how to do this, but no final though. I think the boltons remap could be a source of inspiration.

alexgarel commented 5 years ago

@AlfonsoMolina I forgot to mention that you can circumvent the bug by putting an attribute on the added node, and verifying the presence of this attribute in your method.

   def visite_xxx(node, parents):
       if not hasattr(node, "_new_node"):
          new_node = …
          setattr(new_node, "_new_node")
          return new_node
    return node

Something like that (maybe the attribute should be put on the children.

This is not a good fix, but it can help mitigate the problem.

AlfonsoMolina commented 5 years ago

Hi! For now, I made a workaround using the first search for just selecting the nodes that I am interested in and then looping through them and replacing them. It is also not pretty, but for my implementation it seems to work. If something fails again I will try your idea, thanks!

alexgarel commented 5 years ago

In the meantime we can keep this ticket open because it would be useful to provide a better alternative than the actual implementation.

alexgarel commented 4 years ago

Hi @AlfonsoMolina, the master branch now contains a new "visitor.py" file with a brand pattern for the TreeTransformer, which propose to clone the tree while you transform it. The clone_item and children setter, implemented on tree items, are helpful for this to be easy.

You can see some example of its usage in the code (eg in utils.py and auto_head_tail.py)