jurismarches / luqum

A lucene query parser generating ElasticSearch queries and more !
Other
188 stars 40 forks source link

How are NOT and - (minus) different? #65

Closed impredicative closed 3 years ago

impredicative commented 3 years ago

How is foo NOT bar different from foo -bar? Should luqum process them identically? If not, why not?

alexgarel commented 3 years ago

In the Lucene Query language minus and NOT are the same.

That said, in luqum, in the parsing trees you have two different classes for each of them (see Not and Prohibit in tree.py). It's just in case you care and also to be able to reconstruct the expression as it was if needed. But fundamentally there is no difference.

It would have been better to make both class share a common ancestor, it was'nt done though, but it could be a good addition.

impredicative commented 3 years ago

Given a query, how can I convert all Prohibit to Not?

alexgarel commented 3 years ago

You may use a TreeTransformer, see visitor.py

Something like

from luqum.visitor import TreeTransformer
from luqum import tree

ProhibitTransformer(TreeTransformer):

    def visit_prohibit(self, node, context):
        new_node = tree.Not()
        children = self.clone_children(node, new_node, context)
        yield new_node

prohibit_to_not = ProhibitTransformer()

And use it like that:

my_not_tree = prohibit_to_not(my_prohibit_tree)
impredicative commented 3 years ago

@alexgarel In the code, you're not using children after assigning to it. Is this intended? If it is intended, it is convention for such unused variables to begin with a leading underscore. Alternatively, if it's intended, why even assign to it?

impredicative commented 3 years ago

@alexgarel Actually your code doesn't work at all.

new_node = tree.Not()

TypeError: init() missing 1 required positional argument: 'a'

Don't you think you should test your code before providing it?

alexgarel commented 3 years ago

Hi @impredicative, sorry, but I though it was explicit that I wrote Something like because I had not the time to test, but I should have been more explicit.

If you change to

    def visit_prohibit(self, node, context):
        new_node = tree.Not(None)
        new_node.children = self.clone_children(node, new_node, context)
        yield new_node

It may works (yes once again I don't have the time to test). You can take inspiration in visitor.py

Luqum is provided as is, so yes, if you wanna use it intensively you may have to be ready to read its source code, at least partially. Though every help and precise suggestion to improve the quality is welcome.

Hope this helps.

impredicative commented 3 years ago

Nope. It converts -foo to NOTfoo instead of to NOT foo. I have tried twice, I am tired of asking and of trying the very limited examples in the documentation. I give up, and will just stop using this package as soon as possible.

alexgarel commented 3 years ago

All right, no problem, the goal was to help with providing luqum, if the product helps as is.

Effectively I did not forsee the problem that with "-" you have no space after operator, while you need it .

You can correct it, by adding:

if not new_node.a.head:
    new_node.a.head = " "

(head and tail manage characters before and after expression)

You're right that we should document more about those details.