Open seandavi opened 5 years ago
Hi @seandavi
You're right this is not a supported scenario, but it is an interesting one.
Two solutions:
If you need help in some way, just ask!
For the time being, I'm going the cheap route and specifying _all as the default field for the match query. Users seem happy with the basic query_string behavior, which appears to pretty much use the _all approach.
If I have a little time, I may play with the multi-match
approach. If I get into trouble, I'll let you know.
As usual, thanks for taking the time to answer and clarify.
I know it has been a while on this one. I noticed a per-field version of multi_match was recently implemented. I'd like to revisit the idea of multi_match on a set of default fields for bare words. I like your idea of converting to multi_match when default_field is a list. Could you give me some hints on where to focus if I want to implement this? No urgency, but I thought I would ask.
Just leaving a note here that to do this right would involve bare Word() and Phrase(), the latter requiring a different multi_match type.
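To make the Word/Phrase distinction concrete, here is a minimal sketch of the two Elasticsearch request bodies involved. It is hand-written query DSL built from plain Python dicts, not luqum output; the helper name bare_text_query, the quote-detection rule, and the field list are assumptions made for illustration.

```python
def bare_text_query(text, fields):
    """Build a multi_match body for a bare word or a quoted phrase.

    Hypothetical helper: text wrapped in double quotes is treated as a phrase.
    """
    is_phrase = len(text) >= 2 and text.startswith('"') and text.endswith('"')
    body = {
        "query": text[1:-1] if is_phrase else text,
        "fields": list(fields),
    }
    if is_phrase:
        # Phrases need the "phrase" type; bare words keep the default type.
        body["type"] = "phrase"
    return {"multi_match": body}


word_query = bare_text_query("London", ["title", "abstract"])
phrase_query = bare_text_query('"New York"', ["title", "abstract"])
```

The only difference between the two bodies is the extra "type": "phrase" entry, which is why Word and Phrase nodes would need separate handling.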
After a little playing with luqum.utils.LuceneTreeTransformer, this seems to do what I need. Note that multi_match is roughly translated to a bunch of OR queries across single-field match. The same is true of multi_match with phrases, except that match_phrase is used.
import luqum.tree
import luqum.utils
from luqum.tree import Group, OrOperation, SearchField


class BareTextTransformer(luqum.utils.LuceneTreeTransformer):
    """Convert bare Words or Phrases to full text search

    In cases where a query string has bare text (no field
    association), we want to construct a DSL query that includes
    all fields in an OR configuration to perform the full
    text search against all fields.

    This class can walk the tree and convert bare Word
    nodes into the required set of SearchField objects. Note
    that this is entirely equivalent to `multi_match` in terms
    of performance, etc.
    """

    def __init__(self, fields=('title', 'abstract')):
        """Create a new BareTextTransformer

        Parameters
        ----------
        fields: list of str
            This is the list of fields that will be used to
            create the composite SearchField objects that
            will be OR'ed together to simulate full text
            search.

        Returns
        -------
        None. The tree is modified in place.
        """
        super().__init__()
        # Copy into a list so the default is never shared or mutated.
        self.fields = list(fields)

    def _expand(self, node, parents):
        # Leave the node alone when it already sits directly under a
        # SearchField or a Range; otherwise OR it across all fields.
        if parents and isinstance(
                parents[-1], (luqum.tree.SearchField, luqum.tree.Range)):
            return node
        search_list = [SearchField(f, node) for f in self.fields]
        return Group(OrOperation(*search_list))

    def visit_word(self, node, parents):
        return self._expand(node, parents)

    def visit_phrase(self, node, parents):
        return self._expand(node, parents)
And, to use:
from luqum.parser import parser

# q is the user's query string
tree = parser.parse(q)
transformer = BareTextTransformer()
# tree below now has expanded Group(OrOperation(...)) for each
# field in the BareTextTransformer `fields`
tree = transformer.visit(tree)
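For comparison, the expanded tree corresponds roughly to the following Elasticsearch query shape: a bool query whose should clauses hold one single-field match (or match_phrase) per field. This is a hand-written sketch in plain Python, not the output of luqum's query builder; or_across_fields is a hypothetical name and the field list is only an example.

```python
def or_across_fields(text, fields, phrase=False):
    """OR together one single-field query per field.

    Hypothetical helper: uses match_phrase for phrases, match otherwise.
    """
    kind = "match_phrase" if phrase else "match"
    clauses = [{kind: {field: {"query": text}}} for field in fields]
    # Clauses under "should" are OR'ed: a hit in any one field is enough
    # when the bool query has no must or filter clauses.
    return {"bool": {"should": clauses}}


word_query = or_across_fields("London", ["title", "abstract"])
phrase_query = or_across_fields("New York", ["title", "abstract"], phrase=True)
```

Scoring can differ from a real multi_match, whose default type takes the best-matching field rather than summing clauses, so this is an approximation of the behavior rather than an exact equivalent.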
Using a multi_match for the * field seems to work for me.
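from luqum.elasticsearch import ElasticsearchQueryBuilder

# schema_analyzer is assumed to be an existing luqum.elasticsearch.SchemaAnalyzer
es_query_builder = ElasticsearchQueryBuilder(
    **schema_analyzer.query_builder_options(),
    field_options={"*": {"match_type": "multi_match"}},
)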
Luqum is working great for me and my test users, but one thing that the test users miss is the behavior of query_string, which does a full-text search across all fields when no field is specified (e.g., "London"). I see the ability to specify a default field, but this results in a simple match query. I guess I am looking to convert these to multi-match with all available text fields? Any suggestions?