Open ecstaticpeon opened 9 years ago
Ignore the ordering issue; that is actually related to our index. The question remains, though: shouldn't using auto_query() end up using both stemmed and un-stemmed terms as well?
Thanks for using Xapian-Haystack and for reporting this here.
In principle I agree with the consistency you mentioned. However, I'm not sure this is what we want, since auto_query() receives a query, not a term. E.g. what would be the stemmed version of "Hello OR bye OR che*rs"?
See here what keywords it accepts.
Thanks for making Xapian-Haystack :)
As far as I understand, the query will be split into terms? And therefore stemming will be applied to each of the terms when applicable?
I'm not sure the query is split in terms by Xapian-Haystack. In this line, the "term" is prepared by haystack and sent to the backend to be interpreted (self.backend.parse_query(query)). We just add the field_name:%s to the term in case it is made on a specific field.
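To make the point above concrete, here is a minimal sketch of the idea, not the actual Xapian-Haystack code. The helper name prepare_auto_query and its parameters are hypothetical; the only thing taken from the comment above is that the query string is passed through whole and that a field_name:%s prefix is added when a specific field is targeted.

```python
def prepare_auto_query(query_string, field_name=None):
    """Hypothetical helper: return the string that would be handed to the
    backend's query parser. The query is NOT split into terms here."""
    if field_name:
        # Prefix the whole query string with the field, as in 'field_name:%s'.
        return "%s:%s" % (field_name, query_string)
    return query_string


# The full query string, operators and wildcards included, goes through as-is.
print(prepare_auto_query("Hello OR bye OR che*rs"))
print(prepare_auto_query("voyage", field_name="title"))
```

In this sketch any splitting or stemming would have to happen later, inside whatever interprets the string.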
Can you point out where in the code it is split by terms?
I have an index with items containing the word "voyage", and others containing "voyager". When doing a search for "voyage" using auto_query(), the backend returns the items containing "voyager" first, although one would expect the items with "voyage" to be first. However, when using filter(), the ordering appears correct (i.e. first "voyage", then "voyager").

After doing some investigation, it looks like the query returned by XapianSearchQuery.build_query() is different depending on whether auto_query() or filter() is used:

Looking at XapianSearchQuery._filter_contains(), which is called when using filter(), the docstring specifies that the search will be done on both the stemmed and un-stemmed terms: "Splits the sentence in terms and join them with OR, using stemmed and un-stemmed."

Shouldn't using auto_query() end up using both stemmed and un-stemmed terms as well?

Versions used:
Xapian: 1.3.2
xapian-haystack: 3e8611265ec63522d4e3d81b45de3866f48853ee (from 12 January 2015).
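
For illustration, the behaviour described in that docstring could be sketched roughly as follows. This is a pure-Python approximation, not the code of _filter_contains(): toy_stem is a made-up stand-in for a real Snowball stemmer, and contains_terms and as_or_query are hypothetical names. The "Z" prefix on stemmed terms follows Xapian's usual convention.

```python
def toy_stem(word):
    """Toy stemmer for illustration only (assumption: real stemming is done
    by Xapian's Snowball stemmers, which behave differently in general)."""
    for suffix in ("er", "e", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def contains_terms(sentence):
    """Split the sentence into words and return, for each word, the
    un-stemmed term followed by its stemmed form with a 'Z' prefix."""
    terms = []
    for word in sentence.lower().split():
        terms.append(word)
        terms.append("Z" + toy_stem(word))
    return terms


def as_or_query(terms):
    """Join the terms with OR, as the docstring describes."""
    return " OR ".join(terms)


print(as_or_query(contains_terms("voyage")))
```

Under these assumptions, a document containing exactly "voyage" matches both the un-stemmed and the stemmed term, while a document containing only "voyager" matches the stemmed term alone.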