At the moment, client code specifies how to normalise/stem a term in the query via transformQueryTerm. When running a query, expandTransformedQueryTerm produces the list of distinct transformations of a term (across all fields), and all of them are then looked up in the index, irrespective of which field's transformation produced them.
A consequence of this is that if any field is stemmed, the query will return documents that match stemmed terms from the query, even if the documents mention the term only in non-stemmed fields. For example, suppose our documents are users, who have a name and a biography, and we stem the biography but not the name. Now a query like "Peters" will match a user whose name is "Peter", which might be undesirable.
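To make the failure mode concrete, here is a minimal toy model in Python. It is not the library's code: the stemmer, the `FIELD_TRANSFORMS` table, and the function names `build_index`, `expand_query_term` and `query` are all hypothetical stand-ins, assuming only the behaviour described above (per-field transforms at indexing time, and a field-agnostic lookup of every distinct transformation at query time).

```python
from collections import defaultdict


def normalise(term):
    # Case-fold only; used for the non-stemmed "name" field.
    return term.lower()


def stem(term):
    # Toy stemmer (assumption): case-fold and strip one trailing "s".
    t = term.lower()
    return t[:-1] if len(t) > 1 and t.endswith("s") else t


# Hypothetical per-field transforms: biography is stemmed, name is not.
FIELD_TRANSFORMS = {"name": normalise, "biography": stem}


def build_index(docs):
    # Maps each transformed term to the set of document ids containing it.
    # The field a term came from is deliberately not recorded.
    index = defaultdict(set)
    for doc_id, fields in docs.items():
        for field, text in fields.items():
            for word in text.split():
                index[FIELD_TRANSFORMS[field](word)].add(doc_id)
    return index


def expand_query_term(term):
    # Distinct transformations of the query term across all fields.
    return {transform(term) for transform in FIELD_TRANSFORMS.values()}


def query(index, term):
    # Every expansion is looked up regardless of which field produced it.
    hits = set()
    for expanded in expand_query_term(term):
        hits |= index.get(expanded, set())
    return hits


docs = {
    1: {"name": "Peter", "biography": "enjoys hiking"},
    2: {"name": "Alice", "biography": "collects stamps"},
}
index = build_index(docs)

# "Peters" expands to {"peters", "peter"}; the stemmed form "peter"
# matches user 1 through the non-stemmed name field.
print(query(index, "Peters"))
```

In this sketch the stemmed expansion "peter" exists only because of the biography field, yet it is the name field that supplies the match. That is the cross-field leak described above.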
See also the TODO in query. I don't have a clear picture of how to resolve this, other than by simply not stemming at all in indexes where this issue might be relevant.