Closed torvista closed 1 year ago
Disable Query Expansion in settings to prevent related items from being included in the results
I have that disabled, but still the description results show:
That's because, after searching the model field, the search looks for "tt", "htm" and "11" (including partial matches) in the product name and description. This is expected. For example, I can see in the image that "11" or "htm" appears in some of the titles or, if not there, in the description.
I understand that it is looking for these sub-strings....but why is that built-in/hard-coded? Is the idea to "fill up" the results to the limit when nothing matches the complete string?
Correct... the idea is always to find something relevant with the query, up to the maximum number of results to display. It might seem stretched in this case (especially since it's a model number), but in most scenarios it makes sense to search for partial strings and so on.
Ok, so what confused me was seeing the partial results disappear and be replaced by "bad" results. Maybe the default field order should be changed as I indicated....at least it is noted here now!
in most scenarios it makes sense to search for partial strings and so on.
Hmmm, I'm not sure about that, as my colleague thought it was a fault: getting a "related" result from a model code that does not exist. I have query expansion turned off, but a search for e.g. rg-rad0295 still uses the wildcard clause, which I would expect to be removed:
MATCH(pd.products_name) AGAINST('rg rad0295' IN BOOLEAN MODE) +
If I remove that from the query code, I get no results, as expected.
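As a point of comparison, in MySQL boolean mode a word with no operator in front of it is optional, while a leading + makes it required. A minimal sketch of building an "all terms required" string follows; requireAllTerms() is a hypothetical helper, not something in the plugin, and it assumes the hyphen is treated as a separator, as the 'rg rad0295' in the clause above suggests.

```php
<?php
// Hypothetical helper, not part of the plugin: turn a raw query into a
// boolean-mode string in which every term is required.
function requireAllTerms(string $query): string
{
    // Split on anything that is not a letter or digit, mirroring how
    // 'rg-rad0295' appears as 'rg rad0295' in the clause above.
    $terms = preg_split('/[^\p{L}\p{N}]+/u', $query, -1, PREG_SPLIT_NO_EMPTY);

    // Prefix each term with '+', the boolean-mode operator for "must be present".
    return implode(' ', array_map(fn(string $t): string => '+' . $t, $terms));
}

echo requireAllTerms('rg-rad0295'), "\n"; // +rg +rad0295
```

How the server handles a required term that is shorter than its minimum indexed word length (such as "rg") depends on the storage engine and configuration, so this would need testing against real data before relying on it.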
The model (aka SKU or product code) search is a bit tricky.
One solution is to remove substring search entirely like you said. This also affects the search for other fields.
Alternatively, you could recognize that the user is specifically searching for a model and only then remove the substring search. But how do you do that? It is not trivial, especially when the model doesn't exist. One possible way I can think of is to check the user query against a model "pattern". This assumes that all models for the store can be described by one or more patterns.
However, considering that in general a user searches by name/description and not by model, and since each store has its own SKU system (it's not a standardized field), I think this is an optional customization that is up to the store owner/developer, using observers or directly modifying the code.
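A rough sketch of the pattern idea, for anyone who wants to try it in an observer or a code modification. The patterns and the looksLikeModel() function are hypothetical and only cover the two model formats mentioned in this thread; each store would have to supply its own.

```php
<?php
// Hypothetical patterns for one store's SKUs; every store would supply its own.
const MODEL_PATTERNS = [
    '/^[a-z]{2}-[a-z]{3}\d{4}$/i',   // e.g. rg-rad0295
    '/^[a-z]{3}-[a-z0-9]+$/i',       // e.g. HNW-EVO
];

// Returns true when the whole query matches at least one model pattern.
function looksLikeModel(string $query, array $patterns = MODEL_PATTERNS): bool
{
    $query = trim($query);
    foreach ($patterns as $pattern) {
        if (preg_match($pattern, $query) === 1) {
            return true;
        }
    }
    return false;
}

var_dump(looksLikeModel('rg-rad0295'));   // bool(true)
var_dump(looksLikeModel('red radiator')); // bool(false)
```

When the check returns true, the name/description substring search could be skipped for that request. Note that strict patterns like these will not match a partially typed model (for example "HNW-"), so an instant-search setup would probably need looser patterns.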
Even with the above mod to remove the boolean, I'm still finding this behaviour when typing in a model number.
HNW- : returns matches with the search term highlighted on the model field, as expected.
HNW-E : no matches, as expected.
HNW-EVO : returns matches in name and description, but nothing highlighted. It is picking up EVO in the name/description. Not expected: I expect no results.
Only when I add double quotes to the search term in the MATCH clause to force an exact match does it return nothing, as expected.
$sql = $db->bindVars($sql, ':searchQuery', '"' . $queryText . '"', 'string');
This I find confusing. Once there are no results, it should continue to return no results with a longer term. A search term is a filter, and once the results are zero, I think they should continue to be zero.
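For reference, a slightly more defensive version of the quoting step might look like the sketch below. toExactPhrase() is a hypothetical helper; its only addition over the line above is stripping double quotes typed by the user, so they cannot close the phrase early.

```php
<?php
// Hypothetical helper: wrap the user's text in double quotes so that
// boolean mode treats it as one phrase rather than separate words.
function toExactPhrase(string $queryText): string
{
    // Remove embedded double quotes so the user cannot close the phrase early.
    $clean = str_replace('"', '', $queryText);
    return '"' . trim($clean) . '"';
}

echo toExactPhrase('HNW-EVO'), "\n";      // "HNW-EVO"
echo toExactPhrase('say "hi" now'), "\n"; // "say hi now"
```

The returned string would be passed to bindVars in place of the hand-built '"' . $queryText . '"'. As far as I know, a MySQL phrase search matches the words in sequence rather than the literal characters, so punctuation such as the hyphen need not match exactly.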
I've added code to generate a log to understand this better....
By default, searches use these field lists
When entering a partial model number, results from name-description are shown first, before any matching string in the model. This ordering is to be expected, due to the ordering of the fields.
What is confusing is that, as further characters of the model are typed (a broad match but not an exact one), the results become all description matches (which seem to match nothing in the search string) until the complete model is entered and the exact model match suddenly appears at the top of the list.
Changing the field order to
ensures the partial match stays in the list and gets to the top when there is a unique match/before exact match.
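To show why the field order matters once the result limit comes into play, here is a toy model. It is not the plugin's code: it assumes results are gathered one field group at a time, in the configured order, until the limit is reached. The group names, product IDs and collectResults() are all hypothetical.

```php
<?php
// Toy model, not the plugin's code: collect matches one field group at a
// time, in the configured order, until the display limit is reached.
function collectResults(array $matchesByField, array $fieldOrder, int $limit): array
{
    $results = [];
    foreach ($fieldOrder as $field) {
        foreach ($matchesByField[$field] ?? [] as $productId) {
            if (count($results) >= $limit) {
                return $results;
            }
            // Skip products already found through an earlier field group.
            if (!in_array($productId, $results, true)) {
                $results[] = $productId;
            }
        }
    }
    return $results;
}

// Hypothetical product IDs matched per field group for one query.
$matches = [
    'name-description' => [11, 12, 13],
    'model'            => [42],
];

print_r(collectResults($matches, ['name-description', 'model'], 3)); // 11, 12, 13
print_r(collectResults($matches, ['model', 'name-description'], 3)); // 42, 11, 12
```

Under that assumption, a model match only survives if its field group is consulted before the name/description matches have filled the list.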
What is confusing is why there are results from descriptions that do not match the search string:
After adding some logging for the sql and results:
I see that the boolean clause adds results by splitting the string and matching "tt", "htm" or "11" (why?), and that these combine with the natural-relevance matches to produce 14 results, when I would have expected none.
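The effect can be reproduced outside the database with a small toy model in which a product matches as soon as any one piece of the split query appears in its text. This only illustrates the "any term may match" behaviour; it does not reproduce MySQL's tokenizer or relevance scoring, and the query string and product names are hypothetical.

```php
<?php
// Toy model of an "any term may match" search; it only illustrates the
// effect and does not reproduce MySQL's tokenizer or relevance scoring.
function matchesAnyTerm(string $query, string $text): bool
{
    $terms = preg_split('/[^a-z0-9]+/i', $query, -1, PREG_SPLIT_NO_EMPTY);
    foreach ($terms as $term) {
        // Case-insensitive substring check stands in for a partial match.
        if (stripos($text, $term) !== false) {
            return true;
        }
    }
    return false;
}

// Hypothetical query and product names.
$query = 'tt-htm-11';
var_dump(matchesAnyTerm($query, 'Cover set 11 pieces')); // bool(true), via "11"
var_dump(matchesAnyTerm($query, 'Brake lever, black'));  // bool(false)
```

With this kind of matching, adding characters to the query can add terms rather than narrow the filter, which is why a longer search string can return results after a shorter one returned none.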