Open brizee opened 3 years ago
Right now we have an out-of-the-box Solr configuration, so nothing is weighted at all. Definitely something we can look into, I guess for title and creator mainly?
Would it be an option, or just hard-wired practice, to list first the results where Orwell is the author, then those where Orwell occurs in the title of a book, then those where Orwell is mentioned in a book? Where one has George + Orwell, surely results matching both should be listed first.
Seems to chime with the feedback yeah, probably something we keep under review with analytics. I wonder if it's possible to detect how often search terms appeared in each field? Maybe count the highlights, though that might be too broad a measure. A manual review of query types would probably suffice?
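On counting the highlights, here is a rough sketch of what that could look like. It assumes highlighting is enabled (`hl=true`) and the response is JSON with Solr's usual `highlighting` section (document id, then field, then a list of snippets). The sample response, document ids, and the `count_highlighted_fields` helper are made up for illustration.

```python
from collections import Counter


def count_highlighted_fields(response):
    """Count, per field, how many documents have at least one highlight snippet."""
    counts = Counter()
    for fields in response.get("highlighting", {}).values():
        for field, snippets in fields.items():
            if snippets:  # skip fields returned with an empty snippet list
                counts[field] += 1
    return counts


# Hypothetical response fragment for the query "orwell"
sample_response = {
    "highlighting": {
        "doc-1": {"creator": ["<em>Orwell</em>, George"], "title": []},
        "doc-2": {"title": ["George <em>Orwell</em>: a life"]},
        "doc-3": {
            "creator": ["<em>Orwell</em>, George"],
            "description": ["Essays by <em>Orwell</em>"],
        },
    }
}

field_counts = count_highlighted_fields(sample_response)
for field, n in field_counts.most_common():
    print(field, n)
```

As noted, this is a broad measure: it only says which fields matched, not which match the user actually cared about, so it would complement a manual review rather than replace it.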
Having finally gotten around to reading the Help page (🙄), I actually think a key issue here is that we're defaulting to OR searching (i.e., returning results that match any of the words in the search query) where users expect an AND search.
This is how Google operates (https://support.google.com/websearch/answer/2466433?hl=en-PT&ref_topic=3081620), and I'd recommend that we follow that pattern: instead of having an explicit AND query, we should have an explicit OR query. This would better match user expectations and established patterns, and would return more relevant result sets by default.
I'm pretty convinced that isn't what Google does, actually. I believe that, like us, Google will return results that contain either term but prefers results that contain both.
You'll quite often see Google results with a line informing you of this for more complex searches and the option to make it required:
Making it required just adjusts the search to include the term in quotes.
For simple searches this doesn't matter, of course: their index is large enough that they'll easily fill many pages with results containing both Fruit and Juice before moving on to results containing only one of them.
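For reference, the two behaviours being discussed map fairly directly onto Solr's `mm` ("minimum should match") parameter for the DisMax/eDisMax parsers. The sketch below only builds the request parameters; `build_search_params` is a hypothetical helper, and whether our setup would use `edismax` at all is an assumption.

```python
from urllib.parse import urlencode


def build_search_params(user_query, require_all_terms):
    """Build Solr request parameters for AND-like or OR-like matching."""
    return {
        "defType": "edismax",  # assumption: the eDisMax parser is used
        "q": user_query,
        # mm=100% requires every term to match (AND-like);
        # mm=0% requires only some term to match (OR-like), and documents
        # matching more terms still score higher.
        "mm": "100%" if require_all_terms else "0%",
    }


and_like = build_search_params("fruit juice", require_all_terms=True)
or_like = build_search_params("fruit juice", require_all_terms=False)
print(urlencode(and_like))
print(urlencode(or_like))
```

Values in between (for example `mm=75%`) would give a middle ground, where most but not all terms must match.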
Weighted search is reasonably easy to add in Search.php:
// Use the DisMax query parser and weight fields (field^boost; unboosted fields default to 1)
$dismax = $query->getDisMax();
$dismax->setQueryFields('title^5 creator^5 year^5 publisher^5 placeOfPublication^2 description topic');
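To make the weighting easier to reason about outside of `Search.php`, here is the same set of boosts expressed as raw Solr request parameters. The `qf` weights are copied from the snippet above; the `pf` (phrase fields) line is an assumption, added to illustrate how "George Orwell" appearing as a phrase could be ranked above records that only contain the two words separately. The `format_boosts` helper and the `pf` values are hypothetical.

```python
from urllib.parse import urlencode

# Weights from the setQueryFields() call above; 1 means "no explicit boost".
FIELD_WEIGHTS = {
    "title": 5,
    "creator": 5,
    "year": 5,
    "publisher": 5,
    "placeOfPublication": 2,
    "description": 1,
    "topic": 1,
}


def format_boosts(weights):
    """Render {'title': 5, 'topic': 1} as the string 'title^5 topic'."""
    return " ".join(
        name if weight == 1 else f"{name}^{weight}"
        for name, weight in weights.items()
    )


params = {
    "defType": "dismax",
    "q": "george orwell",
    "qf": format_boosts(FIELD_WEIGHTS),
    # Assumption: extra boost when the whole query appears as a phrase
    # in creator or title. The value 10 is illustrative only.
    "pf": format_boosts({"creator": 10, "title": 10}),
}
print(urlencode(params))
```

To get the ordering suggested earlier (author first, then title, then other mentions), `creator` would need a higher weight than `title` rather than the equal weights shown here.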
Part of this feedback is that certain fields should be weighted more heavily in results - it looks like this IS possible in Solr - are we already doing this and/or should we consider it? :smile:
https://stackoverflow.com/questions/16404228/solr-high-priority-in-fields