Closed merisjames closed 5 years ago
This appears to be the case, @merisjames. I agree that this is problematic, and contributing to a larger than expected number of false hits. I'm wondering if the UofA Access link is also being indexed and impacting search results?
Item locations aren't searched (they aren't actually indexed), but it's possible that some bib-level location information is in the index. I'll have to take a look. I would assume that no location information should be searchable - people don't usually put a location in with their search terms (though they could, of course) - does the team feel that this assumption is correct? Should locations be removed, or simply weighted very low?
@merisjames Do you have an example search I can try? I thought "Alberta History French" would be one, but without quotes the results all seem relevant, and with quotes I get zero results.... Thanks.
I would argue that most of the top results for "Alberta History French" aren't relevant... The top ten results are all about French history, with "Alberta" appearing somewhere in the notes field, and dealing with location information. Similar results show for "alberta ??? history"... Lots of stuff about the history of ???, plus a note field mention of Alberta dealing with holdings information.

Examples:

alberta oil history (leaving aside quite a few results that don't have "alberta" anywhere that I can see)
https://www.library.ualberta.ca/catalog/311959
https://www.library.ualberta.ca/catalog/88283

alberta immigrant history (again, lots with no mention of alberta: https://www.library.ualberta.ca/catalog/7114163)
https://www.library.ualberta.ca/catalog/1640692
https://www.library.ualberta.ca/catalog/814461
https://www.library.ualberta.ca/catalog/2627094
https://www.library.ualberta.ca/catalog/2933169
Oh, I see, sorry. Yes, the top results for Alberta History French aren't relevant. So the note field should be weighted lower. Most of the examples given have "University of Alberta" in the notes field. The one that doesn't has "University of Alberta" in some other fields, though I'll have to check which of those are indexed for searching.
Certainly the notes field should be weighted lower. We don't want to remove it as the notes field needs to be searchable.
Recommendation: weight the notes field lower, then reevaluate.
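For reference, lowering the notes weight might look something like the sketch below. It assumes the search runs through Solr's eDisMax parser, where `qf` lists the searched fields with optional `^boost` weights. The field names (`title_t`, `subject_t`, `notes_t`) and the boost values are hypothetical, not taken from our actual configuration.

```python
from urllib.parse import urlencode


def build_search_params(query, field_boosts):
    """Build eDisMax request parameters with per-field boosts."""
    # qf is a space-separated list of "field^boost" entries.
    qf = " ".join(f"{field}^{boost}" for field, boost in field_boosts.items())
    return {"defType": "edismax", "q": query, "qf": qf}


# Hypothetical field names and weights: notes stay searchable,
# but a match there counts for far less than a title match.
FIELD_BOOSTS = {"title_t": 10, "subject_t": 5, "notes_t": 0.5}

params = build_search_params("alberta history french", FIELD_BOOSTS)
query_string = urlencode(params)
print(query_string)
```

Keeping `notes_t` in `qf` with a small boost, rather than dropping it, keeps the notes field searchable while reducing its influence on ranking.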
Hmm. I just experimented with lowering the weighting of the note field and it made no difference. On closer inspection, a sample record from the alberta history french search showed no indexed text containing the word "alberta". Given that a phrase search "alberta history french" returns no results, I wonder if the search is silently dropping the word alberta from the search in order to return results. I think this may be an aspect of the solr configuration that I don't really understand yet. I think this will have to be investigated as part of the summer projects.
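One possible explanation, offered as an assumption rather than a diagnosis: Solr's DisMax and eDisMax parsers have an `mm` (minimum should match) parameter that controls how many of the query terms a record must contain. If `mm` is below 100%, a record can match on "history" and "french" alone, which would look exactly like "alberta" being dropped. The toy simulation below is not Solr code; the sample records and helper functions are made up purely to illustrate the behaviour.

```python
import re


def required_terms(num_terms, mm_percent):
    """Number of terms that must match; the percentage is rounded down."""
    # Simplification: a query of only optional terms still needs one match.
    return max(1, num_terms * mm_percent // 100)


def matches(doc_text, query, mm_percent):
    """Return True if enough query terms appear in the document text."""
    terms = query.lower().split()
    words = set(re.findall(r"\w+", doc_text.lower()))
    hits = sum(1 for term in terms if term in words)
    return hits >= required_terms(len(terms), mm_percent)


# Hypothetical records, not taken from the catalogue.
doc_without_alberta = "A history of French society and politics"
doc_with_alberta = "French settlement in Alberta: a history"

query = "alberta history french"
for mm in (75, 100):
    print(mm, matches(doc_without_alberta, query, mm),
          matches(doc_with_alberta, query, mm))
```

With three terms and `mm` at 75%, only two terms have to match, so the record without "alberta" is returned; at 100% it is excluded. This would also be consistent with the quoted phrase search returning nothing, since a phrase query needs all the words together.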
If the search is dropping "Alberta", I think that's definitely an issue that requires further exploration, especially since there are results such as https://www.library.ualberta.ca/catalog/2092433 https://www.library.ualberta.ca/catalog/788141 https://www.library.ualberta.ca/catalog/6976 that I think should be privileged above results that lack a key search term.
Meris James, MLIS Public Service Librarian J.A. Weir Memorial Law Library meris@ualberta.ca
Hm. Yes. I think there's something to this I'm not quite understanding. I'll have to investigate further.
Moving this example query to the SOLR relevancy problems document so @pgwillia can investigate. Closing this issue since, as far as I can tell, nothing is actually broken.
Odd things happen in searches that contain terms included in location names. Certainly it’s not returning everything with location “University of Alberta Internet” or “University of Alberta BARD”, but hits are definitely coming back for “Alberta History French” where that’s the only incidence of Alberta visible or likely. Is the location field being searched? This is very problematic for searches with the word “Alberta” in them. Also, items with “Alberta” in the Notes field (denoting location) seem to be privileged.