ualbertalib / discovery

Discovery is the University of Alberta Libraries' catalogue interface, built using Blacklight
http://search.library.ualberta.ca

Search Terms v Locations #872

Closed merisjames closed 5 years ago

merisjames commented 8 years ago

Odd things happen in searches that contain terms included in location names. Certainly it’s not returning everything with location “University of Alberta Internet” or “University of Alberta BARD” but hits are definitely coming back for “Alberta History French” where that’s the only incidence of Alberta visible or likely. Is the location field being searched? This is very problematic for searches with the word “Alberta” in them. Also, items with “Alberta” in the Notes field (denoting location) are seeming to be privileged.

seanluyk commented 8 years ago

This appears to be the case, @merisjames. I agree that this is problematic, and contributing to a larger than expected number of false hits. I'm wondering if the UofA Access link is also being indexed and impacting search results?

ghost commented 8 years ago

Item locations aren't searched (they aren't actually indexed), but it's possible that some bib-level location information is in the index. I'll have to take a look. I would assume that no location information should be searchable - people don't usually put a location in with their search terms (though they could, of course) - does the team feel that this assumption is correct? Should locations be removed, or simply weighted very low?

ghost commented 8 years ago

@merisjames Do you have an example search I can try? I thought "Alberta History French" would be one, but without quotes the results all seem relevant, and with quotes I get zero results.... Thanks.

merisjames commented 8 years ago

I would argue that most of the top results for "Alberta History French" aren't relevant... The top ten results are all about French history, with "Alberta" appearing somewhere in the notes field, and dealing with location information. Similar results show for "alberta ??? history"... Lots of stuff about the history of ???, plus a note field mention of Alberta dealing with holdings information.

Examples:

alberta oil history (leaving aside quite a few results that don't have "alberta" anywhere that I can see)
https://www.library.ualberta.ca/catalog/311959
https://www.library.ualberta.ca/catalog/88283

alberta immigrant history (again, lots with no mention of alberta: https://www.library.ualberta.ca/catalog/7114163)
https://www.library.ualberta.ca/catalog/1640692
https://www.library.ualberta.ca/catalog/814461
https://www.library.ualberta.ca/catalog/2627094
https://www.library.ualberta.ca/catalog/2933169

ghost commented 8 years ago

Oh, I see, sorry. Yes, the results for "Alberta History French" aren't relevant. So the note field should be weighted lower. Most of the examples given have "University of Alberta" in the notes field. The one that doesn't has "University of Alberta" in some other fields, though I'll have to check which of those are indexed for searching.

Certainly the notes field should be weighted lower. We don't want to remove it as the notes field needs to be searchable.

Recommendation: weight the notes field lower, then reevaluate.
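For illustration, lowering a field's weight in a Solr eDisMax query is usually done through the `qf` (query fields) parameter, where each field carries a `^boost`. The sketch below only builds the request parameters. The field names (`title_t`, `subject_t`, `notes_t`) and the boost values are assumptions made up for this example, not the actual fields or weights in the Discovery index.

```python
from urllib.parse import urlencode

# Hypothetical field names and boosts; the real index fields are not named in this thread.
FIELD_BOOSTS = {"title_t": 10.0, "subject_t": 5.0, "notes_t": 0.1}


def build_qf(boosts):
    """Render a field -> boost mapping as a Solr qf string ("field^boost field^boost ...")."""
    return " ".join(f"{field}^{boost}" for field, boost in boosts.items())


def build_params(query, boosts):
    """Assemble eDisMax request parameters with per-field weights."""
    return {"defType": "edismax", "q": query, "qf": build_qf(boosts)}


params = build_params("alberta history french", FIELD_BOOSTS)
query_string = urlencode(params)  # ready to append to a Solr select URL
```

Keeping `notes_t` in `qf` with a small boost leaves the notes searchable while reducing how much a notes-only match contributes to the score.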

ghost commented 8 years ago

Hmm. I just experimented with lowering the weighting of the note field and it made no difference. On closer inspection, a sample record from the alberta history french search showed no indexed text containing the word "alberta". Given that a phrase search "alberta history french" returns no results, I wonder if the search is silently dropping the word alberta from the search in order to return results. I think this may be an aspect of the solr configuration that I don't really understand yet. I think this will have to be investigated as part of the summer projects.
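One possible explanation, not confirmed anywhere in this thread, is Solr's `mm` (minimum should match) parameter: when it allows fewer than all query terms to match, a record can be returned even though it lacks one of the terms. The toy model below is not Solr itself. It uses two made-up records and a simple word-count check to show how a looser threshold lets a record without "alberta" into the results.

```python
def matches(doc_text, terms, min_should_match):
    """Return True if at least min_should_match of the query terms occur in the text."""
    words = set(doc_text.lower().split())
    hits = sum(1 for term in terms if term.lower() in words)
    return hits >= min_should_match


# Made-up records for illustration only; they are not catalogue data.
docs = {
    "a": "history of french literature",
    "b": "french settlement in alberta a history",
}
terms = ["alberta", "history", "french"]

# All three terms required: only the record that mentions Alberta matches.
strict = [doc_id for doc_id, text in docs.items() if matches(text, terms, 3)]

# Two of three terms required: the record without "alberta" matches as well.
loose = [doc_id for doc_id, text in docs.items() if matches(text, terms, 2)]
```

If this is what is happening, it would also be consistent with the phrase search returning nothing, since a phrase query needs every term present and adjacent.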

merisjames commented 8 years ago

If the search is dropping "Alberta", I think that's definitely an issue that requires further exploration, especially since there are results such as https://www.library.ualberta.ca/catalog/2092433 https://www.library.ualberta.ca/catalog/788141 https://www.library.ualberta.ca/catalog/6976 that I think should be privileged above results that lack a key search term.

Meris James, MLIS Public Service Librarian J.A. Weir Memorial Law Library meris@ualberta.ca


ghost commented 8 years ago

Hm. Yes. I think there's something to this I'm not quite understanding. I'll have to investigate further.

seanluyk commented 5 years ago

Moving this example query to the SOLR relevancy problems document so @pgwillia can investigate. Closing the issue, as nothing is currently broken as far as I can tell.