emory-libraries / librarysearch-enhance


Strange search results when searching non-Latin scripts #24

Open SofiaSlutskaya opened 2 years ago

SofiaSlutskaya commented 2 years ago

This was reported through the Report a problem form in the library catalog, so I am passing it along. When a search is performed in a non-Latin script (Arabic, Russian), it returns results that are not at all related to the search. I was able to confirm that if a matching record exists, it is matched correctly, but the search also displays other results that appear totally random. Here are two examples provided by a patron:

https://search.library.emory.edu/?utf8=%E2%9C%93&search_field=keyword&q=%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%B8%D0%B5+%D0%B4%D0%B5%D0%B2%D1%83%D1%88%D0%BA%D0%B8+%D0%BE%D1%87%D0%B5%D0%BD%D1%8C+%D0%BA%D1%80%D0%B0%D1%81%D0%B8%D0%B2%D1%8B%D0%B5

or https://search.library.emory.edu/?utf8=%E2%9C%93&search_field=keyword&q=%D0%A1+%D0%9D%D0%B5%D0%B2%D1%81%D0%BA%D0%BE%D0%B3%D0%BE+%D0%BD%D0%B0+%D0%9C%D0%BE%D0%BD%D0%BF%D0%B0%D1%80%D0%BD%D0%B0%D1%81++%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%B8%D0%B5

In the first example, the results are not at all connected to the search string. There is NOTHING in our catalog that matches this search string, so no results would have been the expected behavior.

In the second example, the first match is correct, but the rest are unrelated.
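For anyone triaging this, the percent-encoded `q` parameter in the URLs above can be decoded to see the actual search terms. The following is only a minimal sketch using Python's standard library; the helper name `search_terms` and the constant `FIRST_EXAMPLE` are made up for illustration and are not part of the catalog code.

```python
from urllib.parse import parse_qs, urlsplit


def search_terms(url: str) -> str:
    """Return the decoded value of the `q` parameter of a search URL.

    Hypothetical helper for reading the report, not catalog code.
    """
    query = urlsplit(url).query
    # parse_qs percent-decodes as UTF-8 and turns '+' into a space.
    params = parse_qs(query, encoding="utf-8")
    return params["q"][0]


# First example URL from the report, copied verbatim.
FIRST_EXAMPLE = (
    "https://search.library.emory.edu/?utf8=%E2%9C%93&search_field=keyword"
    "&q=%D1%80%D1%83%D1%81%D1%81%D0%BA%D0%B8%D0%B5"
    "+%D0%B4%D0%B5%D0%B2%D1%83%D1%88%D0%BA%D0%B8"
    "+%D0%BE%D1%87%D0%B5%D0%BD%D1%8C"
    "+%D0%BA%D1%80%D0%B0%D1%81%D0%B8%D0%B2%D1%8B%D0%B5"
)

print(search_terms(FIRST_EXAMPLE))
```

The same helper works for the second URL and for any other keyword search URL that carries a `q` parameter.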

There is an Arabic example that the same user also provided, and the behavior is the same.

I am not sure I understand why it is happening or what a solution could be, but it needs to be addressed, so I am passing this along.

Thank you!

Additional context: Value score = 6; LibSComm patron impact score = 2

jnvitti commented 1 year ago

See also the following Japanese search: https://search.libraries.emory.edu/?utf8=%E2%9C%93&search_field=keyword&q=%E5%9B%BD%E6%96%87%E5%AD%A6+%3A+%E8%A7%A3%E9%87%88%E3%81%A8%E9%91%91%E8%B3%9E

The desired title is currently number 3 in the list (transliterated as "Kokubungaku kaishaku to kanshō": https://search.libraries.emory.edu/catalog/990007874580302486). I don't know whether the other Japanese results are very relevant, but the English results (5 and beyond) seem to have little or nothing to do with the original search terms (which are about Japanese literature).