jhu-idc / iDC-general

Contains non-code-base specific tickets relating to the Islandora8 for Digital Collection project

Single and Multiple wild card search do not work as expected #494

Open htpvu opened 2 years ago

htpvu commented 2 years ago

Related to UAT-3 (https://docs.google.com/document/d/1-ZGnCSrrXFJZTvBqm4ytGfMO-2LMaf0T02BgnJnmgVg/edit#), test cases 3.9 and 3.10.

When a wildcard is used in the middle of the search string, the results returned are not as expected (0-1 results vs. a few hundred expected). We suspect this behavior is not caused by the UI implementation or Solr, but by something in between.

jabrah commented 2 years ago

Wildcard characters seem to work as expected only when they are at the end of a word, not when they are placed in the middle of a word, as described above.

https://solr.apache.org/guide/8_11/the-standard-query-parser.html#wildcard-searches
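One way to narrow down which layer is responsible might be to send the same wildcard terms straight to Solr and compare hit counts with what the UI reports. The following is only a sketch: the base URL, the core name `example_core`, and the field name `text` are placeholders, not values taken from this project. It only builds the request URLs; `rows=0` asks Solr for the hit count without returning documents.

```python
from urllib.parse import urlencode

# Hypothetical Solr location and core name; replace with the real ones.
SOLR_SELECT_URL = "http://localhost:8983/solr/example_core/select"


def build_select_url(term, field="text"):
    """Return a select URL that asks Solr only for the hit count of `term`.

    `field` is a placeholder; use whichever field the site search queries.
    """
    params = {"q": f"{field}:{term}", "rows": 0, "wt": "json"}
    return f"{SOLR_SELECT_URL}?{urlencode(params)}"


# Trailing wildcard, single-character wildcard, and mid-word wildcard.
for term in ("test*", "te?t", "t*t"):
    print(term, "->", build_select_url(term))
```

If the mid-word cases return the expected counts when sent directly to Solr, the problem is more likely in how the query is rewritten before it reaches Solr.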

A few search layers:

jhu-alistair commented 2 years ago

Will discuss with stakeholders to understand the goal here. Having an indeterminate wildcard, as opposed to a truncation or single-character wildcard, in boolean search seems very strange.

jhu-alistair commented 2 years ago

I think the authoritative statement of search behavior should be the search tips next to the search tools at https://digital.library.jhu.edu

Based on that description, "?" should replace only a single character, not any number of characters. Here is the whole set of tips.

Use double quotes (") to search as a phrase.

Use the ? wildcard character to search for words with one alternate character. For example, te?t should match test and text.

Use the * wildcard character to search for words with multiple alternate characters. Searching for test* should match test, tester, testing, etc.

Use the proximity search syntax if you want to search for two terms within a certain number of words of each other, or you can add a proximity search term on the Advanced Search page. The term "farm goat"~10, including the quotes, should match an item that has the words "farm" and "goat" within 10 words of each other.

Use the AND, OR, NOT boolean operators in your searches, or combine search terms on the Advanced Search page. For example, in the global search, you can search for farm AND goat to look for items with both terms. If you manually enter these operators, make sure they are capitalized, as you see in this example.
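To make the expected wildcard behavior in these tips concrete enough to test against, here is a small reference matcher. It is a sketch of the semantics as described in the tips ("?" is exactly one character, "*" is any number of characters, in any position in the word), not the site's or Solr's actual implementation. The function names are made up for illustration, and case-insensitive matching is an assumption.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a search-tip style wildcard pattern into a compiled regex.

    '?' matches exactly one character, '*' matches zero or more characters;
    every other character is matched literally.
    """
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")
        elif ch == "*":
            parts.append(".*")
        else:
            parts.append(re.escape(ch))
    # Case-insensitive matching is an assumption, not stated in the tips.
    return re.compile("".join(parts), re.IGNORECASE)


def matches(pattern, word):
    """Return True if the whole word matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(word) is not None


# Examples taken from the tips, plus a mid-word '*' case.
print(matches("te?t", "test"), matches("te?t", "text"))
print(matches("test*", "tester"), matches("test*", "testing"))
print(matches("t*t", "tempest"))
```

A list of words that should match under this reference could then be compared with what the site search actually returns for test cases 3.9 and 3.10.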