project-lux / lux-marklogic

Code, issues, and resources related to LUX MarkLogic

Unable to search by Yale identifiers, such as ils:yul:mfhd:8752038, unless quoted (or in advanced search) #272

Open brent-hartwig opened 1 month ago

brent-hartwig commented 1 month ago

[!NOTE] This ticket slightly overlaps with project-lux/lux-middletier#93. Rob is interested in both but prioritizes this ticket over project-lux/lux-middletier#93.

Problem Description: A production log check surfaced the system failing to perform a keyword search on an ID:

2024-08-04 18:55:05.987 Notice: JS-JAVASCRIPT: throw new BadRequestError( -- Error running JavaScript request: Error: Unable to parse the search criteria: ils:yul:mfhd:8752038
2024-08-04 18:55:05.987 Notice:+in ../../lib/SearchCriteriaProcessor.mjs, at 9:5, in translateStringGrammarToJSON() [javascript]
2024-08-04 18:55:05.987 Notice:+in ../../lib/SearchCriteriaProcessor.mjs, at 798:24, in translateStringGrammarToJSON() [javascript]
2024-08-04 18:55:05.988 Warning: {"errorResponse":{"statusCode":400,"status":"Bad Request","messageCode":"BadRequestError","message":"Unable to parse the search criteria: ils:yul:mfhd:8752038"}}

Indeed, passing an unquoted string that contains at least two colons (without whitespace) to cts.parse results in an error: XDMP-UNEXPECTED: cts.parse('ils:yul:mfhd') -- Unexpected token syntax error, unexpected NameColon_

In this case, the user was trying to search by a known ID for https://lux.collections.yale.edu/view/object/5010d875-9095-4de7-a92e-32cb20e56453.

Expected Behavior/Solution: Workaround: quote the string in simple search or switch to advanced search.

If we do not want to require the user to quote the string, we could modify translateStringGrammarToJSON to auto-quote terms that contain two or more colons.

Requirements: From the 8/26 UAT, it was decided to auto-quote search terms that do not contain spaces.

Needed for promotion: If an item on the list is not needed, it should be crossed off but not removed.

UAT/LUX Examples:

Dependencies/Blocks:

N/A

Related Github Issues:

Related links:

N/A

Wireframe/Mockup: N/A

clarkepeterf commented 4 weeks ago

@azaroth42, based on this comment, do we want to add a workaround for this in our ML code? A string with two or more colons breaks cts.parse, but we could look for terms with two or more colons and quote them.

jffcamp commented 3 weeks ago

Do we understand the impact of the second option? "If we do not want to require the user to quote the string, we could modify translateStringGrammarToJSON to auto-quote terms that contain two or more colons."

brent-hartwig commented 3 weeks ago

@jffcamp, there would be no noticeable performance impact, as we are already parsing the user-provided string. This would be a tweak to that logic: if a term (a contiguous string of non-whitespace characters) includes two or more colons, add quotes. If you're asking about possible negative side effects, none come to mind at this time. The level of effort is low and the change would be easy to back out: it is very isolated and sits in code that is only used to convert from the string search grammar to the JSON search grammar.
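
For illustration only, the tweak described above could look roughly like the following sketch in plain JavaScript. The function name `autoQuoteMultiColonTerms` is hypothetical and does not come from SearchCriteriaProcessor.mjs; the sketch assumes terms are separated by whitespace and that anything the user already wrapped in double quotes should be left alone.

```javascript
// Hypothetical helper, not the actual LUX implementation.
// Wraps each unquoted, whitespace-delimited term that contains two or more
// colons in double quotes, so the parser would treat it as a phrase.
function autoQuoteMultiColonTerms(searchString) {
  // Match either an existing double-quoted phrase or a run of non-whitespace.
  return searchString.replace(/"[^"]*"|\S+/g, (token) => {
    if (token.startsWith('"')) {
      return token; // already quoted (or starts with a stray quote): leave as-is
    }
    const colonCount = (token.match(/:/g) || []).length;
    return colonCount >= 2 ? `"${token}"` : token;
  });
}

console.log(autoQuoteMultiColonTerms('ils:yul:mfhd:8752038'));
```

Terms with fewer than two colons pass through unchanged, so the change stays limited to the strings that currently fail.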

prowns commented 3 weeks ago

Per UAT - non-issue? Works with quotations and in adv. search

https://lux.collections.yale.edu/view/results/objects?q=%7B%22identifier%22%3A%22ils%3Ayul%3Amfhd%3A8752038%22%7D OR https://lux.collections.yale.edu/view/results/objects?q=%7B%22AND%22%3A%5B%7B%22text%22%3A%22ils%3Ayul%3Amfhd%3A8752038%22%2C%22_options%22%3A%5B%22punctuation-sensitive%22%2C%22unstemmed%22%2C%22unwildcarded%22%5D%7D%5D%7D
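
For anyone wanting to read the search criteria behind links like these: the `q` parameter is URL-encoded JSON. A small sketch using the standard `URL` API to decode the first link (the helper name `decodeSearchCriteria` is made up for this example):

```javascript
// Hypothetical helper: extracts and parses the JSON search criteria
// carried in the `q` query parameter of a LUX results link.
function decodeSearchCriteria(link) {
  // searchParams.get() returns the percent-decoded value.
  const q = new URL(link).searchParams.get('q');
  return JSON.parse(q);
}

const link =
  'https://lux.collections.yale.edu/view/results/objects?q=%7B%22identifier%22%3A%22ils%3Ayul%3Amfhd%3A8752038%22%7D';

// Decodes to the object {"identifier": "ils:yul:mfhd:8752038"}
console.log(decodeSearchCriteria(link));
```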

@brent-hartwig - Is there a search tag validation that could be causing the problem?

brent-hartwig commented 3 weeks ago

@prowns, this issue is limited to simple search. Advanced search doesn't go through cts.parse, which is the function that isn't happy with the double colons. Apologies for not stating that previously. I have since updated the title and description.


roamye commented 2 weeks ago

From 8/26 UAT:

Can we default any string which does not have spaces to be quoted? So for example, when a user searches for ils:yul:mfhd:8752038 it will be updated to "ils:yul:mfhd:8752038" so it can return the correct results. (Link Below)
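
A minimal sketch of that rule in plain JavaScript, assuming "no spaces" means no whitespace of any kind and that strings which already contain a double quote are left untouched. The function name `autoQuoteIfNoSpaces` is hypothetical and is not taken from the LUX code base.

```javascript
// Hypothetical helper illustrating the rule from the 8/26 UAT; not the actual LUX code.
// Returns the trimmed search string, wrapped in double quotes when it is a
// single run of non-whitespace characters that contains no double quote.
function autoQuoteIfNoSpaces(searchString) {
  const trimmed = searchString.trim();
  // Assumption: empty input, input with whitespace, and input that already
  // contains a double quote are all returned without added quotes.
  if (trimmed === '' || /\s/.test(trimmed) || trimmed.includes('"')) {
    return trimmed;
  }
  return `"${trimmed}"`;
}

console.log(autoQuoteIfNoSpaces('ils:yul:mfhd:8752038'));
```

Note that under this rule an ordinary single word would also be quoted; whether that changes matching for such searches (for example, stemming) would need to be checked.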

https://lux.collections.yale.edu/view/results/objects?q=%7B%22text%22%3A%22ils%3Ayul%3Amfhd%3A8752038%22%2C%22_stemmed%22%3Afalse%2C%22_lang%22%3A%22en%22%7D&sq=%22ils%3Ayul%3Amfhd%3A8752038%22

brent-hartwig commented 2 weeks ago

I'd expect that would be just fine.

roamye commented 2 weeks ago

Great - I have moved this into backlog.