DSpace / dspace-angular

DSpace User Interface built on Angular.io
https://wiki.lyrasis.org/display/DSDOC8x/
BSD 3-Clause "New" or "Revised" License

Colon (perhaps other signs as well) must be escaped in searches #1996

Open erpnedir opened 1 year ago

erpnedir commented 1 year ago

If you search for a phrase that includes a colon, you get no results.

tdonohue commented 1 year ago

@erpnedir : If you surround the title with quotes, the search will work.

See https://demo7.dspace.org/search?query=%22my%20test%20with%20colon:%20title%201%22 (This is the same search as your example except using double quotes)

The reason that colons won't work outside of quotes is that they are a special character. You can use colons to search within specific fields (e.g. searching dc.title:test will find the word "test" in the "dc.title" field). See our docs on searching at https://wiki.lyrasis.org/display/DSDOC7x/Search+-+Advanced (which includes several example searches where a colon is a special character).
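
To make the quoting workaround concrete, here is a minimal sketch in TypeScript. The helper names `quotePhrase` and `buildSearchUrl` are hypothetical and not part of dspace-angular; the only things taken from the thread are the `/search?query=` URL shape and the observation that a phrase in double quotes is matched literally.

```typescript
// Hypothetical helpers for illustration only; not part of dspace-angular.

// Wraps a free-text query in double quotes so that characters such as ':'
// are treated as literal text rather than as field-search syntax.
function quotePhrase(query: string): string {
  const trimmed = query.trim();
  // Leave the query alone if the user already quoted the whole phrase.
  if (
    trimmed.length >= 2 &&
    trimmed.charAt(0) === '"' &&
    trimmed.charAt(trimmed.length - 1) === '"'
  ) {
    return trimmed;
  }
  // Escape backslashes and embedded double quotes before wrapping.
  const escaped = trimmed.replace(/\\/g, '\\\\').replace(/"/g, '\\"');
  return `"${escaped}"`;
}

// Builds a search URL with the quoted phrase as the `query` parameter.
function buildSearchUrl(baseUrl: string, query: string): string {
  return `${baseUrl}/search?query=${encodeURIComponent(quotePhrase(query))}`;
}

const url = buildSearchUrl('https://demo7.dspace.org', 'my test with colon: title 1');
console.log(url);
```

Note that `encodeURIComponent` encodes the colon as `%3A`, so the generated URL differs textually from the hand-written link above while carrying the same query. Quoting everything this way would also disable field searches such as `dc.title:test`, so it only suits input that is known to be plain text.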

So, I'm not sure of a solution here but we definitely should add notes to the docs that searching text that includes a colon requires using quotes. I'll add it to that wiki page. (UPDATE: Added a note there about "Special Characters")

erpnedir commented 1 year ago

Dear @tdonohue, I know, and have tested, that the search works when the phrase is enclosed in double quotes. As a tech person, I am very happy to use the colon for metadata-specific searches. But what I observe among non-technical users is a bit different. Most librarian users just copy the whole title from an online source or system and search with that. Academic research titles commonly contain colons, so this leads to confusion. I reported this bug to start a conversation and see whether other people face the same problem. If so, maybe we can think of a better solution. Best regards,

tdonohue commented 1 year ago

@erpnedir : I can understand the confusion, but I just wanted to point out why the colon is a special character. I don't have a great solution but have left this ticket open in case others do. I feel we don't want to lose the ability to search within specific metadata fields (e.g. dc.title:test), but I also understand that normal users may be discouraged if accidentally including a colon returns no results.

At the very least, it's possible we could add more helpful information to the search results. E.g. a "Did you mean..." response which recommends adding quotes, or maybe just a response that says "No results returned. If your search includes special characters like colon or asterisk, try surrounding it with quotes". Other ideas are welcome here from anyone.
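
As a rough illustration of the "more helpful information" idea, the TypeScript sketch below picks a message for an empty result set. Everything in it is an assumption: `hasUnquotedSpecialChar` and `noResultsMessage` are hypothetical names, not dspace-angular code, and the set of special characters is limited to the two mentioned in this comment (colon and asterisk).

```typescript
// Hypothetical helpers for illustration only; not part of dspace-angular.

// Assumption: only the characters named in the thread are checked.
const SPECIAL_CHARS = [':', '*'];

// Returns true if a special character appears outside double quotes.
function hasUnquotedSpecialChar(query: string): boolean {
  let insideQuotes = false;
  for (let i = 0; i < query.length; i++) {
    const ch = query.charAt(i);
    if (ch === '"') {
      // Toggle on every double quote; escaped quotes are not handled here.
      insideQuotes = !insideQuotes;
    } else if (!insideQuotes && SPECIAL_CHARS.indexOf(ch) !== -1) {
      return true;
    }
  }
  return false;
}

// Chooses the message to show; returns null when there are results.
function noResultsMessage(query: string, totalResults: number): string | null {
  if (totalResults > 0) {
    return null;
  }
  if (hasUnquotedSpecialChar(query)) {
    return 'No results returned. If your search includes special characters ' +
      'like colon or asterisk, try surrounding it with quotes.';
  }
  return 'No results returned.';
}

console.log(noResultsMessage('my test with colon: title 1', 0));
```

A hint like this keeps field searches such as `dc.title:test` working, because the query itself is never rewritten; only the empty-results message changes.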