SSHOC / sshoc-marketplace-backend

Code for the backend
Apache License 2.0

Improve handling of special characters in the search (results) #428

Closed by laureD19 6 months ago

laureD19 commented 7 months ago

Since contributors were added to the item-search endpoint, several use cases have come up that should be investigated to provide more pertinent results for end users. In particular, the handling of special characters in search queries should be improved.

Example 1. Searching for "DARIAH-PT / ROSSIO" (which is an actor) delivers relevant results when quotation marks are added, but no results at all without them.

Example 2. Searching for "DARIAH-DE", with or without quotation marks, does not rank highly the items where DARIAH-DE is present as an actor, or even as a string in the item label. Other items that do not mention DARIAH-DE at all are ranked higher.

notify @mkrzmr @KlausIllmayer @vronk

tparkola commented 6 months ago

Example 1: the "/" character is not taken into account by the internal backend search mechanism, so I will make a small change to improve this. It should also work for other similar cases. There is a related issue behind the second example, described below.

Example 2: the search query includes a special parameter (boost) that moves results higher in the ranking based on their number of related items. This parameter has a very strong influence on the ranking, which is why we see this "weird" behaviour. I suggest limiting the boost so that it has only a minor influence.

All my suggested changes are in the pull request. Please check thoroughly whether the resulting behaviour meets your expectations, as the changes are significant. We can also organise a separate meeting to discuss this if something is not what you expect.
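To make the two ideas above concrete, here is a minimal sketch in Python. It is not the code from the pull request. The function names, the set of escaped characters (modelled on Lucene-style query syntax, which is an assumption about the search engine), the logarithmic damping formula, and the `weight` value are all hypothetical and chosen only for illustration.

```python
import math
import re

# Assumption: the search engine uses Lucene-style query syntax, where these
# characters have a special meaning and must be backslash-escaped to be
# matched literally.
_SPECIAL_CHARS = re.compile(r'([\\+\-!():^\[\]"{}~*?|&/])')


def escape_query(text: str) -> str:
    """Backslash-escape every special character in a user-supplied phrase."""
    return _SPECIAL_CHARS.sub(r'\\\1', text)


def damped_score(relevance: float, related_items: int, weight: float = 0.1) -> float:
    """Combine text relevance with a damped boost for related items.

    Hypothetical formula: the boost grows with the logarithm of the number
    of related items and is scaled by a small weight, so it can nudge the
    ranking but not override the text relevance.
    """
    return relevance * (1.0 + weight * math.log1p(related_items))


if __name__ == "__main__":
    # Example 1: the slash and the hyphen are escaped before querying.
    print(escape_query("DARIAH-PT / ROSSIO"))

    # Example 2: an item that matches the query well but has no related
    # items, versus a weak match with many related items.
    strong_match = damped_score(relevance=5.0, related_items=0)
    weak_match = damped_score(relevance=1.0, related_items=200)
    print(strong_match > weak_match)
```

With a damped boost like this, a strong textual match with no related items still ranks above a weak match with 200 related items, whereas an undamped multiplier proportional to the related-item count would reverse that order.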