FHNW-IVGI / Geoharvester

NDGI Project Geoharvester

[Backend] Search checks for exact matches, should be less strict to find similar results #28

Closed p1d1d1 closed 1 year ago

p1d1d1 commented 1 year ago

We need to improve the search. Currently, for example, searching for "swissboundaries" returns only 2 entries, and datasets containing "swissboundaries3d" are not returned. Another example: searching for "grenzen" does not return the datasets from Bund (like Landesgrenzen, Gemeindegrenzen, etc.).

We need to provide results that are at least as good as those of https://davidoesch.github.io/geoservice_harvester_poc/. I guess we can achieve this if the search also goes through the abstract and can additionally recognize parts of compound words.

FStriewski commented 1 year ago

The issue described might be related to stemming. @eliaferrari is that something you could check? Maybe we have to scale back stemming :/

eliaferrari commented 1 year ago

We could use stemming on the query, but I think the issue is more complex than just that. The extended search will require additional database columns (I've already implemented some functions in #11) as well as a ranking method/function, which can be derived from different search methods.

FStriewski commented 1 year ago

We found out that the issue is not due to stemming but to Redis checking for an exact match. Instead, we would need a LIKE or CONTAINS operation, if that is possible with Redis. Needs investigation.
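To illustrate the difference between an exact match and a CONTAINS-style match, here is a minimal sketch in plain Python. It does not use Redis at all: the records, the field names `title` and `abstract`, and both function names are hypothetical and only serve to show why the exact comparison misses "swissboundaries3d" and "Landesgrenzen".

```python
# Hypothetical in-memory records; in the project the data is stored in Redis.
DATASETS = [
    {"title": "swissboundaries3d", "abstract": "Landesgrenzen der Schweiz"},
    {"title": "swissboundaries", "abstract": "Administrative units"},
    {"title": "Gewässernetz", "abstract": "Flüsse und Seen"},
]


def exact_match(query, records):
    """Return records whose title equals the query (case-insensitive)."""
    q = query.casefold()
    return [r for r in records if r["title"].casefold() == q]


def contains_match(query, records, fields=("title", "abstract")):
    """Return records where the query occurs as a substring of any given field."""
    q = query.casefold()
    return [
        r for r in records
        if any(q in r.get(field, "").casefold() for field in fields)
    ]
```

With these sample records, `exact_match("swissboundaries", DATASETS)` finds only the second entry, while `contains_match` also returns "swissboundaries3d". A query for "grenzen" has no exact hit but is found inside "Landesgrenzen" in the abstract of the first entry.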

eliaferrari commented 1 year ago

The search function in Redis now includes as many results as possible. These are subsequently post-processed with a ranking function, sorted, and finally returned to the API with additional ranking information.
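The post-processing step could look roughly like the sketch below. This is not the ranking function used in the project; the scoring weights, the `rank` key, and the field names `title` and `abstract` are assumptions chosen only to illustrate the idea of collecting broadly, scoring, sorting, and returning ranking information alongside each result.

```python
def score(query, record):
    """Assumed scoring: exact title match > title substring > abstract substring."""
    q = query.casefold()
    title = record.get("title", "").casefold()
    abstract = record.get("abstract", "").casefold()
    if title == q:
        return 3
    if q in title:
        return 2
    if q in abstract:
        return 1
    return 0


def rank_results(query, records):
    """Attach a score to each record, drop non-matches, sort best first."""
    scored = [{**record, "rank": score(query, record)} for record in records]
    hits = [record for record in scored if record["rank"] > 0]
    # sorted() is stable, so records with equal scores keep their input order.
    return sorted(hits, key=lambda record: record["rank"], reverse=True)


# Hypothetical candidate set, as it might come back from a broad search.
CANDIDATES = [
    {"title": "Gewässernetz", "abstract": "Enthält Grenzen von Einzugsgebieten"},
    {"title": "Landesgrenzen", "abstract": "Grenzen der Schweiz"},
    {"title": "Grenzen", "abstract": ""},
    {"title": "Wanderwege", "abstract": "Wege"},
]
```

Calling `rank_results("grenzen", CANDIDATES)` drops "Wanderwege" and orders the remaining entries by how closely they match, with the score included in each returned record.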