FHNW-IVGI / Geoharvester

NDGI Project Geoharvester

[NLP, Backend] Issue with searching similar words #103

Closed FStriewski closed 1 month ago

FStriewski commented 1 month ago

There is an issue with consecutive searches for similar words. Try e.g. "Wasser" followed by "Wasserfall": the first returns results, the second returns "No results". If you search for "Wasserfall" after a different search term, data is found.

@eliaferrari maybe an issue with stemming / keywords? Could you take a look?

eliaferrari commented 1 month ago

@FStriewski The problem was in the conversion from JSON to binary to rank the results. I've updated the cleaning pipeline and it works now, but there could be other special cases like this one. I've noted it in the json_to_pandas function for the ranking: https://github.com/FHNW-IVGI/Geoharvester/blob/bec025ed93d55e6c54d3727725faa8cdb24bea56/server/app/redis/methods.py#L264
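
For readers following along without the linked file open: the actual fix lives in the commit above. As a rough illustration only, here is a minimal sketch of defensive parsing that keeps one malformed record from emptying a whole result set before ranking. The function name `parse_result_records` and the record layout are hypothetical and not taken from the Geoharvester code base.

```python
import json


def parse_result_records(raw_records):
    """Parse raw JSON records into dicts, collecting the ones that fail.

    Hypothetical helper for illustration; not the project's json_to_pandas.
    """
    rows, skipped = [], []
    for raw in raw_records:
        text = raw
        # Assumption: the store may hand back bytes instead of str.
        if isinstance(text, bytes):
            text = text.decode("utf-8", errors="replace")
        try:
            parsed = json.loads(text)
        except (TypeError, json.JSONDecodeError):
            # Keep the bad record for later inspection instead of failing.
            skipped.append(raw)
            continue
        if isinstance(parsed, dict):
            rows.append(parsed)
        else:
            # Valid JSON, but not the object shape a ranking step would expect.
            skipped.append(raw)
    return rows, skipped


if __name__ == "__main__":
    rows, skipped = parse_result_records(
        [b'{"title": "Wasserfall"}', '{"title": "Wasser"}', '{"title": "broken']
    )
    print(len(rows), "parsed,", len(skipped), "skipped")
```

The parsed `rows` (a list of dicts) could then be passed to `pandas.DataFrame(rows)` for ranking, and logging `skipped` would help surface further special cases like the one reported here.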