fionatiefenbacher closed 1 year ago
For the Redis evaluation (https://github.com/FHNW-IVGI/Geoharvester/issues/9) I'll follow this guide: https://redis.com/blog/redismart-real-time-json-product-catalog-service/
Note how they build the fuzzy search / suggestion feature around predefined categories. While we probably also want to index on the TF-IDF score (among other obvious fields such as Kantons), we might want to think along this route as well.
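To make the category idea concrete, here is a rough sketch of suggestions scoped to a predefined category. It does not use Redis at all: it relies on Python's standard-library `difflib` as a stand-in for a real fuzzy index, and the categories, titles, and the `suggest` helper are all made up for illustration.

```python
from difflib import get_close_matches

# Hypothetical categories and dataset titles, for illustration only.
CATALOG = {
    "hydrology": ["River network", "Lake depths", "Flood hazard zones"],
    "transport": ["Road network", "Railway stations", "Cycling routes"],
}


def suggest(query, category, limit=3, cutoff=0.5):
    """Return titles in `category` that start with or closely resemble `query`."""
    titles = CATALOG.get(category, [])
    lowered = {title.lower(): title for title in titles}
    q = query.lower()
    # Prefix matches come first, since they behave like autocomplete.
    prefix_hits = [title for key, title in lowered.items() if key.startswith(q)]
    # Fuzzy matches catch typos; difflib scores string similarity between 0 and 1.
    fuzzy_keys = get_close_matches(q, list(lowered), n=limit, cutoff=cutoff)
    fuzzy_hits = [lowered[k] for k in fuzzy_keys if lowered[k] not in prefix_hits]
    return (prefix_hits + fuzzy_hits)[:limit]


print(suggest("road", "transport"))
print(suggest("rod network", "transport"))
```

Restricting the search to one category keeps the candidate list small, which is presumably why the guide structures its suggestions that way. A Redis-based version would replace the dictionary lookup with an index query.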
@FStriewski I would suggest merging this branch into main for the preprocessing part. All the implemented functions are contained in a separate folder under utils.py and should not conflict with the main branch.
Branch merged into main so that its functions can be used there.
Ranking the search output requires natural language processing. Python libraries such as NLTK, combined with weighting schemes such as TF-IDF, can for example build a relevance matrix from text.
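As a rough illustration of what such a relevance matrix could look like, the sketch below computes TF-IDF weights with the standard library only (no NLTK) and ranks documents against a query. The tokenizer, the smoothed IDF variant, the `tfidf_matrix` and `rank` helpers, and the sample titles are assumptions for the example, not the functions in the merged branch.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_matrix(docs):
    """Return one {term: weight} dict per document (a sparse relevance matrix)."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        matrix.append({
            # Term frequency times a smoothed inverse document frequency.
            term: (count / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return matrix


def rank(query, docs):
    """Return (doc_index, score) pairs, best match first, dropping zero scores."""
    matrix = tfidf_matrix(docs)
    terms = tokenize(query)
    scores = [sum(row.get(term, 0.0) for term in terms) for row in matrix]
    order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order if scores[i] > 0]


# Hypothetical dataset titles, for illustration only.
DOCS = [
    "Flood hazard map of the canton of Bern",
    "Road network of Switzerland",
    "Flood protection structures along rivers",
]

for index, score in rank("flood hazard", DOCS):
    print(f"{score:.3f}  {DOCS[index]}")
```

Summing the weights of the query terms is the simplest possible scoring. Cosine similarity between query and document vectors would be a common next step, and stop-word removal or stemming (for example via NLTK) would likely improve results on real metadata.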