semantic-systems / nfdi-search-engine

A lightweight, KG-driven search engine over different endpoints and APIs
https://nfdi-search.nliwod.org/
MIT License

Feature/deduplicate articles #137

Closed huntila closed 10 months ago

huntila commented 11 months ago

This branch contains the implementation of deduplication. The function deduplicate_search_results() in merger.py receives the final search results from main.py. It currently handles only publications: it sets a similarity threshold, vectorizes each publication using its title and authors, and finally returns the publications that remain after comparing their similarity scores against the threshold value.
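
For reviewers, here is a rough sketch of the idea described above, not the code in this branch. It assumes publications are plain dicts with `title` and `authors` keys, uses a simple bag-of-words vector with cosine similarity, and picks a threshold of 0.85; the function name `deduplicate_publications`, the dict shape, the vectorization method, and the threshold value are all illustrative assumptions, since the description does not specify them.

```python
import math
import re
from collections import Counter


def vectorize(publication):
    """Build a bag-of-words vector from the title and author names.

    Assumption: a publication is a dict with "title" (str) and "authors" (list of str).
    """
    text = publication["title"] + " " + " ".join(publication["authors"])
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def deduplicate_publications(publications, threshold=0.85):
    """Keep a publication only if it is not too similar to one already kept.

    Hypothetical helper; the threshold of 0.85 is an assumed value.
    """
    kept = []
    kept_vectors = []
    for pub in publications:
        vec = vectorize(pub)
        # A publication counts as a duplicate when its similarity to any
        # previously kept publication reaches the threshold.
        if all(cosine_similarity(vec, other) < threshold for other in kept_vectors):
            kept.append(pub)
            kept_vectors.append(vec)
    return kept
```

With this approach the first occurrence of each near-duplicate group is the one that survives, so the order of the incoming results decides which source's record is kept.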