semantic-systems / nfdi-search-engine

A lightweight, KG-driven search engine over different endpoints and APIs
https://nfdi-search.nliwod.org/
MIT License

Deduplication of researchers #141

Open Meisenburger13 opened 11 months ago

janreineke commented 11 months ago

Hi, is it still necessary to get in contact with the people from Paderborn University about the deduplication? Or are you already in contact with Fabian Pause?

abdullah-rana commented 11 months ago

Thanks, Jan. We couldn't get hold of Fabian last week while he was here, so we now have to write to him and ask whether he has any utility or code that can be used for deduplication. In parallel, let's start the discussion with the people from Paderborn University. The more, the merrier. Whichever utility arrives first, or works better, we can embed in our solution.

janreineke commented 11 months ago

OK. I found out that the person I wanted to speak with (Adrian) left the DICE group this year. I will try to find another expert there. By the way, in Coypu we have a task called "Event deduplication" that Junbo is in charge of. Let's talk to him, too.

abdullah-rana commented 11 months ago

Thanks again. We discussed this with Junbo about a week ago. We explained our scenario and asked him how to apply deduplication/entity resolution in our solution. He suggested starting with rule-based filtering and merging of duplicate records to obtain a baseline. Later we can incorporate an entity resolution plugin, if we find one, and compare its output with the baseline results.
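
To make the rule-based baseline concrete, here is a minimal sketch of what it could look like for researcher records. Everything in it is an assumption for illustration, not code from this repository: the field names (`name`, `orcid`, `affiliation`, `source`), the two matching rules, and the merge policy are hypothetical, and the sample ORCID is made up.

```python
import unicodedata
from collections import defaultdict


def normalize(text):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text or "")
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() else " " for c in no_accents.lower())
    return " ".join(cleaned.split())


def match_keys(record):
    """Return the rule keys of a record; records sharing any key are duplicates."""
    keys = []
    if record.get("orcid"):
        # Rule 1 (assumed): identical ORCID means the same researcher.
        keys.append(("orcid", record["orcid"].strip().upper()))
    # Sort name tokens so "Last, First" and "First Last" produce the same key.
    name = " ".join(sorted(normalize(record.get("name")).split()))
    affiliation = normalize(record.get("affiliation"))
    if name and affiliation:
        # Rule 2 (assumed): identical normalized name and affiliation.
        keys.append(("name+affiliation", name, affiliation))
    return keys


def merge(group):
    """Merge one cluster: first non-empty value per field, union of sources."""
    merged = {"sources": sorted({r["source"] for r in group if r.get("source")})}
    for field in ("name", "orcid", "affiliation"):
        merged[field] = next((r[field] for r in group if r.get(field)), None)
    return merged


def deduplicate(records):
    """Cluster records that share a rule key (transitively) and merge each cluster."""
    parent = list(range(len(records)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}
    for index, record in enumerate(records):
        for key in match_keys(record):
            if key in first_seen:
                parent[find(index)] = find(first_seen[key])
            else:
                first_seen[key] = index

    clusters = defaultdict(list)
    for index, record in enumerate(records):
        clusters[find(index)].append(record)
    return [merge(group) for group in clusters.values()]


# Made-up sample data: the first three records describe the same person.
sample = [
    {"name": "Anna Müller", "orcid": "0000-0000-0000-0001",
     "affiliation": "Paderborn University", "source": "A"},
    {"name": "Müller, Anna", "orcid": None,
     "affiliation": "Paderborn University", "source": "B"},
    {"name": "A. Mueller", "orcid": "0000-0000-0000-0001",
     "affiliation": None, "source": "C"},
    {"name": "Anna Müller", "orcid": None,
     "affiliation": "Other University", "source": "D"},
]
baseline = deduplicate(sample)
```

The union-find step is there so that a record matching one cluster by ORCID and another by name and affiliation joins them into a single cluster. The merged output of such a baseline could then serve as the reference against which an entity resolution plugin is compared.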