soilwise-he / Soilwise-Project-Backlog

This README file contains a guideline for using this GitHub soilwise-he repository as a backlog for the Soilwise project.
MIT License

4. identify duplicities #4

Open BerkvensNick opened 1 month ago

BerkvensNick commented 1 month ago

Thanks to an Interlinker component, which will be powered by the SoilWise metadata store, the SWR will identify duplicities in data based on metadata. https://github.com/soilwise-he/Soilwise-userstories/issues/16


With more detailed tasks per requirement:

pvgenuchten commented 1 month ago

This issue needs to be discussed. Duplicities will occur: a knowledge article may be available in Zenodo, OpenAire and Cordis alike. However, each of these platforms captures extra information about the resource. This information should be merged into a single set of statements about the resource, and the knowledge graph will facilitate this process. Along the way we will run into multiple challenges, for example when a resource has different titles on different platforms. The typical behaviour in that case is that both titles are stored.
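To make the merging idea concrete, here is a minimal Python sketch. It assumes each platform record is a plain dictionary of string values; the function name `merge_records` and the field names (`title`, `funder`, `license`) are hypothetical and are not taken from the SoilWise metadata store or the knowledge graph. Conflicting values, such as two different titles, are simply kept side by side.

```python
from collections import defaultdict


def merge_records(records):
    """Merge several descriptions of one resource into a single set of statements.

    Hypothetical helper: every property maps to the sorted list of all distinct
    values seen across the platforms, so differing titles are both stored.
    Values are assumed to be strings or lists of strings.
    """
    merged = defaultdict(set)
    for record in records:
        for prop, value in record.items():
            # Accept both single values and lists of values.
            values = value if isinstance(value, (list, tuple, set)) else [value]
            for item in values:
                if item is not None:
                    merged[prop].add(item)
    return {prop: sorted(values) for prop, values in merged.items()}


# Illustrative records only; the values are made up.
zenodo = {"title": "Soil health indicators", "license": "CC-BY-4.0"}
cordis = {"title": "Indicators of soil health", "funder": "Horizon Europe"}
merged = merge_records([zenodo, cordis])
```

In this sketch `merged["title"]` holds both titles, while `license` and `funder` each carry the extra information that only one platform provided. A real implementation would probably also record which platform each statement came from.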

BerkvensNick commented 1 month ago

Maybe in this first iteration we can identify/flag duplicates based on DOI, title similarity, author and date, and then discuss further with JRC how to tackle the duplicities, based on the actual "duplicate sources" we have found?
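A first-iteration flagging pass along these lines could look like the sketch below. It is only an illustration under stated assumptions: records are dictionaries with hypothetical `id`, `doi`, `title`, `authors` and `date` fields, title similarity uses the standard-library `difflib.SequenceMatcher`, and the 0.9 threshold is an arbitrary starting value that would need tuning on real harvested data.

```python
from difflib import SequenceMatcher
from itertools import combinations

DOI_PREFIXES = ("https://doi.org/", "http://doi.org/",
                "https://dx.doi.org/", "http://dx.doi.org/", "doi:")


def normalize_doi(doi):
    """Lower-case a DOI and strip common resolver prefixes; None if empty."""
    if not doi:
        return None
    doi = doi.strip().lower()
    for prefix in DOI_PREFIXES:
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi or None


def title_similarity(a, b):
    """Case-insensitive similarity ratio between two titles, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def is_probable_duplicate(rec_a, rec_b, title_threshold=0.9):
    """Flag a pair if DOIs match, else if title, an author and the date all agree."""
    doi_a = normalize_doi(rec_a.get("doi"))
    doi_b = normalize_doi(rec_b.get("doi"))
    if doi_a and doi_a == doi_b:
        return True
    # Fallback (assumption): require all three weaker signals together.
    similar_title = title_similarity(rec_a.get("title", ""),
                                     rec_b.get("title", "")) >= title_threshold
    authors_a = {name.strip().lower() for name in rec_a.get("authors", [])}
    authors_b = {name.strip().lower() for name in rec_b.get("authors", [])}
    same_date = bool(rec_a.get("date")) and rec_a.get("date") == rec_b.get("date")
    return similar_title and bool(authors_a & authors_b) and same_date


def flag_duplicates(records):
    """Return (id, id) pairs of records flagged as probable duplicates."""
    return [(a["id"], b["id"])
            for a, b in combinations(records, 2)
            if is_probable_duplicate(a, b)]


# Illustrative records only; identifiers and values are made up.
records = [
    {"id": "zenodo:1", "doi": "https://doi.org/10.1234/ABC",
     "title": "Soil health indicators", "authors": ["A. Smith"], "date": "2023-05-01"},
    {"id": "openaire:9", "doi": "10.1234/abc",
     "title": "Indicators of soil health", "authors": ["Smith, A."], "date": "2023"},
    {"id": "cordis:7", "doi": None,
     "title": "SOIL HEALTH INDICATORS", "authors": ["a. smith"], "date": "2023-05-01"},
    {"id": "zenodo:2", "doi": None,
     "title": "Erosion modelling", "authors": ["B. Jones"], "date": "2021-01-01"},
]
flagged = flag_duplicates(records)
```

The pairwise comparison is quadratic in the number of records, which is fine for a first exploration but would need blocking or indexing for a large catalogue. The flagged pairs are meant as input for the discussion with JRC, not as an automatic merge decision.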