soilwise-he / similarity-finder

a component which finds similarities in a set of statements about a resource
MIT License

what is a duplicate? #4

Open BerkvensNick opened 2 months ago

BerkvensNick commented 2 months ago

Duplicates can be classified through:

Content differences can be classified as

Differences between duplicates can be explained by

When duplicates conflict, it is important to go back to the point of truth (the originating platform) to capture the latest situation. When statements conflict, it is important to store both statements.
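A minimal sketch of the "store both statements" idea, in Python. The `Statement` class, its field names, and `find_conflicts` are hypothetical and only illustrate one possible shape; nothing here reflects an existing similarity-finder API. Each statement keeps its source platform, so a conflict can be traced back to the point of truth.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    # All field names are assumptions made for this illustration.
    resource_id: str  # identifier of the described resource
    prop: str         # property the statement is about, e.g. "title"
    value: str        # asserted value
    source: str       # originating platform (the point of truth)


def find_conflicts(statements):
    """Return {(resource_id, prop): [statements]} for every property
    that has more than one distinct value. No statement is discarded."""
    grouped = defaultdict(list)
    for s in statements:
        grouped[(s.resource_id, s.prop)].append(s)
    return {
        key: group
        for key, group in grouped.items()
        if len({s.value for s in group}) > 1
    }


statements = [
    Statement("res-1", "title", "Soil health indicators", "platform-a"),
    Statement("res-1", "title", "Soil Health Indicators (v2)", "platform-b"),
    Statement("res-1", "year", "2023", "platform-a"),
    Statement("res-1", "year", "2023", "platform-b"),
]
conflicts = find_conflicts(statements)
```

In this example only the `title` property is reported as conflicting, and both title statements remain available together with their source.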

Definition of Done

roblokers commented 1 month ago

Additionally, we need to decide how to implement a "minimal implementation" for the first-iteration demonstration.

pvgenuchten commented 1 month ago

For the first iteration, we focus on academic resources. The risk of conflicts is low, because these resources are properly identified by DOI. I expect more conflicts when we harvest government sources from INSPIRE / open data.
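As an illustration of why DOI-identified resources are easier to deduplicate, here is a hedged Python sketch that groups harvested records by a normalized DOI. The record layout (a dict with a `"doi"` key), the prefix list, and both function names are assumptions for this example only. DOIs are case-insensitive, so lowercasing before comparison is safe.

```python
from collections import defaultdict

# Common ways a DOI is written in harvested metadata (assumed list, not exhaustive).
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(raw):
    """Strip whitespace and a known resolver/scheme prefix, then lowercase."""
    doi = raw.strip().lower()
    for prefix in DOI_PREFIXES:
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi


def group_by_doi(records):
    """Return {normalized_doi: [records]} for DOIs seen more than once."""
    groups = defaultdict(list)
    for record in records:
        groups[normalize_doi(record["doi"])].append(record)
    return {doi: recs for doi, recs in groups.items() if len(recs) > 1}


# Made-up DOIs, for illustration only.
records = [
    {"doi": "https://doi.org/10.1234/ABC", "source": "platform-a"},
    {"doi": "doi:10.1234/abc", "source": "platform-b"},
    {"doi": "10.5678/xyz", "source": "platform-a"},
]
duplicates = group_by_doi(records)
```

Records without a persistent identifier, which may be more common in government sources, would need a different matching strategy than this one.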

BerkvensNick commented 1 month ago

Discuss in this sprint (sprint 3), with the relational database in place in the new setup; implementation in sprint 4.