SSHOC / sshoc-marketplace-backend

Code for the backend
Apache License 2.0

Duplicate check for actors/externalIds #483

Open mkrzmr opened 2 weeks ago

mkrzmr commented 2 weeks ago

While working on an ingest script, I noticed that actors can have duplicate externalIds. For example, the actor below has duplicate entries for GitHub and Twitter:

Each entry is also shown in the frontend, which creates a lot of identical links.

I am going to change my script to avoid creating duplicate entries, but it is something to keep in mind. What is the estimate for adding a check to this endpoint to avoid duplicate externalIds?

{
  "id": 3020,
  "name": "Adam Crymble",
  "externalIds": [
    {
      "identifierService": { "code": "GitHub", "label": "GitHub", "ord": 4, "urlTemplate": "https://github.com/{source-actor-id}" },
      "identifier": "https://github.com/acrymble"
    },
    {
      "identifierService": { "code": "Twitter", "label": "Twitter", "ord": 5, "urlTemplate": "https://twitter.com/{source-actor-id}" },
      "identifier": "https://twitter.com/Adam_Crymble"
    },
    {
      "identifierService": { "code": "SourceActorId", "label": "Source ActorId", "ord": 7, "urlTemplate": "" },
      "identifier": "2-6abc40a103ee9ea16470df510dac897774393bdb2b54fbb066acd0e6e4fc4a07"
    },
    {
      "identifierService": { "code": "GitHub", "label": "GitHub", "ord": 4, "urlTemplate": "https://github.com/{source-actor-id}" },
      "identifier": "acrymble"
    },
    {
      "identifierService": { "code": "Twitter", "label": "Twitter", "ord": 5, "urlTemplate": "https://twitter.com/{source-actor-id}" },
      "identifier": "Adam_Crymble"
    },
    {
      "identifierService": { "code": "SourceActorId", "label": "Source ActorId", "ord": 7, "urlTemplate": "" },
      "identifier": "2-e054591b9e982532467ffc1613578601290b611a81808681e26eaec3748e4093"
    },
    {
      "identifierService": { "code": "DBLP", "label": "dblp", "ord": 2, "urlTemplate": "https://dblp.org/pid/{source-actor-id}" },
      "identifier": "145/8079"
    },
    {
      "identifierService": { "code": "SourceActorId", "label": "Source ActorId", "ord": 7, "urlTemplate": "" },
      "identifier": "103-145/8079"
    }
  ],
  "website": "http://adamcrymble.org",
  "affiliations": [
    {
      "id": 2902,
      "name": "University College London",
      "externalIds": [
        {
          "identifierService": { "code": "SourceActorId", "label": "Source ActorId", "ord": 7, "urlTemplate": "" },
          "identifier": "2-861604c6886ab122aa147fb1cab83479b93a509265663a733e19bbd27ab29181"
        },
        {
          "identifierService": { "code": "ROR", "label": "ROR", "ord": 6, "urlTemplate": "https://ror.org/{source-actor-id}" },
          "identifier": "02jx3x895"
        }
      ],
      "affiliations": []
    }
  ]
}
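In the payload above, the GitHub and Twitter entries are not byte-identical: one copy stores the full URL and the other only the handle. A duplicate check would therefore probably need to normalize identifiers before comparing them. The following is only a rough sketch of how an ingest script might do that, not how the backend works. The function names `normalize_identifier` and `dedupe_external_ids` are hypothetical, and the sketch assumes that `{source-actor-id}` is the only placeholder used in `urlTemplate`, as in the sample.

```python
PLACEHOLDER = "{source-actor-id}"  # placeholder as it appears in the sample payload


def normalize_identifier(external_id):
    """Reduce a full-URL identifier to the bare ID, based on the service's urlTemplate."""
    identifier = external_id["identifier"].strip()
    template = external_id["identifierService"].get("urlTemplate") or ""
    if PLACEHOLDER not in template:
        return identifier  # e.g. SourceActorId has an empty template
    prefix, suffix = template.split(PLACEHOLDER, 1)
    wrapped = (
        len(identifier) > len(prefix) + len(suffix)
        and identifier.startswith(prefix)
        and identifier.endswith(suffix)
    )
    if wrapped:
        identifier = identifier[len(prefix):len(identifier) - len(suffix)]
    return identifier


def dedupe_external_ids(external_ids):
    """Keep the first entry for each (service code, normalized identifier) pair."""
    seen = set()
    unique = []
    for ext in external_ids:
        key = (ext["identifierService"]["code"], normalize_identifier(ext))
        if key not in seen:
            seen.add(key)
            unique.append(ext)
    return unique


# Usage with a trimmed-down version of the payload above
github = {"code": "GitHub", "urlTemplate": "https://github.com/{source-actor-id}"}
dblp = {"code": "DBLP", "urlTemplate": "https://dblp.org/pid/{source-actor-id}"}
sample = [
    {"identifierService": github, "identifier": "https://github.com/acrymble"},
    {"identifierService": github, "identifier": "acrymble"},
    {"identifierService": dblp, "identifier": "145/8079"},
]
deduped = dedupe_external_ids(sample)
print([e["identifier"] for e in deduped])
```

The comparison is case-sensitive and keeps whichever form of the identifier appears first. Entries such as the two distinct SourceActorId hashes would both be kept, since their identifiers differ.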

mkrzmr commented 2 weeks ago

On hold, looking for a better example.