department-of-reuse / DoR

The Department of Reuse tracks and documents reuse of artifacts in computer science (starting with the SE field)
https://www.reuse-dept.org
Creative Commons Attribution 4.0 International

Add Author ORCID Inference for CrossRef #322

Closed by johannesduesing 3 years ago

johannesduesing commented 3 years ago

Reason for this PR

As seen in #67, our data often contains multiple author entries that refer to the same physical person. Some of these entries carry more information than others; for example, some lack affiliation or ORCID information.

Changes in this PR

This PR extends the CrossRef prefill-cache step by accumulating information from different Author objects that represent the same physical person. The combination of the given, family and name fields is used to uniquely identify authors, meaning that Author objects with the same name are treated as the same physical person. For those objects, the affiliation entries are accumulated. An ORCID entry is set on the accumulated object if one is found in any of the matching Author objects.
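The accumulation rule described above could look roughly like the following. This is a minimal sketch, not the code from this PR: the `Author` shape, `authorKey` and `accumulateAuthors` are hypothetical names chosen for illustration, and "first ORCID found wins" is an assumption about how conflicts would be resolved.

```typescript
// Hypothetical shape, loosely modelled on CrossRef author entries.
interface Author {
  given?: string;
  family?: string;
  name?: string;
  ORCID?: string;
  affiliation?: { name: string }[];
}

// Identity key built from given, family and name, as the PR describes.
function authorKey(a: Author): string {
  return JSON.stringify([a.given ?? "", a.family ?? "", a.name ?? ""]);
}

// Merge all Author objects that share a key into one accumulated entry.
function accumulateAuthors(authors: Author[]): Map<string, Author> {
  const merged = new Map<string, Author>();
  for (const a of authors) {
    const key = authorKey(a);
    const acc: Author =
      merged.get(key) ?? { given: a.given, family: a.family, name: a.name };
    // Collect affiliations, skipping names that were already recorded.
    const affs = acc.affiliation ?? [];
    for (const aff of a.affiliation ?? []) {
      if (!affs.some((x) => x.name === aff.name)) {
        affs.push(aff);
      }
    }
    acc.affiliation = affs;
    // Take an ORCID from any matching entry (assumption: first one found wins).
    if (!acc.ORCID && a.ORCID) {
      acc.ORCID = a.ORCID;
    }
    merged.set(key, acc);
  }
  return merged;
}
```

Note that keying on the name alone means two different people with identical names would be merged, which is the trade-off the description accepts.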

The same logic is applied on-the-fly by adding a new AuthorsCache to the WorksCache. On every WorksCache.set it will update the accumulated author information from the new publication, and on every WorksCache.get it will retrieve the latest accumulated author information and attach it to the returned Works object.
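To illustrate the on-the-fly behavior, here is a hedged sketch of how an AuthorsCache might sit inside a WorksCache. Only the names `WorksCache`, `AuthorsCache`, `set` and `get` come from the description; the `Author` and `Work` shapes, the `update` and `lookup` methods, and the use of a DOI string as key are assumptions made for this example.

```typescript
// Hypothetical types; field names loosely follow CrossRef author entries.
interface Author {
  given?: string;
  family?: string;
  name?: string;
  ORCID?: string;
  affiliation?: { name: string }[];
}

interface Work {
  title: string;
  author: Author[];
}

class AuthorsCache {
  private byKey = new Map<string, Author>();

  private static key(a: Author): string {
    return JSON.stringify([a.given ?? "", a.family ?? "", a.name ?? ""]);
  }

  // Fold one Author object into the accumulated entry for its name.
  update(a: Author): void {
    const key = AuthorsCache.key(a);
    const acc: Author = this.byKey.get(key) ?? { ...a, affiliation: [] };
    const affs = acc.affiliation ?? [];
    for (const aff of a.affiliation ?? []) {
      if (!affs.some((x) => x.name === aff.name)) affs.push(aff);
    }
    acc.affiliation = affs;
    if (!acc.ORCID && a.ORCID) acc.ORCID = a.ORCID;
    this.byKey.set(key, acc);
  }

  // Return the accumulated entry, or the input itself if the author is unknown.
  lookup(a: Author): Author {
    return this.byKey.get(AuthorsCache.key(a)) ?? a;
  }
}

class WorksCache {
  private works = new Map<string, Work>();
  private authors = new AuthorsCache();

  // Store the work and update accumulated author information from it.
  set(doi: string, work: Work): void {
    this.works.set(doi, work);
    for (const a of work.author) this.authors.update(a);
  }

  // Return the work with the latest accumulated author information attached.
  get(doi: string): Work | undefined {
    const work = this.works.get(doi);
    if (work === undefined) return undefined;
    return { ...work, author: work.author.map((a) => this.authors.lookup(a)) };
  }
}
```

Because authors are resolved at `get` time, a work stored earlier picks up an ORCID that only appears in a publication added later.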

This PR also includes a freshly generated version of the works-cache.json file.

This PR also contains a (currently inactive) approach to adding ORCID links to author names in the ReuseMetrics.vue page. The feature is deactivated because it breaks the text filter of the author-name column, which is, as far as I know, not acceptable behavior.