Open eloiferrer opened 1 year ago
The ORCIDs for all the zbMath authors in https://zenodo.org/records/7378860 have been inserted.
Current statistics in the KG:
Humans = 1178288
Humans with zbMath ID = 1117009
Humans with ORCID ID = 40109
Humans with zbMath ID and Wikidata QID = 40753
Humans with zbMath ID and ORCID ID = 32619
Humans with arXiv author ID = 127
Next step: get Wikidata QID for as many humans as possible:
Using the zbMath ID, I have matched the authors to items available in Wikidata. Only ~5% of the zbMath authors exist in Wikidata (with the zbMath identifier). Where the matched Wikidata item also had an ORCID, it has been imported as well.
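For reference, the matching step could be sketched roughly as below. This is not the code that was actually run, only an illustration: it builds a SPARQL query for a batch of zbMath IDs and parses a response in the standard SPARQL JSON results format. The helper names `build_query` and `parse_bindings` and the example ID are hypothetical; `P1556` (zbMATH author ID) and `P496` (ORCID iD) are the Wikidata properties assumed to be relevant here.

```python
from typing import Dict, Iterable, List, Optional, Tuple

ZBMATH_AUTHOR_ID = "P1556"  # Wikidata property "zbMATH author ID"
ORCID_ID = "P496"           # Wikidata property "ORCID iD"


def build_query(zbmath_ids: Iterable[str]) -> str:
    """Build a SPARQL query that looks up Wikidata items for a batch of zbMath IDs."""
    # Quote each ID as a SPARQL string literal (hypothetical helper, not project code).
    values = " ".join('"' + i.replace('"', '\\"') + '"' for i in zbmath_ids)
    return (
        "SELECT ?item ?zbmath ?orcid WHERE {\n"
        f"  VALUES ?zbmath {{ {values} }}\n"
        f"  ?item wdt:{ZBMATH_AUTHOR_ID} ?zbmath .\n"
        f"  OPTIONAL {{ ?item wdt:{ORCID_ID} ?orcid . }}\n"
        "}"
    )


def parse_bindings(result: dict) -> Dict[str, List[Tuple[str, Optional[str]]]]:
    """Map each zbMath ID to the (QID, ORCID or None) pairs found in the response."""
    matches: Dict[str, List[Tuple[str, Optional[str]]]] = {}
    for row in result["results"]["bindings"]:
        # Item values are full entity URIs; keep only the trailing QID.
        qid = row["item"]["value"].rsplit("/", 1)[-1]
        orcid = row.get("orcid", {}).get("value")
        matches.setdefault(row["zbmath"]["value"], []).append((qid, orcid))
    return matches
```

Keeping a list of matches per zbMath ID (instead of a single QID) makes ambiguous matches visible, so they are not silently overwritten.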
Current statistics:
I've imported further Wikidata QIDs using the ORCIDs currently in the KG. I've also merged several authors that had the same ORCID ID.
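A minimal sketch of how authors sharing an ORCID could be found and merged, assuming each author is available as a plain dict with an optional `orcid` key. The record layout and the function names `group_by_orcid` and `merge_group` are assumptions for illustration; the actual merge in the KG happens on Wikibase items, not on dicts.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_orcid(authors: List[dict]) -> Dict[str, List[dict]]:
    """Return only those ORCIDs that are attached to more than one author record."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    for author in authors:
        orcid = author.get("orcid")
        if orcid:  # records without an ORCID cannot be matched this way
            groups[orcid].append(author)
    return {orcid: recs for orcid, recs in groups.items() if len(recs) > 1}


def merge_group(records: List[dict]) -> dict:
    """Merge duplicate records; the first non-empty value for each field wins."""
    merged: dict = {}
    for record in records:
        for key, value in record.items():
            if value is not None and key not in merged:
                merged[key] = value
    return merged
```

With "first non-empty value wins", the order of the records decides which value survives a conflict, so conflicting fields should be reviewed before a merge is applied for real.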
Current statistics:
Wikidata has author items that contain two zbMath IDs. In most cases this is wrong, and it leads to our knowledge graph having the same Wikidata QID for two different zbMath authors. See the cases here: http://tinyurl.com/27d65qov. These would require some manual disambiguation.
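To list such cases from an export of the KG, something along these lines could work. It is only a sketch, assuming the mapping is available as `(zbmath_id, qid)` pairs; the function name `conflicting_qids` and the input format are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def conflicting_qids(pairs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Return the QIDs that are attached to more than one distinct zbMath ID."""
    by_qid: Dict[str, Set[str]] = defaultdict(set)
    for zbmath_id, qid in pairs:
        by_qid[qid].add(zbmath_id)
    # Sort the IDs so the report is stable between runs.
    return {qid: sorted(ids) for qid, ids in by_qid.items() if len(ids) > 1}
```

The result only flags candidates. Whether the two zbMath IDs belong to the same person still has to be decided by hand.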
Issue description: The current importers (CRAN, zbMath, polyDB) create entities for authors using ORCID ID, zbMath ID or no identifier. For the cases in which an identifier exists, authors might have been created more than once by different importers. Duplicate authors should be identified, merged and completed with information from Wikidata. The dataset mentioned here (https://github.com/MaRDI4NFDI/portal-compose/issues/344) can be useful for the task.
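One possible way to identify duplicates across importers is to cluster records that share any identifier. The sketch below uses a small union-find over record indices and assumes records are dicts with optional `orcid` and `zbmath_id` keys; these field names and the function `cluster_authors` are illustrative, not the importers' actual data model.

```python
from typing import Dict, List, Tuple


def cluster_authors(records: List[dict]) -> List[List[int]]:
    """Group record indices that are linked by a shared ORCID or zbMath ID."""
    parent = list(range(len(records)))

    def find(i: int) -> int:
        # Follow parent links to the cluster root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen: Dict[Tuple[str, str], int] = {}
    for idx, record in enumerate(records):
        for key in ("orcid", "zbmath_id"):
            value = record.get(key)
            if not value:
                continue  # records without identifiers stay in their own cluster
            ident = (key, value)
            if ident in first_seen:
                parent[find(idx)] = find(first_seen[ident])
            else:
                first_seen[ident] = idx

    clusters: Dict[int, List[int]] = {}
    for idx in range(len(records)):
        clusters.setdefault(find(idx), []).append(idx)
    # Only clusters with more than one record are merge candidates.
    return [members for members in clusters.values() if len(members) > 1]
```

Records without any identifier are never clustered here; matching them (for example by name) would need a separate, more error-prone step. The Wikidata QID is deliberately not used as a linking key, because one QID can be attached to two different zbMath authors.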
TODOS:
Acceptance-Criteria
Checklist for this issue: