Closed larsvilhuber closed 1 year ago
Hi @larsvilhuber. If the most accurate author careers are a high priority, you might want to try DBLP, which is author-curated. You can also take the subset of ORCID-labeled authors from OpenAlex. Otherwise, all other bibliometric databases are subject to author disambiguation errors. We don't have plans to implement an author disambiguation pipeline in pyscisci, but any dataset can be complemented with the user's own publication-author assignments. See, for example, the APS getting started notebook (https://github.com/SciSciCollective/pyscisci/blob/master/examples/Getting_Started/Getting%20Started%20with%20APS.ipynb).
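To make the two suggestions above concrete, here is a rough sketch in plain pandas, not pyscisci's own API. The column names (`PublicationId`, `AuthorId`, `Orcid`, `MyAuthorId`) and all values are made up for illustration and may not match the schema of any particular data source, so treat this as one possible approach under those assumptions.

```python
import pandas as pd

# Hypothetical publication-author table, as it might look after loading a
# data source. Column names and values are illustrative assumptions only.
pub2author = pd.DataFrame({
    "PublicationId": [1, 2, 3, 4],
    "AuthorId": ["A1", "A2", "A3", "A1"],
    "Orcid": ["ORCID-A", None, "ORCID-B", "ORCID-A"],
})

# Option 1: keep only the rows whose author carries an ORCID label.
orcid_only = pub2author[pub2author["Orcid"].notna()]

# Option 2: complement the source with your own assignments. Here the user
# has decided that source ids A2 and A3 are the same person.
my_assignments = pd.DataFrame({
    "PublicationId": [2, 3],
    "AuthorId": ["A2", "A3"],
    "MyAuthorId": ["person_x", "person_x"],
})

# Left merge keeps every source row; unmatched rows get NaN in MyAuthorId.
merged = pub2author.merge(
    my_assignments, on=["PublicationId", "AuthorId"], how="left"
)

# Prefer the user's id where one exists, otherwise fall back to the source id.
merged["ResolvedAuthorId"] = merged["MyAuthorId"].fillna(merged["AuthorId"])
```

Downstream career analyses could then group on `ResolvedAuthorId` instead of the source's own author id, which leaves the original ids in place for comparison.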
At least for OpenAlex, I have found their author identity algorithm to have some problems (identifying unique authors, i.e. entity resolution). Any thoughts on discarding a data source's unique author id and constructing your own via a plug-in?