SciSciCollective / pyscisci

Science of Science
MIT License

For career oriented metrics, any hope of adding plug-in entity resolution algorithms? #15

Closed larsvilhuber closed 1 year ago

larsvilhuber commented 1 year ago

At least for OpenAlex, I have found their author identity algorithm (identifying unique authors, i.e. entity resolution) to have some problems. Any thoughts on discarding a data source's unique author id and constructing your own via a plug-in?

ajgates42 commented 1 year ago

Hi @larsvilhuber. If accurate author careers are a high priority, you might want to try DBLP, which is author curated. You can also take the subset of ORCID-labeled authors from OpenAlex. Otherwise, all other bibliometric databases are subject to author disambiguation errors. We don't have plans to implement an author disambiguation pipeline in pyscisci, but any dataset can be complemented with the user's own publication-author assignments. See, for example, the APS getting started notebook (https://github.com/SciSciCollective/pyscisci/blob/master/examples/Getting_Started/Getting%20Started%20with%20APS.ipynb).
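
As a rough illustration of complementing a dataset with your own publication-author assignments, here is a minimal sketch in plain pandas. It does not use any pyscisci function; the column names (`PublicationId`, `AuthorId`, `ResolvedAuthorId`), the helper `apply_author_assignments`, and the toy data are all assumptions made up for this example, and the output of your own entity resolution step is simply assumed to be available as a table.

```python
import pandas as pd

# Hypothetical publication-author table as delivered by a data source.
# Column names and values are illustrative assumptions.
source_pa = pd.DataFrame({
    "PublicationId": [1, 2, 3, 4],
    "AuthorId": ["A1", "A2", "A3", "A3"],
    "FullName": ["J. Smith", "John Smith", "Jane Doe", "J. Doe"],
})

# Hypothetical output of the user's own entity resolution:
# one resolved author id per (publication, source author id) pair.
# Pairs that are not listed keep the source id.
my_assignments = pd.DataFrame({
    "PublicationId": [1, 2, 4],
    "AuthorId": ["A1", "A2", "A3"],
    "ResolvedAuthorId": ["smith_john", "smith_john", "doe_joe"],
})


def apply_author_assignments(pa, assignments):
    """Attach user-resolved author ids, falling back to the source id."""
    out = pa.merge(assignments, on=["PublicationId", "AuthorId"], how="left")
    # Rows without a user assignment keep the data source's author id.
    out["ResolvedAuthorId"] = out["ResolvedAuthorId"].fillna(out["AuthorId"])
    return out


resolved_pa = apply_author_assignments(source_pa, my_assignments)

# Career-level quantities are then grouped on the resolved id
# instead of the source id, e.g. the number of publications per author.
career_sizes = resolved_pa.groupby("ResolvedAuthorId")["PublicationId"].nunique()
print(career_sizes)
```

Keying the assignments on the publication as well as the source author id lets the same table both merge profiles the source split apart (`A1` and `A2` here) and split a profile the source conflated (`A3` here).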