pybliometrics-dev / pybliometrics

Python-based API-Wrapper to access Scopus
https://pybliometrics.readthedocs.io/en/stable/

Discrepancy between ScopusAuthor and ScopusAbstract #63

Closed capemaster closed 6 years ago

capemaster commented 6 years ago

There is a discrepancy in the data retrieved between these lines:

from scopus import ScopusAuthor, ScopusSearch, ScopusAbstract
author = "24340839100"
auDetails = ScopusAuthor(author, refresh=True)
print(auDetails.ndocuments)
# n=43

and these

s = ScopusSearch('AU-ID(' + str(author) + ')', refresh=True)
print(len(s.EIDS))
# n=45

Why? Is this an issue with the API itself or with the function? I have noticed that not all authors are affected by this. The second number is the one displayed on the official Scopus website.

Michael-E-Rose commented 6 years ago

Hi Capemaster! Thanks for the report, very interesting. I looked into the numbers, and the discrepancy comes from the API. It is important to note, however, that the API and the Scopus web view use different underlying data. That is, the second number is the one shown in the Scopus web view, while the first number is the one the API reports: https://api.elsevier.com/content/author/author_id/24340839100?APIKey=ADDYOURKEYHERE.

Author 24340839100 authored 41 articles, 2 conference papers, 1 book chapter and 1 note. You can see this by listing the author's publications as search results and then looking at the distribution of document types in the left pane.

The document-count field, which is what .ndocuments refers to, apparently excludes notes and book chapters.
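As a quick check of the arithmetic, here is a minimal sketch that reconciles the two numbers using only the counts quoted above. It makes no API calls. The names `doc_types` and `EXCLUDED` are just for illustration, and the exclusion rule is my reading of the numbers, not something confirmed by Elsevier.

```python
from collections import Counter

# Document-type breakdown for author 24340839100, as quoted in this thread.
doc_types = Counter({
    "Article": 41,
    "Conference Paper": 2,
    "Book Chapter": 1,
    "Note": 1,
})

# Assumption: these are the types left out of the API's document-count.
EXCLUDED = {"Book Chapter", "Note"}

# Everything the search returns (what len(s.EIDS) counts).
search_total = sum(doc_types.values())

# Only the types that appear to be counted in document-count (.ndocuments).
profile_total = sum(n for t, n in doc_types.items() if t not in EXCLUDED)

print(search_total)   # 45
print(profile_total)  # 43
```

Both totals match what was reported, so the difference of two documents is fully explained by the one book chapter and the one note.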

I will update the documentation to reflect that information.