pybliometrics-dev / pybliometrics

Python-based API-Wrapper to access Scopus
https://pybliometrics.readthedocs.io/en/stable/

[BUG] Citations end before they should #208

Closed raffaem closed 2 years ago

raffaem commented 3 years ago

In the following MWE, the requested end year is 2021, but citations are only downloaded up to 2020.

MWE:

from pybliometrics.scopus import CitationOverview
res = CitationOverview(identifier=['59849121259', '60349100050', '60449098796', '67949117113', '75149132848', '75949120545', '77950209904', '77955010108', '77957123617', '78650692942', '79952350232', '84856112748', '84856718629', '84861671006', '84861696683', '84864486999', '84864540133', '84866334464', '84873182821', '84873576457', '84879096100', '84880922726', '84885190755', '84887012392', '84896442897'], start=2007, end=2021)
years = [year for doc in res.cc for year, citn in doc]
print(max(years))

Output:

$ python3 bug_less_than_end_year.py 
2020

Michael-E-Rose commented 3 years ago

I cannot replicate this. My output is 2021.

Does your output change if you put refresh=True?

raffaem commented 2 years ago

Yes, if I set refresh=True my output changes and I obtain 2021.

Michael-E-Rose commented 2 years ago

This can happen when you ran the same query with the same parameters at an earlier date, when Scopus did not yet list any citations for 2021. Since pybliometrics uses the parameters to generate the file name of the cached data, you were looking at that old state. Setting refresh=True downloads the data again and replaces the cached file.
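
To illustrate the mechanism, here is a minimal, self-contained sketch of a cache keyed on the query parameters. It is not pybliometrics' actual implementation: `cache_path`, `cached_query`, `fake_fetch`, and `remote_state` are hypothetical names, and the hashing scheme and temporary directory are assumptions made for the example. It only shows why an unchanged set of parameters keeps returning the older result until a refresh is forced.

```python
import hashlib
import json
import tempfile
from pathlib import Path

# Hypothetical cache directory (assumption); the real library uses its own location.
CACHE_DIR = Path(tempfile.mkdtemp())


def cache_path(params: dict) -> Path:
    """Derive a file name from the query parameters only, not from the date."""
    key = json.dumps(params, sort_keys=True)
    return CACHE_DIR / (hashlib.md5(key.encode()).hexdigest() + ".json")


def cached_query(params: dict, fetch, refresh: bool = False) -> dict:
    """Return the cached result unless it is missing or a refresh is requested."""
    path = cache_path(params)
    if path.exists() and not refresh:
        return json.loads(path.read_text())
    result = fetch(params)
    path.write_text(json.dumps(result))
    return result


# Stand-in for the remote database, whose content changes over time.
remote_state = {"last_year_with_citations": 2020}


def fake_fetch(params: dict) -> dict:
    """Hypothetical download function returning the current remote state."""
    return dict(remote_state)


params = {"identifier": ["59849121259"], "start": 2007, "end": 2021}

first = cached_query(params, fake_fetch)                 # downloaded and cached
remote_state["last_year_with_citations"] = 2021          # remote data changes later
stale = cached_query(params, fake_fetch)                 # same parameters: old cached copy
fresh = cached_query(params, fake_fetch, refresh=True)   # forces a new download

print(first["last_year_with_citations"],
      stale["last_year_with_citations"],
      fresh["last_year_with_citations"])
# prints: 2020 2020 2021
```

The second call returns 2020 even though the remote data already contains 2021, because the file name depends only on the parameters. This matches the behaviour reported above, where only refresh=True produced 2021.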