pybliometrics-dev / pybliometrics

Python-based API-Wrapper to access Scopus
https://pybliometrics.readthedocs.io/en/stable/

Citing papers list does not match with the citedby_count #277

AndreaUnige closed 1 year ago

AndreaUnige commented 1 year ago

pybliometrics version: 3.4.0

The list of citing papers does not match citedby_count.

More precisely, running the code below should return all papers citing the paper with eid='2-s2.0-85084808560', but it returns a list of only 25 articles. The Scopus website, however, shows 32 citations for the same paper.

In other words, the list returned by my code contains only the first 25 papers; the last 7 are missing.

This happens with several other papers as well, while for many others the code works fine: the returned list of citing papers is exactly the one shown on the Scopus website.

Code to reproduce the bug:

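import requests

# Passing params as a dict avoids the nested-quote syntax error and lets requests URL-encode the query.
response = requests.get('https://api.elsevier.com/content/search/scopus',
                        params={'apikey': 'MY_SECRET_KEY',
                                'query': "refeid('2-s2.0-85084808560')"},
                        timeout=30)
response = response.json()['search-results']['entry']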

response is a list of 25 elements. It should contain the citing papers (i.e., the papers that cite the paper with eid='2-s2.0-85084808560'). However, on the Scopus platform the paper with eid='2-s2.0-85084808560' has 32 citations. Direct link: https://www.scopus.com/record/display.uri?eid=2-s2.0-85084808560&origin=resultslist&sort=plf-f&src=s&sid=4d7bac0cb873a9e6a826cbe36b09781b&sot=aut&sdt=a&sl=17&s=AU-ID%288546192900%29&relpos=18&citeCnt=32&searchTerm=

Expected behavior: I expected a list of 32 elements which contains all the citing papers.

AndreaUnige commented 1 year ago

I did some further digging. It seems that 25 is the number of items per page.

So the question becomes: how do I get the results from the next page?

Michael-E-Rose commented 1 year ago

You say you use pybliometrics, but the code you provide does not use any pybliometrics class. To answer your question, "how to get the results from the next page", the answer is: Use pybliometrics.

pybliometrics simplifies "walking" through Scopus results. Like so:

from pybliometrics.scopus import ScopusSearch

q = "REF(2-s2.0-85084808560)"

s = ScopusSearch(q)
print(s)

Output:

Search 'REF(2-s2.0-85084808560)' yielded 32 documents as of 2023-02-10:
    2-s2.0-85145874372
    2-s2.0-85140996430
    2-s2.0-85139502803
    2-s2.0-85130679706
    2-s2.0-85124207749
    2-s2.0-85123274495
    2-s2.0-85130196652
    2-s2.0-85120890713
    2-s2.0-85118545737
    2-s2.0-85146247035
    2-s2.0-85135927840
    2-s2.0-85135218350
    2-s2.0-85130602960
    2-s2.0-85130591267
    2-s2.0-85122858958
    2-s2.0-85101450906
    2-s2.0-85097952145
    2-s2.0-85118261941
    2-s2.0-85118253765
    2-s2.0-85116307583
    2-s2.0-85107224660
    2-s2.0-85105431689
    2-s2.0-85101925769
    2-s2.0-85097963222
    2-s2.0-85126734120
    2-s2.0-85124373050
    2-s2.0-85124356071
    2-s2.0-85123350538
    2-s2.0-85118250325
    2-s2.0-85107168853
    2-s2.0-85106976784
    2-s2.0-85105467352
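
For anyone who has to stay with raw requests calls, the "walking" can also be sketched by hand. This is only a minimal sketch, assuming the Scopus Search API accepts start and count query parameters and reports the overall hit count in opensearch:totalResults. The names collect_all_entries and fetch_page_from_scopus are hypothetical helpers, and this is not necessarily how pybliometrics pages internally.

```python
from typing import Callable, Dict, List


def collect_all_entries(fetch_page: Callable[[int, int], Dict],
                        page_size: int = 25) -> List[Dict]:
    """Request page after page until all reported results are collected.

    fetch_page(start, count) is a hypothetical callable returning the
    decoded JSON of one Scopus Search response.
    """
    entries: List[Dict] = []
    while True:
        # The offset of the next page equals the number of entries seen so far.
        results = fetch_page(len(entries), page_size)["search-results"]
        total = int(results["opensearch:totalResults"])
        page = results.get("entry", [])
        # Stop on an empty result set or an unexpectedly empty page.
        if total == 0 or not page:
            return entries
        entries.extend(page)
        if len(entries) >= total:
            return entries


def fetch_page_from_scopus(start: int, count: int) -> Dict:
    """Fetch one page of citing documents (start/count are assumed parameters)."""
    import requests  # imported here so the paging logic above has no third-party dependency

    response = requests.get(
        "https://api.elsevier.com/content/search/scopus",
        params={
            "apikey": "MY_SECRET_KEY",  # placeholder, as in the original report
            "query": "REF(2-s2.0-85084808560)",
            "start": start,
            "count": count,
        },
        timeout=30,
    )
    response.raise_for_status()
    return response.json()
```

With a valid key, collect_all_entries(fetch_page_from_scopus) should then need two requests (25 + 7 entries) rather than stopping after the first page.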

Michael-E-Rose commented 1 year ago

PS: Please next time ask this kind of question (how to do xyz?) on StackOverflow.