pybliometrics-dev / pybliometrics

Python-based API-Wrapper to access Scopus
https://pybliometrics.readthedocs.io/en/stable/

CitationOverview's cc property unexpected Error. #322

Closed kosh-jp closed 6 months ago

kosh-jp commented 7 months ago

pybliometrics version: 3.6

Code to reproduce the bug: The error occurs for one specific DOI, 10.11604/pamj.2015.22.35.6737.

It is similar to this issue: https://github.com/pybliometrics-dev/pybliometrics/issues/301

code

from pybliometrics.scopus import CitationOverview

doi = "10.11604/pamj.2015.22.35.6737"

co = CitationOverview(
    [doi],
    2013,
    id_type="doi",
    APIKey=SCOPUS_API_KEY,
    InstToken=SCOPUS_INST_TOKEN,
)

print(co.cc[0])

result

...
  File "/mydir/env/lib/python3.10/site-packages/pybliometrics/scopus/abstract_citation.py", line 48, in cc
    cites = [int(d["$"]) for d in doc["cc"]]
KeyError: 'cc'

It looks like the Scopus API does not return the cc key for this record:

'_citeInfoMatrix': [
    {
        "@_fa": "true",
        "identifier": "SCOPUS_ID:84973517467",
        "url": "https://api.elsevier.com/content/abstract/scopus_id/84973517467",
        "pcc": "0",
        "lcc": "0",
        "rangeCount": "0",
        "rowTotal": "0"
    }
]
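
To make the failure mode concrete, here is a minimal sketch of a lookup that tolerates an entry without the cc key. The helper name `yearly_citations` and its signature are hypothetical and not part of pybliometrics, and this is not the fix that was eventually applied; it only assumes that each entry looks like the dictionary above and that, when present, cc is a list of dictionaries with a "$" field, as the traceback suggests.

```python
from typing import Any, Dict, List, Tuple


def yearly_citations(doc: Dict[str, Any], start: int, end: int) -> List[Tuple[int, int]]:
    """Pair each year in [start, end] with its citation count.

    Hypothetical helper, not part of pybliometrics. `doc` is one entry
    of the citation matrix returned by the API.
    """
    years = range(start, end + 1)
    raw = doc.get("cc")  # None when the API omits the key entirely
    if not raw:
        # Assumption: a missing or empty "cc" means zero citations per year
        return [(year, 0) for year in years]
    cites = [int(d["$"]) for d in raw]
    return list(zip(years, cites))


entry = {
    "@_fa": "true",
    "identifier": "SCOPUS_ID:84973517467",
    "pcc": "0",
    "lcc": "0",
    "rangeCount": "0",
    "rowTotal": "0",
}
print(yearly_citations(entry, 2013, 2015))
```

Treating a missing key as zero citations is a design choice: it keeps the caller running, but it also hides the fact that the API response was incomplete.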

kosh-jp commented 7 months ago

I have created a PR based on your previous modifications. I would appreciate it if you could review it and adjust it as needed.

Michael-E-Rose commented 7 months ago

According to the Scopus search, this DOI belongs to EID 2-s2.0-84973517467 - but it apparently doesn't exist! See here: https://www.scopus.com/record/display.uri?eid=2-s2.0-84973517467&origin=resultslist

So the Scopus API should actually return 404. It's a problem in the Scopus database, which should be reported.

I value it tremendously when people come up with a PR, but in this case I am hesitant to accept it. The PR would cover up an apparent mistake in the Scopus database.

kosh-jp commented 6 months ago

I understand that the PR is not appropriate.

Thank you for your reply and response!!

astrochun commented 6 months ago

I have a similar issue with another record, 85185347473. However, upon reviewing it, it seems that the record exists:

{'@_fa': 'true', 'identifier': 'SCOPUS_ID:85185347473', 'url': 'https://api.elsevier.com/content/abstract/scopus_id/85185347473', 'pcc': '0', 'lcc': '0', 'rangeCount': '0', 'rowTotal': '0'}
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
...
     44 try:
---> 45     cites = [int(d['$']) for d in doc['cc']]
     46 except AttributeError:  # No citations
     47     cites = [0]*len(_years)

KeyError: 'cc'

https://www.scopus.com/record/display.uri?eid=2-s2.0-85185347473&origin=resultslist
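
The traceback shows why the existing handler does not help: a missing dictionary key raises KeyError, which `except AttributeError` does not catch. A small standalone sketch of that behaviour follows; the function `parse` and its `handled` parameter are made up for illustration and do not mirror the library's internals beyond the two lines visible in the traceback.

```python
# Entry without a "cc" key, shaped like the record above
doc = {"pcc": "0", "lcc": "0", "rangeCount": "0", "rowTotal": "0"}


def parse(doc, n_years, handled):
    """Return per-year counts, falling back to zeros for `handled` errors.

    Hypothetical function for illustration only.
    """
    try:
        return [int(d["$"]) for d in doc["cc"]]
    except handled:
        return [0] * n_years


try:
    parse(doc, 3, AttributeError)
except KeyError as exc:
    # The lookup fails with KeyError, so the AttributeError handler is skipped
    print("not caught:", repr(exc))

# Listing KeyError as well lets the fallback apply
print(parse(doc, 3, (AttributeError, KeyError)))
```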

Michael-E-Rose commented 6 months ago

Interesting case: Scopus.com says the article actually was cited. I wonder whether this is a bug in Scopus, because the other keys in the citation matrix are present and set to 0. Anyway, smoothing over these kinds of inconsistencies and irregularities in the Scopus API is the job of pybliometrics ^^

Michael-E-Rose commented 6 months ago

Let's hope this solves it

astrochun commented 6 months ago

Let's hope this solves it

Thanks. I'll wait for a new PyPI release to pull and try it out.