pybliometrics-dev / pybliometrics

Python-based API-Wrapper to access Scopus
https://pybliometrics.readthedocs.io/en/stable/

Gathering funding metadata does not seem to work #209

Closed astrochun closed 3 years ago

astrochun commented 3 years ago

Bug report? Please state your pybliometrics version and a complete code snippet to reproduce the bug.

$ pip list
...
pybliometrics      3.0.1

I'm retrieving funding metadata from Scopus and I'm using your well-documented software. I get a 200 response; however, the funding metadata is empty. Here's my code to reproduce:

from pybliometrics.scopus import AbstractRetrieval
ab = AbstractRetrieval('10.1016/j.physleta.2020.126979')

ab.title
# 'A class of solutions of the two-dimensional Toda lattice equation'

ab.funding
# NoneType returned
ab.funding_text
# NoneType returned

I'm using the correct apiToken. In fact, I ran the same request through the Elsevier Interactive API tool (https://dev.elsevier.com/scopus.html#/Affiliation_Search) and got the following (trimmed for brevity):

{
  "abstracts-retrieval-response": {
    "item": {
      "ait:process-info": {
        "ait:status": {
          "@state": "new",
          "@type": "core",
          "@stage": "S300"
        },
        "ait:date-delivered": {
          "@day": "01",
          "@timestamp": "2020-11-01T22:32:01.000001-05:00",
          "@year": "2020",
          "@month": "11"
        },
        "ait:date-sort": {
          "@day": "07",
          "@year": "2021",
          "@month": "01"
        }
      },
      "xocs:meta": {
        "xocs:funding-list": {
          "@pui-match": "primary",
          "@has-funding-info": "1",
          "xocs:funding": {
            "xocs:funding-agency-matched-string": "US Department of Energy",
            "xocs:funding-agency-acronym": "USDOE",
            "xocs:funding-agency": "U.S. Department of Energy",
            "xocs:funding-id": "DE-AC02-09CH11466",
            "xocs:funding-agency-id": "http://data.elsevier.com/vocabulary/SciValFunders/100000015",
            "xocs:funding-agency-country": "http://sws.geonames.org/6252001/"
          },
          "xocs:funding-addon-generated-timestamp": "2021-05-13T20:50:55.149625Z",
          "xocs:funding-text": "Discussions with V.L. Quito are appreciated. This work was supported by the US Department of Energy under contract DE-AC02-09CH11466 .",
          "xocs:funding-addon-type": "http://vtw.elsevier.com/data/voc/AddOnTypes/50.7/aggregated-refined"
        }
      },
     ...

I can't spot the exact issue as it seems that you are using the correct metadata fields for chained_get.
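To make the symptom concrete, here is a minimal standard-library sketch of a chained lookup over the JSON above. `nested_get` is a hypothetical stand-in written for this example and is not pybliometrics' own `chained_get`; the second dictionary is an assumed shape for a response that lacks the `item` branch, not a captured API reply.

```python
def nested_get(container, path, default=None):
    """Follow `path` (a sequence of keys) through nested dicts.

    Returns `default` as soon as a key is missing.
    """
    current = container
    for key in path:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Key path taken from the JSON quoted in this issue.
FUNDING_PATH = [
    "abstracts-retrieval-response",
    "item",
    "xocs:meta",
    "xocs:funding-list",
    "xocs:funding",
]

# Trimmed copy of the response quoted above.
with_funding = {
    "abstracts-retrieval-response": {
        "item": {
            "xocs:meta": {
                "xocs:funding-list": {
                    "@has-funding-info": "1",
                    "xocs:funding": {
                        "xocs:funding-agency": "U.S. Department of Energy",
                        "xocs:funding-id": "DE-AC02-09CH11466",
                    },
                }
            }
        }
    }
}

# Assumed shape: a response without the "item" branch.
without_item = {"abstracts-retrieval-response": {"coredata": {}}}

funding = nested_get(with_funding, FUNDING_PATH)
missing = nested_get(without_item, FUNDING_PATH)
```

With the first payload the lookup reaches the funding dictionary; with the second it falls back to `None`, which matches what `ab.funding` returned here even though the key path itself is correct.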

Thanks in advance.

Michael-E-Rose commented 3 years ago

Try using view="FULL".

The abstract retrieval API provides multiple views, but they are not well documented: https://dev.elsevier.com/sc_abstract_retrieval_views.html.

In general, always go for the FULL view. In theory, this view is restricted to some users (according to the documentation), but I have never met anyone who couldn't access it.
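Applied to the snippet from the report, the suggestion amounts to passing `view="FULL"` to `AbstractRetrieval`. The helper below is only a sketch: `fetch_funding` and its `retrieval_cls` parameter are hypothetical names introduced here so the call can be exercised with a stand-in class; by default it imports the real `AbstractRetrieval`, which needs a configured API key and network access.

```python
def fetch_funding(doi, retrieval_cls=None):
    """Return (funding, funding_text) for a DOI, requesting the FULL view."""
    if retrieval_cls is None:
        # Imported lazily so the helper also works with an injected stand-in.
        from pybliometrics.scopus import AbstractRetrieval as retrieval_cls
    ab = retrieval_cls(doi, view="FULL")
    return ab.funding, ab.funding_text
```

For the document in this issue, the call would be `fetch_funding('10.1016/j.physleta.2020.126979')`.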

astrochun commented 3 years ago

Yes, that fixed the issue. Thanks for the extended docs. I saw the other Scopus Search views but managed to miss the abstract one.

A related question, and perhaps this belongs in a new feature request: any chance of including the xocs:funding-id metadata in Funding? I can create the issue. If you think this is straightforward, I can even create the PR.

Michael-E-Rose commented 3 years ago

The funding ID is part of the namedtuple in the list object funding. But the agency ID is not implemented. If you wanna provide a PR, that would be really great!

astrochun commented 3 years ago

> The funding ID is part of the namedtuple in the list object funding. But the agency ID is not implemented. If you wanna provide a PR, that would be really great!

@Michael-E-Rose, I'm a bit confused. I think there are different IDs involved. There is an agency ID and a "grant" id, which is provided as "funding-id" by Scopus. The Funding object has the following attributes: agency, string, id, acronym, and country. For example:

Funding(
    agency='U.S. Department of Energy',
    string='US Department of Energy',
    id='http://data.elsevier.com/vocabulary/SciValFunders/100000015',
    acronym='USDOE',
    country='http://sws.geonames.org/6252001/'
)

I'm after the grant number, i.e. "DE-AC02-09CH11466".

I can create the issue and the PR for it. Before I start, though: what should I name the attribute, since id is already taken? Would award_nos be good? I have not yet checked whether an entry can carry multiple grant numbers, in which case the attribute should probably be a list. Thoughts?
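As a rough illustration of the proposal, the sketch below extends the Funding tuple shown above with an extra field and normalises the grant number to a list. Everything here is an assumption for discussion: `award_nos` is only the name floated in this comment, `as_list` and `parse_funding` are hypothetical helpers, and multiple grant numbers are assumed to arrive as a plain list of strings.

```python
from collections import namedtuple

# Hypothetical extension of the existing fields; `award_nos` is a proposed
# name, not a confirmed pybliometrics attribute.
Funding = namedtuple("Funding", "agency string id acronym country award_nos")


def as_list(value):
    """Normalise a missing, single, or repeated JSON value to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def parse_funding(entry):
    """Build a Funding tuple from one "xocs:funding" dictionary."""
    return Funding(
        agency=entry.get("xocs:funding-agency"),
        string=entry.get("xocs:funding-agency-matched-string"),
        id=entry.get("xocs:funding-agency-id"),
        acronym=entry.get("xocs:funding-agency-acronym"),
        country=entry.get("xocs:funding-agency-country"),
        # Assumption: one grant arrives as a string, several as a list.
        award_nos=as_list(entry.get("xocs:funding-id")),
    )


# The "xocs:funding" entry from the JSON quoted earlier in this issue.
entry = {
    "xocs:funding-agency-matched-string": "US Department of Energy",
    "xocs:funding-agency-acronym": "USDOE",
    "xocs:funding-agency": "U.S. Department of Energy",
    "xocs:funding-id": "DE-AC02-09CH11466",
    "xocs:funding-agency-id": "http://data.elsevier.com/vocabulary/SciValFunders/100000015",
    "xocs:funding-agency-country": "http://sws.geonames.org/6252001/",
}

parsed = parse_funding(entry)
```

Always returning a list keeps the attribute's type stable whether an entry carries zero, one, or several grant numbers, at the cost of a one-element list in the common case.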

Michael-E-Rose commented 3 years ago

Closed via #210 and corresponding PR #211.