pybliometrics-dev / pybliometrics

Python-based API-Wrapper to access Scopus
https://pybliometrics.readthedocs.io/en/stable/

Discrepancies between the affiliation information shown on Scopus and that returned by pybliometrics #152

Closed raffaem closed 4 years ago

raffaem commented 4 years ago

Maybe this is related to #151

When I try to run the following query:

PUBYEAR IS 2010 AND AFFILCOUNTRY(Sweden) AND NOT SUBJAREA(ARTS OR BUSI OR DECI OR ECON OR PSYC OR SOCI)

through pybliometrics.scopus.ScopusSearch, one of the results is the following (notice that the namedtuple returned by the .results attribute has been converted into a dictionary):

{
        "affiliation_city": "Delft;Siena;Göteborg",
        "affiliation_country": "Netherlands;Italy;Sweden",
        "affilname": "Delft University of Technology;Università degli Studi di Siena;Chalmers University of Technology",
        "afid": "60006288;60002838;60000990",
        "aggregationType": "Conference Proceeding",
        "article_number": "5613090",
        "authkeywords": null,
        "author_afids": "60002838;60000990;109703819;109703819;;60006288;60002838",
        "author_count": "7",
        "author_ids": "24468923000;7102278302;6506832196;56233097100;56960044800;55451141300;7006417432",
        "author_names": "Puzović, Nikola;McKee, Sally A.;Eres, Revital;Zaks, Ayal;Gai, Paolo;Wong, Stephan;Giorgi, Roberto",
        "citedby_count": "6",
        "coverDate": "2010-12-13",
        "coverDisplayDate": "2010",
        "creator": "Puzović N.",
        "description": "Understanding the behavior of current and future workloads is key for designers of future computer systems. If target workload characteristics are available, computer designers can use this information to optimize the system. This can lead to a chicken-and-egg problem: how does one characterize application behavior for an architecture that is a moving target and for which sophisticated modeling tools do not yet exist? We present a multi-pronged approach to benchmark characterization early in the design cycle. We collect statistics from multiple sources and combine them to create a comprehensive view of application behavior. We assume a fixed part of the system (service core) and a \"to-be-designed\" part that will gradually be developed under the measurements taken on the fixed part. Data are collected from measurements taken on existing hardware and statistics are obtained via emulation tools. These are supplemented with statistics extracted from traces and ILP information generated by the compiler. Although the motivation for this work is the classification of workloads for an embedded, reconfigurable, parallel architecture, the methodology can easily be adapted to other platforms. © 2010 IEEE.",
        "doi": "10.1109/CLUSTERWKSP.2010.5613090",
        "eIssn": null,
        "eid": "2-s2.0-78649899217",
        "fund_acr": null,
        "fund_no": "undefined",
        "fund_sponsor": null,
        "issn": null,
        "issueIdentifier": null,
        "openaccess": "0",
        "pageRange": null,
        "pii": null,
        "publicationName": "2010 IEEE International Conference on Cluster Computing Workshops and Posters, Cluster Workshops 2010",
        "pubmed_id": null,
        "source_id": "19700182719",
        "subtype": "cp",
        "subtypeDescription": "Conference Paper",
        "title": "A multi-pronged approach to benchmark characterization",
        "volume": null
    }

Notice that the 3rd and 4th authors both have an affiliation ID equal to "109703819", and this affiliation ID does not appear in the list of affiliation IDs (the field "afid"). Notice also that the 5th author has a missing (empty) affiliation ID.
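The check described above can be reproduced mechanically. The following is a minimal sketch, not part of pybliometrics: the helper name `find_affiliation_gaps` is hypothetical, the three field values are copied from the record above, and it assumes each author carries at most one affiliation ID, as is the case in this record.

```python
# Field values copied from the record above; only the relevant fields are kept.
record = {
    "afid": "60006288;60002838;60000990",
    "author_afids": "60002838;60000990;109703819;109703819;;60006288;60002838",
    "author_names": (
        "Puzović, Nikola;McKee, Sally A.;Eres, Revital;Zaks, Ayal;"
        "Gai, Paolo;Wong, Stephan;Giorgi, Roberto"
    ),
}


def find_affiliation_gaps(rec):
    """Hypothetical helper: compare per-author IDs against the document-level list.

    Returns a pair (unlisted, empty):
      unlisted: (author name, afid) pairs whose afid is absent from "afid"
      empty:    names of authors whose entry in "author_afids" is empty
    Assumes one affiliation ID per author.
    """
    listed = set(rec["afid"].split(";"))
    names = rec["author_names"].split(";")
    afids = rec["author_afids"].split(";")

    unlisted, empty = [], []
    for name, afid in zip(names, afids):
        if not afid:
            empty.append(name)
        elif afid not in listed:
            unlisted.append((name, afid))
    return unlisted, empty


unlisted, empty = find_affiliation_gaps(record)
print("Not in afid:", unlisted)
print("Empty afid:", empty)
```

For this record, the first list holds the 3rd and 4th authors with ID "109703819", and the second holds the 5th author.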

But if you look up the document on Scopus, you will find that the 3rd and 4th authors do have affiliation information: they are affiliated with "IBM Haifa Labs., Haifa, IL, United States".

The 5th author, for whom the results returned by pybliometrics show no affiliation, is listed on the article's Scopus page as affiliated with "Evidence S.r.l., Pisa, Italy". So the 5th author also has an affiliation on Scopus.

Michael-E-Rose commented 4 years ago

There are multiple things here: