muschellij2 / rscopus

Scopus Database API Interface to R

retrieve email address of corresponding author #42

Closed indradecastro closed 3 years ago

indradecastro commented 3 years ago

Thank you very much for your rscopus package, it's been very useful so far!

However, I am still struggling to use it to retrieve the corresponding author's email address.

This is one of the fields Scopus offers in the bibliographic information, inside the field "correspondence address". From the Scopus web interface I can download these correspondence addresses in a BibTeX file. In that file the address appears as a non-standard BibTeX field named "correspondence_address1" (see below an example from one of my publications), which is perfectly suitable for me :)

@ARTICLE{deCastro-Arrazola2018,
correspondence_address1={deCastro-Arrazola, I.; Department of Biogeography and Global Change, Spain; email: indra@mncn.csic.es},
}

However, the function affiliation_retrieval doesn't seem to help :( Any idea how to retrieve this email from within rscopus? Is it an intentional Elsevier limitation?

Thank you very very much in advance,

muschellij2 commented 3 years ago

Please provide a minimal example

-- Best, John

indradecastro commented 3 years ago

Hi John,

I do not understand what you mean by a minimal example in this case, as I am not facing an error. But I will provide as much information as possible so you understand what I am trying to retrieve.

This is an example (obtained directly from the Scopus web interface) of an article's details in .bib format. It includes a field called correspondence_address1 (last line of the entry, with the email address at the end of the line):

Scopus
EXPORT DATE: 23 September 2021

@ARTICLE{deCastro-Arrazola2018,
author={deCastro-Arrazola, I. and Hortal, J. and Moretti, M. and Sánchez-Piñero, F.},
title={Spatial and temporal variations of aridity shape dung beetle assemblages towards the Sahara desert},
journal={PeerJ},
year={2018},
volume={2018},
number={9},
doi={10.7717/peerj.5210},
art_number={e5210},
url={https://www.scopus.com/inward/record.uri?eid=2-s2.0-85054621776&doi=10.7717%2fpeerj.5210&partnerID=40&md5=7b8c41a68fd579636d060882527672ea},
affiliation={Department of Biogeography and Global Change, Museo Nacional de Ciencias Naturales (MNCN-CSIC), Madrid, Spain; Departamento de Zoología, Facultad de Ciencias, Universidad de Granada, Granada, Spain; Department of Ecology, Instituto de Ciências Biologicas, Universidade Federal de Goiás, Goiânia, Brazil; Biodiversity and Conservation Biology, Swiss Federal Research Institute WSL, Birmensdorf, Switzerland},
correspondence_address1={deCastro-Arrazola, I.; Department of Biogeography and Global Change, Spain; email: indra@mncn.csic.es},
}
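Until the field is available through rscopus, one possible workaround is to parse the file exported from the Scopus web interface. The following is a minimal base R sketch, not part of rscopus: the helper name extract_correspondence_email is hypothetical, and it assumes the field always has the "...; email: address}" shape shown in the export above.

```r
# Hypothetical helper (not part of rscopus): pull the e-mail out of the
# correspondence_address1 field of a Scopus BibTeX export.
# Assumption: the field looks like
#   correspondence_address1={Name; Affiliation; email: someone@example.org},
extract_correspondence_email <- function(bib_lines) {
  # Keep only the lines that carry the correspondence field
  field_lines <- grep("^[[:space:]]*correspondence_address1[[:space:]]*=",
                      bib_lines, value = TRUE)
  # Match "email:" plus everything up to the next ';', '}' or whitespace
  hits <- regmatches(
    field_lines,
    regexpr("email:[[:space:]]*[^;}[:space:]]+", field_lines)
  )
  # Drop the "email:" label so only the address remains
  trimws(sub("^email:[[:space:]]*", "", hits))
}

# Lines copied from the export shown in this thread
bib_lines <- c(
  "@ARTICLE{deCastro-Arrazola2018,",
  "doi={10.7717/peerj.5210},",
  "correspondence_address1={deCastro-Arrazola, I.; Department of Biogeography and Global Change, Spain; email: indra@mncn.csic.es},",
  "}"
)

emails <- extract_correspondence_email(bib_lines)
print(emails)
```

For a real export, bib_lines could come from readLines() on the downloaded file (the file name is whatever you saved it as). Entries without an email in the field are silently skipped, so the result can be shorter than the number of entries.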

This is my R code trying to retrieve the same data:

# SETTING UP
library(rscopus)
set_api_key("XXXXXX")

# QUERY SCOPUS
res <- scopus_search("DOI(10.7717/peerj.5210)", view="COMPLETE")
df <- gen_entries_to_df(res$entries, scrub = TRUE)

abs <- abstract_retrieval(df$df$eid, "eid")

# TRANSFORM results into bib format
cat(bibtex_core_data(abs))

This is the result I obtain with the R code above, which is missing the "correspondence_address1" field:

@article{deCastro-Arrazola2018Spatialdesert,
author = {Indradatta deCastro-Arrazola and Joaquín Hortal and Marco Moretti and Francisco Sánchez-Piñero},
address = {Universidade Federal de Goiás;CSIC - Museo Nacional de Ciencias Naturales (MNCN);Universidad de Granada, Facultad de Ciencias;Eidgenössische Forschungsanstalt für Wald, Schnee und Landschaft WSL},
title = {Spatial and temporal variations of aridity shape dung beetle assemblages towards the Sahara desert},
journal = {PeerJ},
year = {2018},
volume = {2018},
number = {9},
pages = {-},
doi = {10.7717/peerj.5210}
abstract = {Copyright 2018 deCastro-Arrazola et al.Background: Assemblage responses to environmental gradients are key to understand the general principles behind the assembly and functioning of communities. The spatially and temporally uneven distribution of water availability in drylands creates strong aridity gradients. While the effects of spatial...}

muschellij2 commented 3 years ago

The bibtex_core_data function processes the list output from the Scopus API, so "missing" here means the code does not add that field to the output (there is no one-to-one correspondence between the BibTeX that Scopus exports and what rscopus produces).
Looking further, it doesn't seem like this is included in the coredata of:

abs$content$`abstracts-retrieval-response`$coredata

If the information is in there, please send a pull request that changes https://github.com/muschellij2/rscopus/blob/master/R/bibtex_core_data.R.

muschellij2 commented 3 years ago

It seems to be located here (without the email):

x = abs$content$`abstracts-retrieval-response`
x$item$bibrecord$head$correspondence

Please send a Pull Request changing the code to incorporate this if you'd like.
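
To check whether a given element exists in the response before relying on it, a small lookup helper can be useful. This is only a sketch: pluck_path is a hypothetical helper, and mock_abs is a made-up object that mirrors just the nesting mentioned above (content, abstracts-retrieval-response, item, bibrecord, head, correspondence). Its leaf values are placeholders, not real API output.

```r
# Hypothetical helper: walk a nested list by name and return NULL
# as soon as any level of the path is missing.
pluck_path <- function(x, path) {
  for (key in path) {
    if (!is.list(x) || is.null(x[[key]])) {
      return(NULL)
    }
    x <- x[[key]]
  }
  x
}

# Mock object standing in for the result of abstract_retrieval().
# Only the nesting follows the thread; the leaf values are invented.
mock_abs <- list(content = list(`abstracts-retrieval-response` = list(
  coredata = list(placeholder = "made-up value"),
  item = list(bibrecord = list(head = list(
    correspondence = list(placeholder = "made-up value")
  )))
)))

response_path <- c("content", "abstracts-retrieval-response")

# Path pointed out above: present in the mock
correspondence <- pluck_path(
  mock_abs,
  c(response_path, "item", "bibrecord", "head", "correspondence")
)
str(correspondence)

# A path that does not exist yields NULL instead of an error
absent <- pluck_path(mock_abs, c(response_path, "coredata", "correspondence"))
is.null(absent)
```

Running the same lookup on a real abs object and inspecting the result with str() shows which sub-fields the API actually returned for that record.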

indradecastro commented 3 years ago

Thanks for the effort.

However, the title of the issue clearly states that I would like to retrieve the email address. As the issue is not solved, I would keep it open, or relabel it as an "enhancement".

If we discover that the Scopus API does not allow retrieving emails, this thread should state that clearly, so that future readers do not try in vain to retrieve emails with rscopus.

Anyway, thank you.