waldronlab / curatedMetagenomicDataCuration

Sample Metadata Curation for curatedMetagenomicData
https://waldronlab.io/curatedMetagenomicDataCuration/

PubMed Class Specification #25

Closed schifferl closed 3 years ago

schifferl commented 6 years ago

Overview

Write a method that obtains citation information from PubMed using the RISmed package. The method, PubMed, should return a list of three data.frame objects containing citation information. Given only a PMID, the method should return the tables shown below.

An example of what a call to the PubMed method should look like:

pasolli <- PubMed(PMID = "29088129")

Components

Journal Information

The first of the three data.frame objects returned in the list should, at a minimum, contain the following fields.

| PMID | title | journal | volume | pages | day | month | year | DOI | abstract |
|---|---|---|---|---|---|---|---|---|---|
| 29088129 | Accessible, curated... | Nature Methods | 14 | 1023 EP | 31 | 10 | 2017 | 10.1038/nmeth.4468 | ... |

Author Information

The second of the three data.frame objects returned in the list should provide details about the authors of the publication. Specifically, an order field should be provided to denote the order of authorship as it appears in the publication.

| order | first_name | last_name | middle_initial |
|---|---|---|---|
| 1 | Edoardo | Pasolli | |
| 2 | Lucas | Schiffer | |
| 3 | ... | ... | ... |

Affiliation Information

The third of the three data.frame objects returned in the list should provide details about author affiliations. These affiliations should be related to the author information by the order field.

| order | affiliation |
|---|---|
| 1 | Centre for Integrative Biology, University of Trento, Trento, Italy |
| 2 | Graduate School of Public Health and Health Policy, City University of New York, New York, New York, USA |
| 3 | Institute for Implementation Science and Population Health, City University of New York, New York, New York, USA |
| 4 | ... |
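
As a rough illustration of the return value described above, the three tables can be held in a named list and related through the order field. This is only a sketch under stated assumptions: the values are copied by hand from this issue rather than fetched with RISmed, the abstract column is left out for brevity, and the affiliation table is assumed to carry one row per author and affiliation pair, keyed by author order. The names citation_tables and author_affiliations are hypothetical.

```r
# Values are typed in from the tables in this issue; nothing is fetched
# from PubMed here, and the abstract column is omitted for brevity.
journal <- data.frame(
  PMID = "29088129",
  title = "Accessible, curated metagenomic data through experimenthub.",
  journal = "Nature Methods",
  volume = "14",
  pages = "1023 EP",
  day = 31L,
  month = 10L,
  year = 2017L,
  DOI = "10.1038/nmeth.4468",
  stringsAsFactors = FALSE
)

authors <- data.frame(
  order = 1:2,
  first_name = c("Edoardo", "Lucas"),
  last_name = c("Pasolli", "Schiffer"),
  middle_initial = c(NA_character_, NA_character_),
  stringsAsFactors = FALSE
)

# Assumption: one row per (author, affiliation) pair, so an author with two
# affiliations appears twice under the same order value.
affiliations <- data.frame(
  order = c(1L, 2L, 2L),
  affiliation = c(
    "Centre for Integrative Biology, University of Trento, Trento, Italy",
    "Graduate School of Public Health and Health Policy, City University of New York, New York, New York, USA",
    "Institute for Implementation Science and Population Health, City University of New York, New York, New York, USA"
  ),
  stringsAsFactors = FALSE
)

# The list of three data.frame objects that a call such as PubMed(PMID = ...)
# is specified to return.
citation_tables <- list(
  journal = journal,
  authors = authors,
  affiliations = affiliations
)

# Relate affiliations to authors through the shared order field.
author_affiliations <- merge(authors, affiliations, by = "order")
```

Keying the affiliation table on author order keeps the join a plain merge, at the cost of repeating an affiliation once for every author who shares it.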

Show Method

When a PubMed class object is printed at the R console, as follows, its show method should display the relevant details of the object in an orderly fashion.

pasolli

For example, the call above should produce the output that follows. As seen, the output should be limited to line lengths of 80 or fewer characters and display the information in logical order.

## class: PubMed
##
## Accessible, curated metagenomic data through experimenthub. Nature Methods, 
## 14:1023 EP, 10 2017.
##
## E. Pasolli^1, L. Schiffer^2,3, P. Manghi^1, A. Renson^2,3, V. Obenchain,
## D. T. Truong, F. Beghini, F. Malik, M. Ramos, J. B. Dowd, C. Huttenhower,
## M. Morgan, N. Segata, and L. Waldron.
##
## 1. Centre for Integrative Biology, University of Trento, Trento, Italy
## 2. Graduate School of Public Health and Health Policy, City University of
##    New York, New York, New York, USA
## 3. Institute for Implementation Science and Population Health, City
##    University of New York, New York, New York, USA
##
## Further affiliations are available. Use the 'affiliation' method.
##
## We present curatedMetagenomicData, a Bioconductor and command-line interface
## to thousands of metagenomic profiles from the Human Microbiome Project and
## other publicly available datasets, and ExperimentHub, a platform for
## convenient cloud-based distribution of data to the R desktop. The resource
## provides standardized per-participant metadata linked to bacterial, fungal,
## archaeal, and viral taxonomic abundances, as well as quantitative metabolic
## functional profiles. The datasets can be immediately analyzed in R or other
## software with a minimum of bioinformatic expertise and no preprocessing of
## data. We demonstrate identification of taxonomic/functional correlations, an
## investigation of gut "enterotypes", and a comparison of the accuracy of
## disease classification from different data types. These documented analyses
## can be reproduced efficiently on a laptop, without the barriers of working
## with large-scale, raw sequencing data. The building and expansion of
## curatedMetagenomicData is based entirely on open source software and
## pipelines, to facilitate the addition of new microbiome datasets and methods.
##
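
A show method along these lines could be sketched as follows. It is not the implementation that was later committed: the slot names, the helper functions wrap80, format_author, and join_authors, and the two-author example data are all assumptions made for illustration. The sketch leaves out the affiliation superscripts and the "Further affiliations" notice, and it relies on strwrap, which keeps each line shorter than its width argument.

```r
library(methods)

# Hypothetical class layout: one slot per table described in this issue.
setClass("PubMed", representation(
  journal = "data.frame",
  authors = "data.frame",
  affiliations = "data.frame"
))

# Wrap text so that every line is shorter than 80 characters.
wrap80 <- function(x, exdent = 0) {
  paste(strwrap(x, width = 80, exdent = exdent), collapse = "\n")
}

# "Edoardo", NA, "Pasolli" becomes "E. Pasolli"; a middle initial is kept.
format_author <- function(first_name, middle_initial, last_name) {
  first <- paste0(substr(first_name, 1, 1), ".")
  has_middle <- !(is.na(middle_initial) | middle_initial == "")
  middle <- ifelse(has_middle, paste0(" ", middle_initial, "."), "")
  paste0(first, middle, " ", last_name)
}

# Join names as "A and B" or "A, B, and C".
join_authors <- function(x) {
  n <- length(x)
  if (n <= 1) return(x)
  if (n == 2) return(paste(x, collapse = " and "))
  paste0(paste(x[-n], collapse = ", "), ", and ", x[n])
}

setMethod("show", "PubMed", function(object) {
  j <- object@journal
  a <- object@authors[order(object@authors$order), ]
  cat("class: PubMed\n\n")
  cat(wrap80(sprintf("%s %s, %s:%s, %s %s.", j$title, j$journal, j$volume,
                     j$pages, j$month, j$year)), "\n\n", sep = "")
  author_names <- format_author(a$first_name, a$middle_initial, a$last_name)
  cat(wrap80(paste0(join_authors(author_names), ".")), "\n\n", sep = "")
  aff <- unique(object@affiliations$affiliation)
  for (i in seq_along(aff)) {
    # Continuation lines are indented to line up under the affiliation text.
    cat(wrap80(paste0(i, ". ", aff[i]), exdent = 3), "\n", sep = "")
  }
  cat("\n", wrap80(j$abstract), "\n", sep = "")
})

# Example object with values typed in from this issue (first two authors and
# the first sentence of the abstract only).
pasolli <- new("PubMed",
  journal = data.frame(
    title = "Accessible, curated metagenomic data through experimenthub.",
    journal = "Nature Methods", volume = "14", pages = "1023 EP",
    month = 10L, year = 2017L,
    abstract = paste(
      "We present curatedMetagenomicData, a Bioconductor and command-line",
      "interface to thousands of metagenomic profiles from the Human",
      "Microbiome Project and other publicly available datasets, and",
      "ExperimentHub, a platform for convenient cloud-based distribution of",
      "data to the R desktop."
    ),
    stringsAsFactors = FALSE
  ),
  authors = data.frame(
    order = 1:2,
    first_name = c("Edoardo", "Lucas"),
    last_name = c("Pasolli", "Schiffer"),
    middle_initial = c(NA_character_, NA_character_),
    stringsAsFactors = FALSE
  ),
  affiliations = data.frame(
    order = c(1L, 2L, 2L),
    affiliation = c(
      "Centre for Integrative Biology, University of Trento, Trento, Italy",
      "Graduate School of Public Health and Health Policy, City University of New York, New York, New York, USA",
      "Institute for Implementation Science and Population Health, City University of New York, New York, New York, USA"
    ),
    stringsAsFactors = FALSE
  )
)

pasolli  # auto-printing at the console dispatches to the show method
```

Wrapping each block separately with strwrap, rather than wrapping the whole output at once, keeps the blank lines between the citation, author list, affiliations, and abstract intact.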
cmirzayi commented 4 years ago

I'm turning my attention to this next.

cmirzayi commented 4 years ago

Added in https://github.com/waldronlab/curatedMetagenomicDataCuration/commit/fc2a4d71b3a6fbf01099de7acc4482cb1239d831

PubMed() and PubMed class S4 objects are now added. There are some outstanding issues to be resolved:

  1. Associating affiliations with authors seems to be difficult if not impossible using RISmed. It seems to just return a list of all affiliations with no easy way to associate them with a given author.
  2. Many articles have multiple dates: the epub date, the pub date, and the date the article became available on PubMed. We will need to choose one of these dates (or return all of them).
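
For the second issue, one possible approach is to keep every candidate date and pick a single one by a fixed preference order, falling back when a date is missing. The sketch below is only an assumption about how that could look: choose_date, the type labels, the preference order, and the example dates are all hypothetical and are not taken from RISmed or from the commit above.

```r
# Hypothetical candidate dates for one article; NA marks a date that PubMed
# did not supply. The values are illustrative only.
dates <- data.frame(
  type = c("epub", "pub", "pubmed"),
  date = as.Date(c("2017-10-31", NA, "2017-11-01")),
  stringsAsFactors = FALSE
)

# Return the first available date according to a preference order.
# Types that are not listed in `preference` are considered last.
choose_date <- function(dates, preference = c("pub", "epub", "pubmed")) {
  ranked <- dates[order(match(dates$type, preference)), ]
  ranked <- ranked[!is.na(ranked$date), ]
  if (nrow(ranked) == 0) return(as.Date(NA))
  ranked$date[1]
}

# The pub date is missing here, so the epub date is chosen instead.
chosen <- choose_date(dates)
```

Keeping the full dates table alongside the chosen value would cover the alternative of returning all of the dates.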

lwaldron commented 3 years ago

Works well enough, closing.