Open cstubben opened 8 years ago
europepmc::epmc_details() parses the resulttype=core format. For example, to get MeSH terms for more than one record, try:
lapply(c("25730202", "25891958"), function(x) europepmc::epmc_details(x)$mesh_topic)
## [[1]]
## majorTopic_YN descriptorName
## 1 N Chlamydomonas reinhardtii
## 2 N Amino Acid Sequence
## 3 N Phenotype
## 4 Y Mutation
## 5 N Polymorphism, Single Nucleotide
## 6 N Genome
## 7 N Light
## 8 N High-Throughput Nucleotide Sequencing
##
## [[2]]
## majorTopic_YN descriptorName
## 1 N Vacuoles
## 2 N Plants, Genetically Modified
## 3 N Arabidopsis
## 4 N Petunia
## 5 N Seeds
## 6 N Proanthocyanidins
## 7 N Proton-Translocating ATPases
## 8 N Arabidopsis Proteins
## 9 N Genetic Complementation Test
## 10 N Gene Expression Regulation, Plant
## 11 N Biological Transport
## 12 N Mutation
## 13 N Adenosine Triphosphatases
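To keep track of which terms belong to which record, the per-record results can be stacked into one data frame with an id column. This is only a minimal sketch: the small data frames below are hand-built stand-ins for epmc_details(x)$mesh_topic (values copied from the output above), and the names mesh_by_id and mesh_all are made up for illustration.

```r
# Stand-ins for epmc_details(x)$mesh_topic, keyed by record id (assumption:
# the real call returns a data frame with these two columns, as printed above)
mesh_by_id <- list(
  "25730202" = data.frame(
    majorTopic_YN = c("N", "Y"),
    descriptorName = c("Phenotype", "Mutation"),
    stringsAsFactors = FALSE
  ),
  "25891958" = data.frame(
    majorTopic_YN = c("N", "N"),
    descriptorName = c("Vacuoles", "Petunia"),
    stringsAsFactors = FALSE
  )
)

# Prepend the record id to each data frame, then stack them row-wise
mesh_all <- do.call(rbind, lapply(names(mesh_by_id), function(id) {
  data.frame(id = id, mesh_by_id[[id]], stringsAsFactors = FALSE)
}))
rownames(mesh_all) <- NULL
```

With live data, mesh_by_id would be built by calling epmc_details() once per id, so this still issues one request per record.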
The function uses the JSON output because I find it easier to parse. In addition to MeSH, it returns:
Please let me know if I have missed something. I could try to support "raw" outputs in the upcoming version, so everyone could apply alternative parsers.
I don't want to include the core format in epmc_search() because it is deeply nested and therefore hard to parse. It would also require more memory.
That will work for a few articles, but I'd like MeSH terms for hundreds of articles, and downloading them one at a time will take too long. I'd like the option to get raw output so users can create their own parsers.
Maybe rename the id_list option to resulttype, accepting lite (default), idlist and core, and add a new format option (parsed by default, except perhaps for core, but you could also return DC, JSON or XML):
epmc_search("title:Waddlia") # return data.frame
epmc_search("title:Waddlia", resulttype="core", format="xml")
Great. Will try to implement it for the upcoming version.
There is now an option that returns the core format in list form:
my_list <- epmc_search('Gabi-Kat', output = 'raw', limit = 10)
# display the structure for one list element
str(my_list[[10]])
#> List of 40
#> $ id : chr "27018849"
#> $ source : chr "MED"
#> $ pmid : chr "27018849"
#> $ pmcid : chr "PMC4883958"
#> $ doi : chr "10.1080/15592324.2016.1161876"
#> $ title : chr "Interaction between vitamin B6 metabolism, nitrogen metabolism and autoimmunity."
#> $ authorString : chr "Colinas M, Fitzpatrick TB."
#> $ authorList :List of 1
#> ..$ author:List of 2
#> .. ..$ :List of 6
#> .. .. ..$ fullName : chr "Colinas M"
#> .. .. ..$ firstName : chr "Maite"
#> .. .. ..$ lastName : chr "Colinas"
#> .. .. ..$ initials : chr "M"
#> .. .. ..$ authorId :List of 2
#> .. .. .. ..$ type : chr "ORCID"
#> .. .. .. ..$ value: chr "0000-0001-7053-2983"
#> .. .. ..$ affiliation: chr "a Department of Botany and Plant Biology , University of Geneva , Geneva , Switzerland."
#> .. ..$ :List of 5
#> .. .. ..$ fullName : chr "Fitzpatrick TB"
#> .. .. ..$ firstName : chr "Teresa B"
#> .. .. ..$ lastName : chr "Fitzpatrick"
#> .. .. ..$ initials : chr "TB"
#> .. .. ..$ affiliation: chr "a Department of Botany and Plant Biology , University of Geneva , Geneva , Switzerland."
#> $ authorIdList :List of 1
#> ..$ authorId:List of 1
#> .. ..$ :List of 2
#> .. .. ..$ type : chr "ORCID"
#> .. .. ..$ value: chr "0000-0001-7053-2983"
#> $ journalInfo :List of 8
#> ..$ issue : chr "4"
#> ..$ volume : chr "11"
#> ..$ journalIssueId : int 2439536
#> ..$ dateOfPublication : chr "2016 "
#> ..$ monthOfPublication : int 0
#> ..$ yearOfPublication : int 2016
#> ..$ printPublicationDate: chr "2016-01-01"
#> ..$ journal :List of 6
#> .. ..$ title : chr "Plant signaling & behavior"
#> .. ..$ medlineAbbreviation: chr "Plant Signal Behav"
#> .. ..$ isoabbreviation : chr "Plant Signal Behav"
#> .. ..$ issn : chr "1559-2316"
#> .. ..$ nlmid : chr "101291431"
#> .. ..$ essn : chr "1559-2324"
#> $ pubYear : chr "2016"
#> $ pageInfo : chr "e1161876"
#> $ abstractText : chr "The essential micronutrient vitamin B6 is best known in its enzymatic cofactor form, pyridoxal 5'-phosphate (PLP). However, vit"| __truncated__
#> $ affiliation : chr "a Department of Botany and Plant Biology , University of Geneva , Geneva , Switzerland."
#> $ language : chr "eng"
#> $ pubModel : chr "Print"
#> $ pubTypeList :List of 1
#> ..$ pubType: chr [1:2] "Journal Article" "Research Support, Non-U.S. Gov't"
#> $ meshHeadingList :List of 1
#> ..$ meshHeading:List of 9
#> .. ..$ :List of 3
#> .. .. ..$ majorTopic_YN : chr "N"
#> .. .. ..$ descriptorName : chr "Arabidopsis"
#> .. .. ..$ meshQualifierList:List of 1
#> .. .. .. ..$ meshQualifier:List of 2
#> .. .. .. .. ..$ :List of 3
#> .. .. .. .. .. ..$ abbreviation : chr "GE"
#> .. .. .. .. .. ..$ qualifierName: chr "genetics"
#> .. .. .. .. .. ..$ majorTopic_YN: chr "N"
#> .. .. .. .. ..$ :List of 3
#> .. .. .. .. .. ..$ abbreviation : chr "IM"
#> .. .. .. .. .. ..$ qualifierName: chr "immunology"
#> .. .. .. .. .. ..$ majorTopic_YN: chr "N"
#> .. ..$ :List of 3
#> .. .. ..$ majorTopic_YN : chr "N"
#> .. .. ..$ descriptorName : chr "Nitrogen"
#> .. .. ..$ meshQualifierList:List of 1
#> .. .. .. ..$ meshQualifier:List of 1
#> .. .. .. .. ..$ :List of 3
#> .. .. .. .. .. ..$ abbreviation : chr "ME"
#> .. .. .. .. .. ..$ qualifierName: chr "metabolism"
#> .. .. .. .. .. ..$ majorTopic_YN: chr "Y"
#> .. ..$ :List of 3
#> .. .. ..$ majorTopic_YN : chr "N"
#> .. .. ..$ descriptorName : chr "Vitamin B 6"
#> .. .. ..$ meshQualifierList:List of 1
#> .. .. .. ..$ meshQualifier:List of 1
#> .. .. .. .. ..$ :List of 3
#> .. .. .. .. .. ..$ abbreviation : chr "ME"
#> .. .. .. .. .. ..$ qualifierName: chr "metabolism"
#> .. .. .. .. .. ..$ majorTopic_YN: chr "Y"
#> .. ..$ :List of 3
#> .. .. ..$ majorTopic_YN : chr "N"
#> .. .. ..$ descriptorName : chr "Arabidopsis Proteins"
#> .. .. ..$ meshQualifierList:List of 1
#> .. .. .. ..$ meshQualifier:List of 1
#> .. .. .. .. ..$ :List of 3
#> .. .. .. .. .. ..$ abbreviation : chr "ME"
#> .. .. .. .. .. ..$ qualifierName: chr "metabolism"
#> .. .. .. .. .. ..$ majorTopic_YN: chr "N"
#> .. ..$ :List of 2
#> .. .. ..$ majorTopic_YN : chr "N"
#> .. .. ..$ descriptorName: chr "Temperature"
#> .. ..$ :List of 2
#> .. .. ..$ majorTopic_YN : chr "Y"
#> .. .. ..$ descriptorName: chr "Autoimmunity"
#> .. ..$ :List of 2
#> .. .. ..$ majorTopic_YN : chr "N"
#> .. .. ..$ descriptorName: chr "Gene Expression Regulation, Plant"
#> .. ..$ :List of 2
#> .. .. ..$ majorTopic_YN : chr "N"
#> .. .. ..$ descriptorName: chr "Reproduction"
#> .. ..$ :List of 2
#> .. .. ..$ majorTopic_YN : chr "N"
#> .. .. ..$ descriptorName: chr "Phenotype"
#> $ keywordList :List of 1
#> ..$ keyword: chr [1:8] "Arabidopsis thaliana" "Autoimmunity" "plant defense" "Vitamin B6" ...
#> $ chemicalList :List of 1
#> ..$ chemical:List of 3
#> .. ..$ :List of 2
#> .. .. ..$ name : chr "Arabidopsis Proteins"
#> .. .. ..$ registryNumber: chr "0"
#> .. ..$ :List of 2
#> .. .. ..$ name : chr "Vitamin B 6"
#> .. .. ..$ registryNumber: chr "8059-24-3"
#> .. ..$ :List of 2
#> .. .. ..$ name : chr "Nitrogen"
#> .. .. ..$ registryNumber: chr "N762921K75"
#> $ subsetList :List of 1
#> ..$ subset:List of 1
#> .. ..$ :List of 2
#> .. .. ..$ code: chr "IM"
#> .. .. ..$ name: chr "Index Medicus"
#> $ fullTextUrlList :List of 1
#> ..$ fullTextUrl:List of 3
#> .. ..$ :List of 5
#> .. .. ..$ availability : chr "Free"
#> .. .. ..$ availabilityCode: chr "F"
#> .. .. ..$ documentStyle : chr "pdf"
#> .. .. ..$ site : chr "Europe_PMC"
#> .. .. ..$ url : chr "http://europepmc.org/articles/PMC4883958?pdf=render"
#> .. ..$ :List of 5
#> .. .. ..$ availability : chr "Free"
#> .. .. ..$ availabilityCode: chr "F"
#> .. .. ..$ documentStyle : chr "html"
#> .. .. ..$ site : chr "Europe_PMC"
#> .. .. ..$ url : chr "http://europepmc.org/articles/PMC4883958"
#> .. ..$ :List of 5
#> .. .. ..$ availability : chr "Subscription required"
#> .. .. ..$ availabilityCode: chr "S"
#> .. .. ..$ documentStyle : chr "doi"
#> .. .. ..$ site : chr "DOI"
#> .. .. ..$ url : chr "http://dx.doi.org/10.1080/15592324.2016.1161876"
#> $ isOpenAccess : chr "N"
#> $ inEPMC : chr "Y"
#> $ inPMC : chr "N"
#> $ hasPDF : chr "Y"
#> $ hasBook : chr "N"
#> $ hasSuppl : chr "N"
#> $ citedByCount : int 0
#> $ hasReferences : chr "Y"
#> $ hasTextMinedTerms : chr "Y"
#> $ hasDbCrossReferences : chr "N"
#> $ hasLabsLinks : chr "N"
#> $ epmcAuthMan : chr "N"
#> $ hasTMAccessionNumbers: chr "N"
#> $ dateOfCompletion : chr "2016-12-30"
#> $ dateOfCreation : chr "2016-05-11"
#> $ dateOfRevision : chr "2016-12-31"
#> $ firstPublicationDate : chr "2016-03-28"
#> $ embargoDate : chr "2016-09-28"
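Given that structure, pulling the MeSH descriptors out of each list element could look roughly like the sketch below. It is not part of europepmc: extract_mesh is a hypothetical helper, and record is a hand-built stand-in that mirrors only a few of the fields shown above, so the code runs without a network call.

```r
# Hand-built stand-in for one element of the raw list (subset of fields)
record <- list(
  id = "27018849",
  meshHeadingList = list(
    meshHeading = list(
      list(
        majorTopic_YN = "N",
        descriptorName = "Nitrogen",
        meshQualifierList = list(
          meshQualifier = list(
            list(abbreviation = "ME", qualifierName = "metabolism",
                 majorTopic_YN = "Y")
          )
        )
      ),
      list(majorTopic_YN = "Y", descriptorName = "Autoimmunity")
    )
  )
)

# Hypothetical helper: one row per MeSH descriptor, qualifiers ignored
extract_mesh <- function(rec) {
  headings <- rec$meshHeadingList$meshHeading
  if (is.null(headings)) return(NULL)  # record carries no MeSH terms
  data.frame(
    id = rec$id,
    descriptorName = vapply(headings, function(h) h$descriptorName, character(1)),
    majorTopic_YN = vapply(headings, function(h) h$majorTopic_YN, character(1)),
    stringsAsFactors = FALSE
  )
}

# Apply to every record and stack the results into one data frame
mesh <- do.call(rbind, lapply(list(record), extract_mesh))
```

With real results, list(record) would be replaced by the list returned from epmc_search(..., output = 'raw'), assuming every element follows the layout printed above.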
The core results have all the lite fields plus MeSH terms, abstracts and others. Have you considered parsing resulttype=core so users can get MeSH terms for hundreds of articles at once? I have started an XML parser to get some core fields that might help.