ropensci / rentrez

talk with NCBI entrez using R
https://docs.ropensci.org/rentrez

entrez_global_query #61

Closed gadepallivs closed 8 years ago

gadepallivs commented 9 years ago

Hi David, I am trying to run a PMID through a global search, i.e. a search across all databases available at NCBI: http://www.ncbi.nlm.nih.gov/gquery/?term=26287849. The reasoning is this: when a researcher publishes an article about a gene in which they found a variant by NGS analysis, I expect they will have submitted the sequencing data to SRA and, depending on the type of variant, the variant information to dbVar, SNP, or ClinVar. So my understanding is that the PMID should relate to the various NCBI databases in one way or another. I am therefore trying to do the same thing with the functions in rentrez. The output I get from the entrez_global_query function does not include ClinVar, MedGen, and a few others. I am not sure whether I am missing something.

library(rentrez)

PubID.global <- entrez_global_query(term = "26287849")

# This returns a result, but the hits for the respective databases are NULL.
PubID.hits <- entrez_link(db = "all", id = "26287849", dbfrom = "pubmed")

Thank you for your help.

dwinter commented 9 years ago

Hi Monty,

For global queries, rentrez just returns what the NCBI gives us; if they leave out some databases, there's no way around that.

Thankfully, there is a better way :smile:. You can use entrez_link with dbfrom="pubmed" and db="all":

linked_all <- entrez_link(dbfrom="pubmed", id=26287849, db="all")
linked_all$links
elink result with information from 18 databases:
 [1] pubmed_gene                pubmed_gene_rif           
 [3] pubmed_geoprofiles         pubmed_medgen             
 [5] pubmed_nuccore_refseq      pubmed_nuccore_weighted   
 [7] pubmed_nucleotide_refseq   pubmed_pccompound_mesh    
 [9] pubmed_protein_refseq      pubmed_protein_weighted   
[11] pubmed_pubmed              pubmed_pubmed_alsoviewed  
[13] pubmed_pubmed_combined     pubmed_pubmed_five        
[15] pubmed_pubmed_reviews      pubmed_pubmed_reviews_five
[17] pubmed_taxonomy_entrez     pubmed_unigene

Each list element is a set of linked IDs from the corresponding database:

linked_all$links$pubmed_medgen
 [1] "287122" "224714" "163902" "9944"   "5568"   "850989" "850987" "830714"
 [9] "816486" "808161" "808135" "506452" "505377" "451981" "430218" "216027"
[17] "195765" "181539" "147065" "57450"  "40104"  "96929"  "11200"  "10294" 
[25] "7400"   "7399"   "2735"   "395223" "230896" "830701" "808167" "807559"
[33] "797334" "786450" "489430" "472094" "450439" "436865" "436446" "394209"
[41] "390810" "381473" "373146" "361808" "347898" "334280" "82696"  "78693" 
[49] "6642"  
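
To see at a glance which link sets actually contain records, the per-database counts can be tallied with base R's lengths(). The following is only a rough sketch and not part of the original reply: the helper name count_links is hypothetical, and the toy list stands in for linked_all$links so that no network call is needed. It reuses the first three MedGen IDs printed above, plus a made-up empty element named example_empty.

```r
# Hypothetical helper: count linked IDs per link set, largest first.
# `links` is assumed to be a named list of character vectors,
# shaped like the `links` element printed above.
count_links <- function(links) {
  counts <- lengths(links)          # number of IDs in each list element
  sort(counts, decreasing = TRUE)   # names are kept while sorting
}

# Toy stand-in for linked_all$links (no request is sent to the NCBI).
# The MedGen IDs are the first three from the output above;
# `example_empty` is an invented placeholder for a link set with no IDs.
toy_links <- list(
  example_empty = character(0),
  pubmed_medgen = c("287122", "224714", "163902")
)

link_counts <- count_links(toy_links)
print(link_counts)
```

With a live result, count_links(linked_all$links) should be the equivalent call, assuming the links element behaves like a plain named list.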

Check out the new vignette for more info on this function.