ropensci / rentrez

talk with NCBI entrez using R
https://docs.ropensci.org/rentrez

Multiple PMIDs #131

Closed agbarnett closed 5 years ago

agbarnett commented 5 years ago

I'm getting multiple PMIDs when searching for a single PMID, for example:

parse_pubmed_xml(entrez_fetch(db="pubmed", id='29743284', rettype="xml"))$pmid
[1] "29743284" "22116114" "19123933" "20091554" "27544377" "21466679" "10501796" "19657121" "12511679" "8560339"  "17043045" "25923524"
[13] "15507789" "19657122" "22773335" "8237484"  "22019853" "4139420"  "24325929"    "16923978" "19465306" "8950879"  "18165753" "20701962"
[25] "20190623"

The first PMID is the right one; the others seem unrelated. I can easily work around this by selecting just the first PMID, but is there a bigger issue with the related data?

dwinter commented 5 years ago

Thanks for filing this issue, @agbarnett. It looks like this is a bug in the way parse_pubmed_xml parses the record: it looks for any PMID field, not specifically the one at the top level of the record.

I will get a fix and a test in within the next little while. I think you can safely ignore everything after the first one for now (as the citation info with the correct PMID is at the top of the file).
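
To illustrate the difference described above, here is a minimal sketch using the XML package. The record is a simplified, hand-written stand-in rather than real efetch output, and the XPath expressions are assumptions about how such a lookup might be written, not the actual code inside parse_pubmed_xml. A search for any PMID element also picks up the PMIDs of cited articles nested further down the record, while an absolute path to the citation's own PMID returns only one value.

```r
library(XML)

# Simplified, hypothetical record: the article's own PMID sits directly
# under MedlineCitation, and cited articles carry their own nested PMIDs.
xml_text <- '<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID Version="1">29743284</PMID>
      <CommentsCorrectionsList>
        <CommentsCorrections RefType="Cites">
          <PMID Version="1">22116114</PMID>
        </CommentsCorrections>
        <CommentsCorrections RefType="Cites">
          <PMID Version="1">19123933</PMID>
        </CommentsCorrections>
      </CommentsCorrectionsList>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>'

doc <- xmlParse(xml_text, asText = TRUE)

# Matches every PMID element anywhere in the document, including
# the nested ones that belong to cited articles.
all_pmids <- xpathSApply(doc, "//PMID", xmlValue)

# Matches only the PMID that is a direct child of MedlineCitation.
top_pmid <- xpathSApply(
  doc,
  "/PubmedArticleSet/PubmedArticle/MedlineCitation/PMID",
  xmlValue
)

print(all_pmids)
print(top_pmid)
```

Until a fix lands, taking the first element of the returned pmid vector (the workaround mentioned in the original report) should give the same value as the top-level lookup in this sketch, since the record's own PMID comes first in document order.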