Closed gadepallivs closed 8 years ago
This is not really a problem with rentrez, just a property of NCBI records and R objects.
In this case, the pubtype field is variably-sized:
sapply(pubrecord.table[4,], length)
26287849 25979833 25667274 25430497 24968756 24846037 24296758 24281417
       2        2        1        2        1        3        1        2
24128713 24055406 23489023
       1        2        2
When you try to write the matrix, R represents the vectors the way you'd type them in (c(..., ...)), which adds a comma that breaks the CSV format.
In this case, you can collapse the vectors:
pubrecord.table[4,] <- sapply(pubrecord.table[4,], paste, collapse=" & ")
and unlist each matrix row to allow them to be written out
f <- tempfile()
write.csv( apply(pubrecord.table, 1, unlist), f)
re_read <- read.csv(f)
re_read$pmcrefcount
[1] 0 1 3 2 1 26 10 4 3 2 21
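If more than one field turns out to be variably sized, the same collapsing idea can be applied to every cell rather than to a single row. The following is only a minimal sketch: `toy.table` and its values are made up to stand in for `pubrecord.table`, and the " & " separator is the one used above. It first reports which fields are ragged, then builds a plain character matrix that can be written safely.

```r
# Toy list-matrix standing in for pubrecord.table (made-up values):
# fields in rows, PMIDs in columns, each cell holding a vector.
records <- list(
  "111" = list(title = "First paper",  pubtype = c("Journal Article", "Review"), pmcrefcount = 3),
  "222" = list(title = "Second paper", pubtype = "Journal Article",              pmcrefcount = 0),
  "333" = list(title = character(0),   pubtype = "Letter",                       pmcrefcount = 1)
)
toy.table <- sapply(records, function(r) r[c("title", "pubtype", "pmcrefcount")])

# Number of values in each cell, laid out like the table itself.
cell.lengths <- matrix(lengths(toy.table), nrow = nrow(toy.table),
                       dimnames = dimnames(toy.table))

# Fields where at least one record holds zero or several values.
ragged <- rowSums(cell.lengths != 1) > 0

# Collapse every cell to a single string; empty cells become "".
flat <- matrix(
  vapply(toy.table, function(cell) paste(unlist(cell), collapse = " & "),
         character(1)),
  nrow = nrow(toy.table), dimnames = dimnames(toy.table)
)

# Write one row per PMID and read it back to check the round trip.
f <- tempfile(fileext = ".csv")
write.csv(t(flat), f)
re_read <- read.csv(f, row.names = 1, stringsAsFactors = FALSE)
```

Because every cell is collapsed, this does not depend on knowing in advance which field is ragged for a given set of PMIDs, and empty fields end up as empty strings instead of breaking the write.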
Hi david,
The solution above works for some PMID queries, but for others I still get an error. Depending on the PMID, the variable-length fields turn up in title, journal name, pubtype or something else. I thought just removing the row number would fix the issue, but I get an error when trying to write the table in R Shiny:
pubrecord.table[,] <- sapply(pubrecord.table[,], paste, collapse=" & ")
Error in apply(pubrecord.reference, 1, unlist) : dim(X) must have a positive length
P.S. Why was the function extract_from_esummary designed to return a matrix? The data it extracts is a mix of character strings and numeric vectors, so a data frame would seem ideal for storing this kind of data, while a matrix is expected to store data of a single type?
I'm not sure what you are trying to do in the example, but it seems like it's hitting empty fields?
extract_from_esummary is really a wrapper around sapply. It doesn't return data.frames because I think most users don't expect data.frame columns to contain vectors, like:
df <- as.data.frame(t(pubrecord.table))
df$pubtype
$`26287849`
[1] "Journal Article" "Multicenter Study"
$`25979833`
[1] "Journal Article" "Randomized Controlled Trial"
$`25667274`
[1] "Journal Article"
.
.
.
Structured data like that would seem to fit a list better than a data.frame, and you can get that by setting simplify=FALSE.
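As a rough illustration of the list form, here is a sketch that uses base `sapply(..., simplify = FALSE)` on made-up records as a stand-in for the esummary results; the record contents and the names `by_record`, `titles`, and `pubtypes` are assumptions for the example only. It keeps one list element per record, then builds a flat data.frame by collapsing only the field that is ragged.

```r
# Toy records standing in for parsed esummary results (made-up values).
records <- list(
  "111" = list(title = "First paper",  pubtype = c("Journal Article", "Review")),
  "222" = list(title = "Second paper", pubtype = "Journal Article")
)

# simplify = FALSE keeps one list element per record instead of a matrix.
by_record <- sapply(records, function(r) r[c("title", "pubtype")],
                    simplify = FALSE)

# Single-valued fields can be pulled out as plain vectors;
# ragged fields stay as a list of vectors.
titles   <- vapply(by_record, function(r) r$title, character(1))
pubtypes <- lapply(by_record, function(r) r$pubtype)

# One flat row per record, collapsing only the ragged field.
df <- data.frame(
  pmid    = names(by_record),
  title   = unname(titles),
  pubtype = vapply(pubtypes, paste, character(1), collapse = " & "),
  stringsAsFactors = FALSE
)
```

Keeping the list around means the individual publication types are still available (for example `pubtypes[["111"]]`), while `df` is safe to pass to `write.csv`.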
(Edited, noted the issue.) Hi David, I found that the issue was with empty abstract fields for some entries.
library(rentrez)
library(XML)
PM.ID <- c("26391251","26372702","26372699","26371045","26338018","26317919",
           "26315966","26301800","26301799","26258891")
# Fetch the full records for these PMIDs as parsed XML
fetch.pubmed <- entrez_fetch(db = "pubmed", id = PM.ID,
                             rettype = "xml", parsed = TRUE)
# Extract the abstract text from each article
abstracts <- xpathApply(fetch.pubmed, '//PubmedArticle//Article',
                        function(x) xmlValue(xmlChildren(x)$Abstract))
This results in NA for PMIDs where the abstract is empty. But when the table is rendered in R Shiny it fails to display: it just shows "Processing" and no table appears. I need to learn more about that. This is not related to the rentrez package. Thank you.
OK, good luck getting to the bottom of the shiny problem :)
Hi David, below is the example. I did not understand why title, fulljournalname and pubtype have text data extending into a second column.