ropensci / rentrez

talk with NCBI entrez using R
https://docs.ropensci.org/rentrez

HTTP failure 414, the request is too large for entrez_fetch #168

Open thdiakon opened 3 years ago

thdiakon commented 3 years ago

Hi!

I am having trouble with large queries to PubMed.

The problem is as follows: I use entrez_fetch to retrieve a parsed XML document, which I need in order to extract abstracts and MeSH terms from PubMed.

library(rentrez)
library(XML)

# `filtered` is my vector of PMIDs
fetch.pubmed <- entrez_fetch(db = "pubmed", id = filtered, rettype = "xml", parsed = TRUE)

Abstracts_Big <- xpathApply(fetch.pubmed, '//PubmedArticle//Article', function(x)
    xmlValue(xmlChildren(x)$Abstract))

It works like a charm if my filtered vector contains fewer than 300 PMIDs but, as other people have also mentioned, it exits with an error when the vector exceeds that limit:

Error in entrez_check(response) : HTTP failure 414, the request is too large. For large requests, try using web history as described in the rentrez tutorial

I have tried splitting the IDs into chunks (e.g. http://www.nagraj.net/notes/chunking), but my problem is that I need a single parsed XML document to extract the information from, and neither splitting nor merging the parsed XML documents worked for me. I would appreciate any ideas for fixing this issue.

Thank you in advance,
Thodoros
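
One possible way around needing a single parsed document is to run the extraction on each chunk and concatenate the extracted values instead of merging the XML. The following is a minimal sketch of that idea, not a tested fix for this issue. The helper names (chunk_ids, extract_abstracts, fetch_abstracts_chunked), the `fetch` argument, and the chunk size of 200 are all assumptions made for illustration; only entrez_fetch, xpathSApply, xmlChildren and xmlValue come from rentrez and XML.

```r
library(rentrez)
library(XML)

# Hypothetical helper: split a vector of PMIDs into consecutive chunks
# of at most `size` ids. The default of 200 is an assumed value chosen
# to stay below the roughly 300-id limit reported above.
chunk_ids <- function(ids, size = 200) {
  split(ids, ceiling(seq_along(ids) / size))
}

# Hypothetical helper: return one abstract string per <Article> node of a
# parsed PubMed XML document. Articles without an <Abstract> element give
# NA, so the result stays aligned with the articles in the document.
extract_abstracts <- function(doc) {
  xpathSApply(doc, "//PubmedArticle//Article", function(node) {
    abstract <- xmlChildren(node)$Abstract
    if (is.null(abstract)) NA_character_ else xmlValue(abstract)
  })
}

# Hypothetical helper: fetch each chunk separately, extract the abstracts
# from that chunk, and concatenate the per-chunk results into one vector.
# `fetch` defaults to entrez_fetch and is a parameter only so the function
# can be exercised without network access.
fetch_abstracts_chunked <- function(ids, size = 200, fetch = entrez_fetch) {
  chunks <- chunk_ids(ids, size)
  per_chunk <- lapply(chunks, function(chunk) {
    doc <- fetch(db = "pubmed", id = chunk, rettype = "xml", parsed = TRUE)
    extract_abstracts(doc)
  })
  unlist(per_chunk, use.names = FALSE)
}
```

With the vector from the post, the call would be `Abstracts_Big <- fetch_abstracts_chunked(filtered)`. The same pattern should carry over to MeSH terms by swapping in a different extraction function, since only the extracted values are combined and the parsed documents never need to be merged.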