adourian1 closed this issue 5 years ago
I think this is a similar issue to #11 - I am digging into the problem and am trying to reproduce it on my environment.
Maybe I'm wrong and saying something silly, but I ran into similar problems in the past with another package. In that case the problems originated from a PubMed server timeout. Could this be the underlying problem here? I can't verify it now, but previously I identified the origin of the error by using a query that produced very few results (fewer than 10) and confirming that under those conditions the error disappeared. I hope this is helpful and doesn't muddle your thinking!
That is precisely the error, but I tried to bake in some stuff that could mitigate that to some extent. It also seems that people are getting kicked off more frequently... so something is going on.
Take a look at the documentation of the "easyPubMed" package. If I remember correctly, it includes a Sys.sleep() call to avoid the timeout problem. Maybe that can help solve the problem in your library.
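For reference, the Sys.sleep() workaround can be sketched in a few lines. This is illustrative only: throttled_map() and its arguments are hypothetical names, not easyPubMed's or Adjutant's actual API.

```r
# Minimal sketch of the Sys.sleep() workaround: pause between successive
# E-utilities requests so the server's rate limit (~3 requests/second
# without an API key) is never exceeded.
# throttled_map() is a hypothetical helper, for illustration only.
throttled_map <- function(xs, fetch, delay = 0.4) {
  lapply(xs, function(x) {
    Sys.sleep(delay)  # wait before each request to stay under the limit
    tryCatch(fetch(x), error = function(e) NA)  # skip failures, don't crash
  })
}

# e.g. throttled_map(pmid_batches,
#                    function(ids) readLines(some_efetch_url(ids), warn = FALSE))
```

Wrapping each request in tryCatch() also means a single failed fetch yields NA for that batch instead of aborting the whole run.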
I also think this is a very valuable package, but it isn't working for me unless I select 10 or fewer results. Among other messages, I get:
Warning in file(con, "r") :
cannot open URL 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=NA,30810520&retmode=xml': HTTP status was '429 Too Many Requests'
According to the post below, NCBI imposed stricter limits on the number of queries made without an API key starting in May 2018: https://ncbiinsights.ncbi.nlm.nih.gov/2017/11/02/new-api-keys-for-the-e-utilities/
Perhaps Adjutant was tested before May 2018, and the behaviour of the web server and API has changed since then?
Ah! This helps explain why I can't reproduce the error - I am able to run queries bigger than 10 results on my end, but I don't run queries that often, so I didn't seem to be hitting the wall. I'll look into incorporating the API key more seamlessly for users; hopefully this will help.
Updated Adjutant to allow for NCBI API Key, also added some more throttles to reduce the requests hitting the server.
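For anyone wiring this up themselves, the mechanics are small: the NCBI API key is just an extra api_key= query parameter on each E-utilities request, and (per the NCBI post linked above) it raises the allowance from 3 to 10 requests per second. A sketch, where build_efetch_url() is a hypothetical helper rather than Adjutant's actual internals:

```r
# Hypothetical helper (not Adjutant's real code) showing where the NCBI
# API key goes: it is appended as an api_key= query parameter.
build_efetch_url <- function(pmids, api_key = NULL) {
  base <- "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"
  url <- sprintf("%s?db=pubmed&id=%s&retmode=xml",
                 base, paste(pmids, collapse = ","))
  if (!is.null(api_key)) {
    url <- paste0(url, "&api_key=", api_key)  # lifts limit to 10 req/sec
  }
  url
}
```

Combined with the Sys.sleep() throttling suggested earlier, this should keep requests well under the server's limit.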
Many thanks! The API key and other throttles appear to have fixed the original connection issue. However, now I'm running into issue #11 upon clustering (https://github.com/amcrisan/Adjutant/issues/11)...
That error is a bit harder to address because it is difficult for me to reproduce - do you have a query that I can try out? Maybe there is something specific that I am not catching. But I am digging into this, trying to make the error happen so I can address it.
Thank you very much for the quick response. I've been using the 'Load an example query' option on the main Search page ((outbreak OR epidemic OR pandemic) AND genom*), and reinstalled Adjutant 0.1.0 a few hours ago; my dplyr version is 0.8.0.1. Here's the console up to the crash:
runAdjutant()
Listening on http://127.0.0.1:3358
Warning: Column PMID
joining character vector and factor, coercing into character vector
Warning: Column PMID
joining character vector and factor, coercing into character vector
Warning: Column PMID
joining character vector and factor, coercing into character vector
Warning: Column PMID
joining character vector and factor, coercing into character vector
Warning: Column PMID
joining character vector and factor, coercing into character vector
[1] "These are missing"
[1] "30335311"
Warning: Column PMID
joining character vector and factor, coercing into character vector
Warning: Column PMID
joining character vector and factor, coercing into character vector
Warning: Column PMID
joining character vector and factor, coercing into character vector
Warning: Column PMID
joining character vector and factor, coercing into character vector
Warning: Column PMID
joining character vector and factor, coercing into character vector
[1] "These are missing"
[1] NA
Warning: Column PMID
joining character vector and factor, coercing into character vector
Selecting by n
Joining, by = "word"
Joining, by = "word"
Warning: Column word
joining character vector and factor, coercing into character vector
Warning: Error in : object 'nn' not found
88: filter_impl
87: filter.tbl_df
85: function_list[[k]]
83: freduce
82: _fseq
81: eval
80: eval
78: %>%
77: tidyCorpus
73: observeEventHandler [C:\Users\aadourian\Documents\R\win-library\3.5\adjutant\shinyapp/server.R#288]
2: shiny::runApp
1: runAdjutant
That was helpful - the issue is the dplyr version. I had 0.7.8, whereas 0.8.0 has slightly different behaviour for the group_by variable, which is what's causing this error. Working on fixing this now throughout the app and making it backwards compatible with v 0.7. Closing this particular issue thread, and will pick up the fix in #11.
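For anyone hitting the same wall before the fix lands: judging from the traceback, the fragile part appears to be filtering on an auto-generated count column (`nn`) that dplyr 0.8 no longer produces in that situation. Naming the count column explicitly sidesteps the automatic n/nn naming on both versions. A sketch with made-up data, not Adjutant's actual tidyCorpus code:

```r
library(dplyr)

# Explicitly naming the count column avoids depending on the n/nn names
# that count() generates automatically (and that differ across dplyr
# versions). The data here is invented for illustration.
words <- tibble(word = c("genome", "outbreak", "genome"))
frequent <- words %>%
  group_by(word) %>%
  summarise(word_total = n()) %>%   # explicit name: stable across versions
  filter(word_total > 1)
```

Since `word_total` is named by the caller, the downstream filter() can never lose track of it the way `nn` was lost here.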
Appears to be a very valuable package, but I have not been able to get it to work. The install is fine, and the Shiny app starts upon runAdjutant(), but upon searching PubMed from the interface, it outputs the following:
Listening on http://127.0.0.1:5066
Warning: Column PMID
joining character vector and factor, coercing into character vector
[1] "Could not retrieve missing abstract"
[1] "Could not retrieve missing abstract"
[...]
[1] "Could not retrieve missing abstract"
Warning in file(con, "r") :
cannot open URL 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=30798036,30793935,30793933,30793916,30793637,30793006 [...]'
[... truncated]
Warning: Error in file: cannot open the connection
85: file
84: readLines
83: FUN
82: lapply
81: EUtilsGet
80: formatData
78:
73: observeEventHandler [C:\Users[...]\Documents\R\win-library\3.5\adjutant\shinyapp/server.R#124]
2: shiny::runApp
1: runAdjutant