amcrisan / Adjutant

Runs a pubmed query, returns results and allows user to explore high-level structure of returned documents
MIT License

Warning in file(con, "r") #12

Status: Closed (adourian1 closed this issue 5 years ago)

adourian1 commented 5 years ago

This appears to be a very valuable package, but I have not been able to get it to work. Install is fine, and the Shiny app starts upon runAdjutant(), but upon searching PubMed from the interface, the output is the following:

runAdjutant()

Listening on http://127.0.0.1:5066
Warning: Column PMID joining character vector and factor, coercing into character vector
[1] "Could not retrieve missing abstract"
[1] "Could not retrieve missing abstract"
[...]
[1] "Could not retrieve missing abstract"
Warning in file(con, "r") :
  cannot open URL 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=30798036,30793935,30793933,30793916,30793637,30793006 [...]' [... truncated]
Warning: Error in file: cannot open the connection
  85: file
  84: readLines
  83: FUN
  82: lapply
  81: EUtilsGet
  80: formatData
  78:
  73: observeEventHandler [C:\Users[...]\Documents\R\win-library\3.5\adjutant\shinyapp/server.R#124]
  2: shiny::runApp
  1: runAdjutant

amcrisan commented 5 years ago

I think this is a similar issue to #11 - I am digging into the problem and am trying to reproduce it in my environment.

pixy61 commented 5 years ago

Maybe I'm wrong, but I ran into similar problems in the past with another package. In that case the problems originated from a PubMed server timeout. Could this be the underlying problem? I cannot verify it now, but previously I identified the origin of the error by running a query that produced very few results (fewer than 10) and checking that the error disappeared under those conditions. I hope this is helpful and does not add to the confusion!

amcrisan commented 5 years ago

That is precisely the error, but I tried to bake in some stuff that could mitigate that to some extent. It also seems that people are getting kicked off more frequently... so something is going on.

pixy61 commented 5 years ago

Take a look at the documentation of the "easyPubMed" package. If I remember correctly, it includes a Sys.sleep() call to avoid the timeout problem. Maybe that can help solve the problem with your library.
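As a rough illustration of the Sys.sleep() idea, here is a minimal base-R sketch that splits a vector of IDs into small batches and pauses between requests. The helper name fetch_in_batches and its parameters (fetch_fun, batch_size, pause_sec) are hypothetical; this is not how easyPubMed or Adjutant actually implements throttling.

```r
# Hypothetical helper (not part of Adjutant or easyPubMed):
# call fetch_fun on small batches of IDs, pausing between calls.
fetch_in_batches <- function(ids, fetch_fun, batch_size = 10, pause_sec = 0.5) {
  # Assign each ID to a batch number: 1,1,...,2,2,...
  batches <- split(ids, ceiling(seq_along(ids) / batch_size))
  results <- vector("list", length(batches))
  for (i in seq_along(batches)) {
    results[[i]] <- fetch_fun(batches[[i]])
    # Pause between requests, but not after the last one
    if (i < length(batches)) Sys.sleep(pause_sec)
  }
  results
}

# Usage with a stand-in fetch function that only reports the batch size
sizes <- fetch_in_batches(1:25, function(batch) length(batch), pause_sec = 0)
```

In a real setting, fetch_fun would be whatever function performs the actual request for one batch of PMIDs.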

JonMinton commented 5 years ago

I also think this is a very valuable package, but it isn't working for me unless I select 10 or fewer results. I get this message amongst others listed:

Warning in file(con, "r") :
  cannot open URL 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=NA,30810520&retmode=xml': HTTP status was '429 Too Many Requests'

According to the following post, the limits on the number of queries made without an API key are stricter after May 2018 than they were before: https://ncbiinsights.ncbi.nlm.nih.gov/2017/11/02/new-api-keys-for-the-e-utilities/

Perhaps it was tested before May 2018, and the behaviour of the web server and API has changed since then?
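For reference, E-utilities accepts the key as an api_key query parameter. The sketch below builds an efetch URL like the one in the warning above, drops NA IDs (the failing URL contained id=NA), and appends the key when one is set. The function name build_efetch_url and the NCBI_API_KEY environment variable are assumptions for illustration, not Adjutant's actual code.

```r
# Hypothetical helper: build an efetch URL, optionally with an NCBI API key.
# The environment variable name NCBI_API_KEY is an assumption.
build_efetch_url <- function(pmids, api_key = Sys.getenv("NCBI_API_KEY")) {
  pmids <- pmids[!is.na(pmids)]  # avoid sending id=NA to the server
  url <- paste0(
    "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi",
    "?db=pubmed&id=", paste(pmids, collapse = ","),
    "&retmode=xml"
  )
  # Only append the key if one was supplied
  if (nzchar(api_key)) url <- paste0(url, "&api_key=", api_key)
  url
}

# Usage: the NA from the failing request is dropped
build_efetch_url(c(NA, "30810520"), api_key = "")
```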

amcrisan commented 5 years ago

Ah! This helps explain why I can't reproduce the error - I am able to run bigger queries than 10 on my end, but I don't run queries that often and so I didn't seem to be hitting the wall. I'll take a look into incorporating the API key more seamlessly for users, hopefully this will help.

amcrisan commented 5 years ago

Updated Adjutant to allow for an NCBI API key, and also added some more throttles to reduce the number of requests hitting the server.
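The thread does not show what the added throttles look like. One common pattern for rate-limit errors such as the 429 above is to retry with an increasing wait. The following is only a sketch of that pattern in base R; with_retry and its parameters are hypothetical names, and it is not taken from Adjutant's source.

```r
# Hypothetical helper: retry a function call, doubling the wait after each failure.
with_retry <- function(fun, max_tries = 3, base_wait = 1) {
  for (attempt in seq_len(max_tries)) {
    # Capture an error as a value instead of aborting
    result <- tryCatch(fun(), error = function(e) e)
    if (!inherits(result, "error")) return(result)
    # Wait base_wait, 2 * base_wait, 4 * base_wait, ... between attempts
    if (attempt < max_tries) Sys.sleep(base_wait * 2^(attempt - 1))
  }
  stop(result)  # re-signal the last error once all attempts have failed
}

# Usage with a stand-in function that fails twice before succeeding
calls <- 0
flaky <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("HTTP status was '429 Too Many Requests'")
  "ok"
}
with_retry(flaky, max_tries = 3, base_wait = 0)
```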

adourian1 commented 5 years ago

Many thanks! The API key and other throttles appear to have fixed the original connection issue. However, I am now running into Issue #11 upon clustering (https://github.com/amcrisan/Adjutant/issues/11)...

amcrisan commented 5 years ago

That error is a bit harder to address because it is difficult for me to reproduce - do you have a query that I can try out? Maybe there is something specific that I am not catching. But I am digging into this, trying to make the error happen so I can address it.

adourian1 commented 5 years ago

Thank you very much for the quick response. I've been using the 'Load an example query' option on the main Search page, which gives (outbreak OR epidemic OR pandemic) AND genom*, and I reinstalled Adjutant 0.1.0 a few hours ago; my dplyr version is 0.8.0.1. Here's the console up to the crash:

runAdjutant()

Listening on http://127.0.0.1:3358
Warning: Column PMID joining character vector and factor, coercing into character vector
Warning: Column PMID joining character vector and factor, coercing into character vector
Warning: Column PMID joining character vector and factor, coercing into character vector
Warning: Column PMID joining character vector and factor, coercing into character vector
Warning: Column PMID joining character vector and factor, coercing into character vector
[1] "These are missing"
[1] "30335311"
Warning: Column PMID joining character vector and factor, coercing into character vector
Warning: Column PMID joining character vector and factor, coercing into character vector
Warning: Column PMID joining character vector and factor, coercing into character vector
Warning: Column PMID joining character vector and factor, coercing into character vector
Warning: Column PMID joining character vector and factor, coercing into character vector
[1] "These are missing"
[1] NA
Warning: Column PMID joining character vector and factor, coercing into character vector
Selecting by n
Joining, by = "word"
Joining, by = "word"
Warning: Column word joining character vector and factor, coercing into character vector
Warning: Error in : object 'nn' not found
  88: filter_impl
  87: filter.tbl_df
  85: function_list[[k]]
  83: freduce
  82: _fseq
  81: eval
  80: eval
  78: %>%
  77: tidyCorpus
  73: observeEventHandler [C:\Users\aadourian\Documents\R\win-library\3.5\adjutant\shinyapp/server.R#288]
  2: shiny::runApp
  1: runAdjutant

amcrisan commented 5 years ago

That was helpful - the issue is the dplyr version. I had 0.7.8, whereas 0.8.0 has slightly different behaviour for the group_by variable, which is what's causing this error. I am working on fixing this throughout the app and making it backwards compatible with dplyr 0.7. Closing this particular issue thread; I will pick up the fix in #11.
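The actual fix is tracked in #11 and is not shown here. As one possible way to make code tolerant of a count column that is named nn under one dplyr version and n under another, the column can be looked up by name instead of being hard-coded. The helper find_count_col below is hypothetical, uses only base R, and assumes the column is called either nn or n.

```r
# Hypothetical helper: return the name of the count column, whichever exists.
# Assumes the column is called "nn" or "n"; prefers "nn" when both are present.
find_count_col <- function(df, candidates = c("nn", "n")) {
  hit <- intersect(candidates, names(df))
  if (length(hit) == 0) {
    stop("no count column found; expected one of: ",
         paste(candidates, collapse = ", "))
  }
  hit[1]
}

# Usage: filter on the count column without hard-coding its name
word_counts <- data.frame(word = c("genome", "outbreak"), n = c(5, 1))
count_col <- find_count_col(word_counts)
frequent <- word_counts[word_counts[[count_col]] > 1, ]
```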