ropensci / phylotaR

An automated pipeline for retrieving orthologous DNA sequences from GenBank in R
https://docs.ropensci.org/phylotaR

PhylotaR setup issue #39

Closed pittaro90 closed 2 years ago

pittaro90 commented 5 years ago

Hi, I'm trying to run phylotaR in R 3.4.4 with the following code:

devtools::install_github('ropensci/phylotaR')
library(phylotaR)
wd <- getwd()
ncbi_dr <- "C:/Program Files/ncbi-blast-2.7.1+/bin"
txid <- 9504
setup(wd = wd, txid = txid, ncbi_dr = ncbi_dr, v = TRUE)

then I get:


phylotaR: Implementation of PhyLoTa in R [v1.0.0]

Checking for valid NCBI BLAST+ Tools ...
Error in if (tst) { : missing value where TRUE/FALSE needed
In addition: Warning message:
In blast_setup(d = ncbi_dr, v = v, wd = wd) : NAs introduced by coercion

Do you have any suggestions? I could not find a solution to this problem.

Best, Matteo

DomBennett commented 5 years ago

Hi Matteo,

It seems there is an error when trying to determine the version of your installed NCBI BLAST+ tools.

It is hard for me to examine exactly where things are going wrong as I don't have access to your computer. Could you run the following script? It basically recreates what phylotaR does internally. We should expect vrsn to become a numeric vector but on your machine it appears to be an empty vector. If we can work out how this occurs then we should be able to fix the bug.

cmd_path <- file.path('C:/Program Files/ncbi-blast-2.7.1+/bin', 'blastn')
res <- phylotaR:::cmdln(cmd = cmd_path, args = '-version')
# Is the status TRUE?
(res[['status']] == 0)
# What are the BLASTN details?
stdout <- rawToChar(res[['stdout']])
(stdout <- strsplit(x = stdout, split = '\n')[[1]])
# Can we extract the version number?
vrsn <- gsub('[a-zA-Z:+]', '', stdout[[1]])
vrsn <- gsub('\\s', '', vrsn)
(vrsn <- as.numeric(strsplit(vrsn, '\\.')[[1]]))
# Version should be 2 +, expect TRUE
(vrsn[1] >= 2 & vrsn[2] >= 0)

For reference, here is the above code working on my machine.

> cmd_path <- file.path('/usr/local/ncbi/blast/bin', 'blastn')
> res <- phylotaR:::cmdln(cmd = cmd_path, args = '-version')
> # Is the status TRUE?
> (res[['status']] == 0)
[1] TRUE
> # What are the BLASTN details?
> stdout <- rawToChar(res[['stdout']])
> (stdout <- strsplit(x = stdout, split = '\n')[[1]])
[1] "blastn: 2.2.29+"                                  
[2] "Package: blast 2.2.29, build Dec 10 2013 15:51:59"
> # Can we extract the version number?
> vrsn <- gsub('[a-zA-Z:+]', '', stdout[[1]])
> vrsn <- gsub('\\s', '', vrsn)
> (vrsn <- as.numeric(strsplit(vrsn, '\\.')[[1]]))
[1]  2  2 29
> # Version should be 2 +, expect TRUE
> (vrsn[1] >= 2 & vrsn[2] >= 0)
[1] TRUE

pittaro90 commented 5 years ago

Thank you for the answer!

I ran the code and I get this:

> cmd_path <- file.path('C:/Program Files/ncbi-blast-2.7.1+/bin', 'blastn')
> res <- phylotaR:::cmdln(cmd = cmd_path, args = '-version')
> # Is the status TRUE?
> (res[['status']] == 0)
[1] TRUE
> # What are the BLASTN details?
> stdout <- rawToChar(res[['stdout']])
> (stdout <- strsplit(x = stdout, split = '\n')[[1]])
[1] "blastn: 2.7.1+\r"
[2] " Package: blast 2.7.1, build Oct 18 2017 19:55:35\r"
> # Can we extract the version number?
> vrsn <- gsub('[a-zA-Z:+]', '', stdout[[1]])
> vrsn <- gsub('\\s', '', vrsn)
> (vrsn <- as.numeric(strsplit(vrsn, '\\.')[[1]]))
[1] 2 7 1
> # Version should be 2 +, expect TRUE
> (vrsn[1] >= 2 & vrsn[2] >= 0)
[1] TRUE

DomBennett commented 5 years ago

OK. That all looks fine. I expected that to fail.

Can you repeat the code I sent you but with makeblastdb instead of blastn? E.g.

cmd_path <- file.path('C:/Program Files/ncbi-blast-2.7.1+/bin', 'makeblastdb')
res <- phylotaR:::cmdln(cmd = cmd_path, args = '-version')
# Is the status TRUE?
(res[['status']] == 0)
# What are the details?
stdout <- rawToChar(res[['stdout']])
(stdout <- strsplit(x = stdout, split = '\n')[[1]])
# Can we extract the version number?
vrsn <- gsub('[a-zA-Z:+]', '', stdout[[1]])
vrsn <- gsub('\\s', '', vrsn)
(vrsn <- as.numeric(strsplit(vrsn, '\\.')[[1]]))
# Version should be 2 +, expect TRUE
(vrsn[1] >= 2 & vrsn[2] >= 0)

Also, can you try running this? It should return a list of the NCBI paths. Given the error you reported, it should fail.

cmd_path <- 'C:/Program Files/ncbi-blast-2.7.1+/bin'
(phylotaR:::blast_setup(d = cmd_path, v = TRUE, wd = getwd()))

pittaro90 commented 5 years ago

This is what I get:

> cmd_path <- file.path('C:/Program Files/ncbi-blast-2.7.1+/bin/makeblastdb')
> res <- phylotaR:::cmdln(cmd = cmd_path, args = '-version')
> # Is the status TRUE?
> (res[['status']] == 0)
[1] TRUE
> # What are the details?
> stdout <- rawToChar(res[['stdout']])
> (stdout <- strsplit(x = stdout, split = '\n')[[1]])
[1] "MAKEBL~1: 2.7.1+\r" " Package: blast 2.7.1, build Oct 18 2017 19:55:35\r"
> # Can we extract the version number?
> vrsn <- gsub('[a-zA-Z:+]', '', stdout[[1]])
> vrsn <- gsub('\\s', '', vrsn)
> (vrsn <- as.numeric(strsplit(vrsn, '\\.')[[1]]))
[1] NA 7 1
Warning message:
NAs introduced by coercion
> # Version should be 2 +, expect TRUE
> (vrsn[1] >= 2 & vrsn[2] >= 0)
[1] NA

> cmd_path <- 'C:/Program Files/ncbi-blast-2.7.1+/bin'
> (phylotaR:::blast_setup(d = cmd_path, v = TRUE, wd = getwd()))
Checking for valid NCBI BLAST+ Tools ...
Error in if (tst) { : missing value where TRUE/FALSE needed
In addition: Warning message:
In phylotaR:::blast_setup(d = cmd_path, v = TRUE, wd = getwd()) : NAs introduced by coercion

DomBennett commented 5 years ago

Hi Matteo,

OK. So it looks like we're getting an unexpected "~1:" in the makeblastdb version description. I've made a fix to hopefully correct the problem for you.

Can you try installing the development version and running your pipeline code again? To install the latest development version on GitHub:

library(devtools)
install_github('ropensci/phylotaR')
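
The `MAKEBL~1` prefix in the output above looks like a Windows 8.3 short file name. Stripping the letters leaves the stray `~1` fused onto the major version (`~12`), which `as.numeric` turns into `NA`. Below is a minimal sketch of a more tolerant parser, assuming the version is the first dotted run of digits on the line. `parse_blast_version` is a hypothetical helper, not a phylotaR function, and not necessarily how the actual fix was implemented.

```r
# Hypothetical helper (not part of phylotaR): extract the version numbers
# from the first line printed by a BLAST+ tool called with `-version`.
parse_blast_version <- function(first_line) {
  # Match the first run of digits separated by dots, e.g. "2.7.1".
  # A lone digit such as the "1" in "MAKEBL~1:" has no following dot-group,
  # so it is skipped.
  hit <- regmatches(first_line, regexpr('[0-9]+(\\.[0-9]+)+', first_line))
  if (length(hit) == 0) {
    return(NA_real_)
  }
  as.numeric(strsplit(hit, '.', fixed = TRUE)[[1]])
}

vrsn <- parse_blast_version('MAKEBL~1: 2.7.1+\r')
# Same check as in the diagnostic script; isTRUE() guards against NA.
isTRUE(vrsn[1] >= 2 && vrsn[2] >= 0)
```

Matching the version pattern directly, rather than deleting unwanted characters, means unexpected digits in the program name cannot leak into the result.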

Dom

pittaro90 commented 5 years ago

Hi Dom, thank you very much for your help. Now it works.

However, another issue came up:

During the BLAST phase an error occurred:

-Unexpected Error : blastn failed to run. Check BLAST log files.

According to the blast log file:

-BLAST query/options error: '"6' is not a valid output format
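
The `'"6'` in the log suggests that BLAST received the output format with a literal quote character and that the value was split at the first space, so it saw `"6` instead of the full format string. That reading is an assumption; the log does not confirm it. Below is a minimal sketch of building the argument vector so that the format stays a single, unquoted element. `blastn_args` is a hypothetical helper and the file names are placeholders.

```r
# Hypothetical helper (not part of phylotaR): build blastn arguments as a
# character vector, one element per flag or value.
blastn_args <- function(query, db, out,
                        outfmt = '6 qseqid sseqid pident length evalue qcovs qcovhsp') {
  # The format string is one element and contains no quote characters,
  # so nothing needs to be escaped for a shell.
  c('-query', query, '-db', db, '-out', out, '-outfmt', outfmt)
}

# Placeholder file names, for illustration only
args <- blastn_args('sqs.fasta', 'blastdb', 'blast_res.tsv')
args[8]
```

A vector like this can be handed to a process runner that accepts arguments as a character vector, such as `sys::exec_internal()`, which sidesteps platform-specific shell quoting.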

Best,

Matteo

DomBennett commented 5 years ago

Hi Matteo,

Sorry for the persistent problems. I've created some new checker functions during the setup step. If you update phylotaR to the latest on GitHub and try running your pipeline again, does it still work? And does the error crop up again? It should get caught earlier.

agamisch commented 5 years ago

Hi,

I installed phylotaR yesterday and also received the same error as pittaro90:


phylotaR: Implementation of PhyLoTa in R [v1.0.0]

Checking for valid NCBI BLAST+ Tools ...
Found: [C:/Program Files/NCBI/blast-2.7.1+/bin/makeblastdb]
Found: [C:/Program Files/NCBI/blast-2.7.1+/bin/blastn]
Setting up pipeline with the following parameters:
. blstn [C:/Program Files/NCBI/blast-2.7.1+/bin/blastn]
. btchsz [100]
. date [2019-02-04]
. mdlthrs [3000]
. mkblstdb [C:/Program Files/NCBI/blast-2.7.1+/bin/makeblastdb]
. mncvrg [51]
. mnsql [250]
. mxevl [1e-10]
. mxnds [1e+05]
. mxrtry [100]
. mxsql [2000]
. mxsqs [50000]
. ncps [1]
. txid [9504]
. v [TRUE]
. wd [E:/Rzeugs/phylotaR/test_bulb]


Running pipeline on [windows] at [2019-02-04 16:05:48]

Running stages: taxise, download, cluster, cluster2

Starting stage TAXISE: [2019-02-04 16:05:48]

Searching taxonomic IDs ...
Downloading taxonomic records ...
. [1-21]
Generating taxonomic dictionary ...

Completed stage TAXISE: [2019-02-04 16:05:51]


Starting stage DOWNLOAD: [2019-02-04 16:05:51]

Identifying suitable clades ...
Identified [1] suitable clades.
Downloading hierarchically ...
Working on parent [id 9504]: [1/1] ...
. + whole subtree ...
. . Getting [2800 sqs] ...
. . . [1-100]
. . . [101-200]
. . . [201-300]
. . . [301-400]
. . . [401-500]
. . . [501-600]
. . . [601-700]
. . . [701-800]
. . . [801-900]
. . . [901-1000]
. . . [1001-1100]
. . . [1101-1200]
. . . [1201-1300]
. . . [1301-1400]
. . . [1401-1500]
. . . [1501-1600]
. . . [1601-1700]
. . . [1701-1800]
. . . [1801-1900]
. . . [1901-2000]
. . . [2001-2100]
. . . [2101-2200]
. . . [2201-2300]
. . . [2301-2400]
. . . [2401-2500]
. . . [2501-2600]
. . . [2601-2700]
. . . [2701-2800]
Successfully downloaded [2681 sqs] in total.

Completed stage DOWNLOAD: [2019-02-04 16:08:50]


Starting stage CLUSTER: [2019-02-04 16:08:50]

Working on [id 9504]
. Generating subtree clusters for [id 9504 (genus)]
. Generating direct clusters for [id 9504(genus)]
. . [0 sqs]
. . . Too few sequences, cannot make clusters
. BLASTing [2681 sqs] ....
. . Running makeblastdb
. . Running blastn
blastn failed to run. Check BLAST log files.
Unexpected Error : blastn failed to run. Check BLAST log files.

Occurred [2019-02-04 16:08:52] Contact package maintainer for help.

Blast log file: BLAST query/options error: '"6' is not a valid output format

All the best!

A

pittaro90 commented 5 years ago

Hi, sorry for my late reply. Now it's working.

basil-yakimov commented 5 years ago

I can confirm @agamisch's issue. Trying to reproduce the Aotus example, I got the following error: Unexpected Error : blastn failed to run. Check BLAST log files. And inside the log file: BLAST query/options error: '"6' is not a valid output format

DomBennett commented 5 years ago

Hi @basil-yakimov,

Are you using the latest version of phylotaR, downloaded from GitHub?

Try running:

remotes::install_github('ropensci/phylotaR')

I recently fixed issues relating to that particular error.

Otherwise, if the error is still occurring I will need more information about your system and set-up.

Thanks!

basil-yakimov commented 5 years ago

I am using the version from CRAN. Here is the output from my attempt to install from GitHub:

remotes::install_github('ropensci/phylotaR')
Downloading GitHub repo ropensci/phylotaR@master
These packages have more recent versions available.
Which would you like to update?

1: All
2: CRAN packages only
3: None
4: curl (4.2 -> 4.3 ) [CRAN]
5: igraph (1.2.4.1 -> 1.2.4.2) [CRAN]

Enter one or more numbers separated by spaces, or an empty line to cancel
1: 1
curl (4.2 -> 4.3 ) [CRAN]
igraph (1.2.4.1 -> 1.2.4.2) [CRAN]
Skipping 1 packages not available: restez
Installing 3 packages: curl, igraph, restez
Installing packages into ‘C:/Users/Basil/Documents/R/win-library/3.5’
(as ‘lib’ is unspecified)
Error: (converted from warning) package ‘restez’ is not available (for R version 3.5.3)

Then I updated R from 3.5.3 to 3.6.1 and got the same error:

Error: Failed to install 'phylotaR' from GitHub: (converted from warning) package ‘restez’ is not available (for R version 3.6.1)

joelnitta commented 2 years ago

Previously restez was taken down from CRAN, but it is now available again (https://cran.r-project.org/web/packages/restez/index.html). Installing phylotaR should now be possible.

ShixiangWang commented 2 years ago

Thanks to @joelnitta for pointing out the update of restez. I am closing this now, as the test passes in the current version:

$ radian
R version 4.2.0 (2022-04-22) -- "Vigorous Calisthenics"
Platform: x86_64-pc-linux-gnu (64-bit)

r$> library(phylotaR)
    wd<-getwd()

r$> txid<-9504

r$> setup(wd = wd, txid = txid, ncbi_dr = "/workspaces//blast", v = TRUE)
-----------------------------------------------------
phylotaR: Implementation of PhyLoTa in R [v1.2.0.999]
-----------------------------------------------------
Checking for valid NCBI BLAST+ Tools ...
Failed to run: [/workspaces/blast/makeblastdb]. Reason:
[Error : Failed to execute '/workspaces/blast/makeblastdb' (No such file or directory)
]Failed to run: [/workspaces/blast/blastn]. Reason:
[Error : Failed to execute '/workspaces/blast/blastn' (No such file or directory)
]Error:Unable to find correct versions of NCBI BLAST+ tools
Error in blast_setup(d = ncbi_dr, v = v, wd = wd, otsdr = outsider) : 
  Unable to find correct versions of NCBI BLAST+ tools

r$> setup(wd = wd, txid = txid, ncbi_dr = "/workspaces//blast/bin", v = TRUE)
-----------------------------------------------------
phylotaR: Implementation of PhyLoTa in R [v1.2.0.999]
-----------------------------------------------------
Checking for valid NCBI BLAST+ Tools ...
Found: [/workspaces/blast/bin/makeblastdb]
Found: [/workspaces/blast/bin/blastn]
. . Running makeblastdb
Setting up pipeline with the following parameters:
. blstn          [/workspaces/blast/bin/blastn]
. btchsz         [100]
. date           [2022-09-18]
. db_only        [FALSE]
. mdlthrs        [3000]
. mkblstdb       [/workspaces/blast/bin/makeblastdb]
. mncvrg         [51]
. mnsql          [250]
. multiple_ids   [FALSE]
. mxevl          [1e-10]
. mxnds          [1e+05]
. mxrtry         [100]
. mxsql          [2000]
. mxsqs          [50000]
. ncps           [1]
. outfmt         [6 qseqid sseqid pident length evalue qcovs qcovhsp]
. outsider       [FALSE]
. srch_trm       [NOT predicted[TI] NOT "whole genome shotgun"[TI] NOT unverified[TI] NOT "synthetic construct"[Organism] NOT refseq[filter] NOT TSA[Keyword]]
. txid           [9504]
. v              [TRUE]
. wd             [/workspaces/phylotaR]
. wt_tms         [1, 3, 6 ...]
-----------------------------------------------------
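
The two calls above differ only in whether `ncbi_dr` points at the BLAST+ installation root or at its `bin` directory. Below is a minimal sketch of a pre-flight check, assuming the executables are named `makeblastdb` and `blastn` (with an `.exe` suffix on Windows). `has_blast_tools` is a hypothetical helper, not part of phylotaR, and the demo directory is a throwaway stand-in for a real installation.

```r
# Hypothetical helper (not part of phylotaR): report which BLAST+ tools
# exist directly inside the given directory.
has_blast_tools <- function(ncbi_dr, tools = c('makeblastdb', 'blastn')) {
  # Windows executables carry an .exe suffix.
  sffx <- if (.Platform$OS.type == 'windows') '.exe' else ''
  pths <- file.path(ncbi_dr, paste0(tools, sffx))
  found <- file.exists(pths)
  names(found) <- tools
  found
}

# Throwaway directory that mimics <root>/bin with empty placeholder files
root <- file.path(tempdir(), 'blast_demo')
dir.create(file.path(root, 'bin'), recursive = TRUE, showWarnings = FALSE)
sffx <- if (.Platform$OS.type == 'windows') '.exe' else ''
invisible(file.create(file.path(root, 'bin',
                                paste0(c('makeblastdb', 'blastn'), sffx))))

has_blast_tools(root)                    # tools are not in the root itself
has_blast_tools(file.path(root, 'bin'))  # tools are found under bin
```

This only checks that the files exist; it does not verify that they run or report a supported version, which is what `setup()` does.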

joelnitta commented 2 years ago

Great! Please let me know if you encounter any problems with restez.