cmks / DAS_Tool

Execution Halted #35

Closed Arkadiy-Garber closed 2 years ago

Arkadiy-Garber commented 5 years ago

DAS_Tool was downloaded today using:

conda install -c bioconda das_tool

I did not have usearch installed, so opted for the --search_engine blast option.

I got an "Execution halted" error immediately after the BLAST step: identifying single copy genes using diamond version 0.9.22

With the --debug flag, I saw that the error was raised by "usearch: command not found". After installing USEARCH, I tried again, and it seems to work fine now.

Just an FYI: the README states that only one of BLAST+, DIAMOND, or USEARCH is required, but even with --search_engine blast, the absence of USEARCH causes the program to crash.
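
A pre-flight check along these lines can catch a missing aligner before a run starts. This is a minimal sketch, not part of DAS_Tool: `missing_tools` is a hypothetical helper name, and the command names passed to it (blastp, diamond, usearch) are simply the binaries looked up with `which` later in this thread.

```shell
# Hypothetical helper (not a DAS_Tool command): print each given command
# name that cannot be found on PATH. Prints nothing if all are present.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Example call, left commented out so sourcing this file has no side effects:
# missing_tools blastp diamond usearch
```

Checking all three names, rather than only the selected engine, reflects the report above that USEARCH was needed even with --search_engine blast.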

Thanks! Overall, I think this is a great and useful tool. Arkadiy

jolespin commented 5 years ago

I'm also getting this. I've tried blast, usearch, and diamond. Same error, with nothing in the debug output. Do you know what it could be? I want to make this a key component of a pipeline I'm working on. If it doesn't work I might try metawrap, but DAS_Tool is my first option.

(mage_env) -bash-4.1$ which blastp
/usr/local/devel/ANNOTATION/jespinoz/anaconda/envs/mage_env/bin/blastp
(mage_env) -bash-4.1$ DAS_Tool -i ./initial_binning/concoct_contigs.tsv,./initial_binning/metabat2_contigs.tsv,./initial_binning/maxbin2_contigs.tsv --db_directory ~/Tools/DAS_Tool/db -c ../blastp_output/query_contigs_with_hits.fna -t 4 --proteins ../prodigal_output/query_orfs.faa --debug --search_engine blast
Running DAS Tool using 4 threads.
identifying single copy genes using blastp: 2.2.31+ Package: blast 2.2.31, build Jun 2 2015 10:20:04
Execution halted
(mage_env) -bash-4.1$ which usearch
/usr/local/bin/usearch
(mage_env) -bash-4.1$ DAS_Tool -i ./initial_binning/concoct_contigs.tsv,./initial_binning/metabat2_contigs.tsv,./initial_binning/maxbin2_contigs.tsv --db_directory ~/Tools/DAS_Tool/db -c ../blastp_output/query_contigs_with_hits.fna -t 4 --proteins ../prodigal_output/query_orfs.faa --debug --search_engine usearch
Running DAS Tool using 4 threads.
identifying single copy genes using usearch v8.0.1616_i86linux64
Execution halted
(mage_env) -bash-4.1$ which diamond
/usr/local/devel/ANNOTATION/jespinoz/anaconda/envs/mage_env/bin/diamond
(mage_env) -bash-4.1$ DAS_Tool -i ./initial_binning/concoct_contigs.tsv,./initial_binning/metabat2_contigs.tsv,./initial_binning/maxbin2_contigs.tsv --db_directory ~/Tools/DAS_Tool/db -c ../blastp_output/query_contigs_with_hits.fna -t 4 --proteins ../prodigal_output/query_orfs.faa --debug --search_engine diamond
Running DAS Tool using 4 threads.
identifying single copy genes using diamond version 0.9.21
Execution halted

cmks commented 5 years ago

Hi Josh, what operating system are you using? Can you please check the version of ruby and prodigal (that generated the query_orfs.faa) on your system? Also, can you check which files are generated in the output directory (.scg, .seqlength,...)?

The problem might be related to the bioconda installation. Have you tried installing DAS Tool directly from github?

jolespin commented 5 years ago

I ran it on our Linux server. Does it generate an error log with more details?

jolespin commented 5 years ago

I didn't install via conda. I believe I downloaded it separately from github, installed the deps, and then linked the executable into my conda env bin.

cmks commented 5 years ago

Is DAS Tool running on the sample data? Make sure you use a recent version of prodigal (>= 2.6.3).

Btw, I just tested the bioconda installation. It worked for me without issues on Ubuntu after installing the data.table R package separately.

jolespin commented 5 years ago

I have both data.table and the proper prodigal version

(mage_env) -bash-4.1$ prodigal -v

Prodigal V2.6.3: February, 2016
(mage_env) -bash-4.1$ R

R version 3.5.1 (2018-07-02) -- "Feather Spray"
Copyright (C) 2018 The R Foundation for Statistical Computing
Platform: x86_64-conda_cos6-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(data.table)
data.table 1.11.4  Latest news: http://r-datatable.com

I haven't tried it on the test data yet, but I will. Do you have a link to the preferred test data examples, or does it come with the download?

cmks commented 5 years ago

Yes, it's part of this repository: https://github.com/cmks/DAS_Tool/tree/master/sample_data

hqtdave commented 5 years ago

Hi guys, I'm getting the same issue. Have you found a solution for this? I tried the test data and it worked well, but my own data always hits the "Execution halted" issue.

cmks commented 5 years ago

Is DAS Tool running on the sample data for you (https://github.com/cmks/DAS_Tool/tree/master/sample_data)? Have you tried setting the --debug flag and checking the log file for any hints?

jianshu93 commented 4 years ago

I have run multiple samples on different nodes of a supercomputer (one sample per node). Sometimes a sample just fails, showing "identifying single copy genes using diamond/usearch version 0.9.21/11.0.447 Execution halted", while other samples succeed. I have to rerun the failed samples using exactly the same code, and again some of them fail and some succeed, until eventually all samples finish. I am using 64-bit usearch. It is really strange. Since they all finish in the end, I don't think anything is wrong. Any idea why DAS_Tool is unstable? For the failed samples, I always have proteins.faa, proteins.faa.archaea.scg, proteins.faa.bacteria.scg, and .seqlength.
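
The manual rerun loop described above could be automated with a small retry wrapper. This is only a workaround sketch under the assumption that the failures are transient, as they appear to be here; it does not explain or fix the instability. `retry` is a hypothetical helper, not something DAS_Tool provides.

```shell
# Hypothetical helper: run a command up to MAX_ATTEMPTS times and stop at
# the first success. Usage: retry MAX_ATTEMPTS COMMAND [ARGS...]
retry() {
    max_attempts=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # "$@" is the command and its arguments, passed through unchanged.
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max_attempts failed" >&2
        attempt=$((attempt + 1))
    done
    # Every attempt failed.
    return 1
}
```

Usage would be `retry 3` followed by the unchanged DAS_Tool command line for one sample. Capping the number of attempts keeps a sample that fails for a real reason from looping forever.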

jolespin commented 2 years ago

I'm also getting this error:

Version

DAS Tool version 1.1.2

Command

DAS_Tool --bins ${S2B_ARRAY[0]} --contigs scaffolds.fasta --outputbasename _ --labels ${S2B_ARRAY[1]} --search_engine diamond --write_bins 1 --threads 16 --proteins gene_models.faa --debug

Error

Error in aggregate.data.frame(arc_scg["count"], by = arc_scg[c("Archaeal.SCG",  :
  no rows to aggregate
Calls: cherry_pick -> aggregate -> aggregate.data.frame
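
R's aggregate() stops with "no rows to aggregate" when the data frame it is given has zero rows, so arc_scg was empty at this point. One quick thing to check, on the assumption (not confirmed in this thread) that the empty table traces back to an empty or missing intermediate file, is whether the .scg outputs mentioned earlier have any content. The sketch below uses a hypothetical helper, `empty_or_missing`, which is not a DAS_Tool command.

```shell
# Hypothetical helper: print every given path that does not exist or has
# zero size. Prints nothing if all files exist and are non-empty.
empty_or_missing() {
    for f in "$@"; do
        # [ -s FILE ] is true only if FILE exists and is larger than 0 bytes.
        if [ ! -s "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}
```

For example, `empty_or_missing proteins.faa.archaea.scg proteins.faa.bacteria.scg` prints the name of any file in that list that is absent or has no content.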
jolespin commented 2 years ago

I'm getting this error again for 1.1.2 and 1.1.3. I wasn't able to get 1.1.4 to run (see https://github.com/cmks/DAS_Tool/issues/81).

I've attached the files that caused this: dastool_error.zip

cmks commented 2 years ago

see #81