cmks / DAS_Tool

Execution halted "Error: DAS Tool" in version v1.1.4 #81

Closed jolespin closed 2 years ago

jolespin commented 2 years ago

Version:

(dastool_env) -bash-4.2$ DAS_Tool --version
DAS Tool 1.1.4
(dastool_env) -bash-4.2$ DAS_Tool --bins ${S2B_ARRAY[0]} --contigs scaffolds.fasta --outputbasename dastool/_ --labels ${S2B_ARRAY[1]} --search_engine diamond --write_bins 1 --threads 4 --proteins gene_models.faa --debug
Error: DAS Tool

Usage:
  DAS_Tool [options] -i <contig2bin> -c <contigs_fasta> -o <outputbasename>
  DAS_Tool -i <contig2bin> -c <contigs_fasta> -o <outputbasename> [--labels=<labels>] [--proteins=<proteins_fasta>] [--threads=<threads>] [--search_engine=<search_engine>] [--score_threshold=<score_threshold>] [--dbDirectory=<dbDirectory> ] [--megabin_penalty=<megabin_penalty>] [--duplicate_penalty=<duplicate_penalty>] [--write_bin_evals] [--create_plots] [--write_bins] [--write_unbinned] [--resume] [--debug]
  DAS_Tool [--version]
  DAS_Tool [--help]

Options:
   -i --bins=<contig2bin>                   Comma separated list of tab separated contigs to bin tables.
   -c --contigs=<contigs>                   Contigs in fasta format.
   -o --outputbasename=<outputbasename>     Basename of output files.
   -l --labels=<labels>                     Comma separated list of binning prediction names.
   --search_engine=<search_engine>          Engine used for single copy gene identification (dia
Execution halted

cmks commented 2 years ago

The command-line syntax changed slightly in version 1.1.4: --write_bins is now a flag that takes no value, so use --write_bins instead of --write_bins 1. The following command should work:

DAS_Tool --bins ${S2B_ARRAY[0]} --contigs scaffolds.fasta --outputbasename dastool/_ --labels ${S2B_ARRAY[1]} --search_engine diamond --write_bins --threads 4 --proteins gene_models.faa --debug

The help message being cut off in the middle when the syntax is violated is an unfortunate bug in the current version of the docopt R package, which is used to parse the command-line parameters.

jolespin commented 2 years ago

Ok perfect, I got it to work. The resulting error was:

Error:  No single copy genes predicted
Execution halted

Is it safe to assume that this error indicates there are No single copy genes predicted? i.e.,

successfully finished
calculating contig lengths.
WARNING: Duplicated scaffolds in: SPAdes-MaxBin2-test_minigut_sample2.tsv 
Error in aggregate.data.frame(arc_scg["count"], by = arc_scg[c("Archaeal.SCG",  : 
  no rows to aggregate
Calls: cherry_pick -> aggregate -> aggregate.data.frame
Execution halted

The reason why I am digging so deep into the details is because I have this conda environment set up with all of the tools I need for a pipeline but for some reason I can't get DAS Tool v1.1.4 and CONCOCT ≥ 1.0 to work together for the life of me.

jolespin commented 2 years ago

I got the following dependency error when trying to get DAS Tool in my binning environment I'm working on for a publication.

Encountered problems while solving:

* package das_tool-1.1.4-r41hdfd78af_0 requires r-base >=4.1,<4.2.0a0, but none of the providers can be installed

Is the r-base limit necessary or is this coming from something else?

cmks commented 2 years ago

> Ok perfect, I got it to work. The resulting error was:
>
> Error:  No single copy genes predicted
> Execution halted
>
> Is it safe to assume that this error indicates there are No single copy genes predicted? i.e.,
>
> successfully finished
> calculating contig lengths.
> WARNING: Duplicated scaffolds in: SPAdes-MaxBin2-test_minigut_sample2.tsv
> Error in aggregate.data.frame(arc_scg["count"], by = arc_scg[c("Archaeal.SCG",  :
>   no rows to aggregate
> Calls: cherry_pick -> aggregate -> aggregate.data.frame
> Execution halted
>
> The reason why I am digging so deep into the details is because I have this conda environment set up with all of the tools I need for a pipeline but for some reason I can't get DAS Tool v1.1.4 and CONCOCT ≥ 1.0 to work together for the life of me.

You can check the *archaea.scg and *bacteria.scg tables in your output directory to see how many single copy genes were predicted for each marker gene set.
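
As a rough illustration of that check, the sketch below counts the rows in each table. It assumes the tables sit in the dastool/ directory (taken from the --outputbasename dastool/_ used earlier in this thread) and that each non-empty row corresponds to one predicted single copy gene. count_scg_rows is a hypothetical helper name, not part of DAS Tool.

```shell
# Hypothetical helper (not part of DAS Tool): print the number of non-empty
# lines in every *archaea.scg and *bacteria.scg table in a directory.
# Assumption: one non-empty line corresponds to one predicted single copy gene.
count_scg_rows() {
  dir="${1:-dastool}"
  for f in "$dir"/*archaea.scg "$dir"/*bacteria.scg; do
    # Skip glob patterns that matched no file.
    [ -e "$f" ] || continue
    # grep -c prints 0 and exits non-zero for an empty file, hence "|| true".
    n=$(grep -c . "$f" || true)
    printf '%s\t%s\n' "$n" "$f"
  done
}

# Assumed output directory; adjust to match your --outputbasename.
count_scg_rows dastool
```

A count of 0 for both tables would be consistent with the "No single copy genes predicted" error.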

> I got the following dependency error when trying to get DAS Tool in my binning environment I'm working on for a publication.
>
> Encountered problems while solving:
>
> * package das_tool-1.1.4-r41hdfd78af_0 requires r-base >=4.1,<4.2.0a0, but none of the providers can be installed
>
> Is the r-base limit necessary or is this coming from something else?

Are you using the Bioconda version? The minimum version requirement was introduced by the maintainers of the Bioconda package. If you install directly from GitHub, you can probably get away with an older R version, as long as it is >= 3.2.3.
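
To see which case applies to a given environment, a minimal sketch along these lines compares the installed R version against both bounds mentioned in this thread. version_ge is a hypothetical helper that relies on sort -V (available in GNU coreutils), and treating the Bioconda pin as >= 4.1 and < 4.2 is a simplification of r-base >=4.1,<4.2.0a0.

```shell
# Hypothetical helper: succeed if version $1 is >= version $2.
# Relies on "sort -V" (version sort), which GNU coreutils provides.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v Rscript >/dev/null 2>&1; then
  # getRversion() is base R; print it as a plain string such as 4.1.3.
  r_version=$(Rscript -e 'cat(as.character(getRversion()))')
  if version_ge "$r_version" 3.2.3; then
    echo "R $r_version meets the >= 3.2.3 bound mentioned for a GitHub install"
  else
    echo "R $r_version is older than 3.2.3"
  fi
  # Simplified reading of the Bioconda pin r-base >=4.1,<4.2.0a0.
  if version_ge "$r_version" 4.1 && ! version_ge "$r_version" 4.2; then
    echo "R $r_version is inside the Bioconda das_tool 1.1.4 pin"
  else
    echo "R $r_version is outside the Bioconda das_tool 1.1.4 pin"
  fi
else
  echo "Rscript not found on PATH"
fi
```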