metagenome-atlas / atlas

ATLAS - Three commands to start analyzing your metagenome data
https://metagenome-atlas.github.io/
BSD 3-Clause "New" or "Revised" License

Error in rule run_das_tool for test #665

Closed XC-Zhai closed 1 year ago

XC-Zhai commented 1 year ago

Hello there, I installed atlas 2.16.3 with conda and ran the test samples with:

atlas run all -w test_reads -c test_reads/config.yaml -j 16 --conda-prefix databases/conda_envs --conda-base-path /home/projects/cu_10168/people/xiczha/bin

But I got this error:

Error in rule run_das_tool:
    jobid: 80
    input: sample2/binning/DASTool/metabat.scaffolds2bin, sample2/binning/DASTool/maxbin.scaffolds2bin, sample2/sample2_contigs.fasta, sample2/annotation/predicted_genes/sample2.faa
    output: sample2/binning/DASTool/sample2_DASTool_summary.tsv, sample2/binning/DASTool/sample2_allBins.eval, sample2/binning/DASTool/cluster_attribution.tsv
    log: sample2/logs/binning/DASTool.log (check log file(s) for error details)
    conda-env: /home/projects/cu_10168/people/xiczha/test_reads/databases/conda_envs/befc068fe3d186ebd33052dcba8516f0_
    shell:
        DAS_Tool --outputbasename sample2/binning/DASTool/sample2 --bins sample2/binning/DASTool/metabat.scaffolds2bin,sample2/binning/DASTool/maxbin.scaffolds2bin --labels metabat,maxbin --contigs sample2/sample2_contigs.fasta --search_engine diamond --proteins sample2/annotation/predicted_genes/sample2.faa --write_bin_evals --megabin_penalty 0.5 --duplicate_penalty 0.6 --threads 16 --debug --score_threshold 0.5 &> sample2/logs/binning/DASTool.log ; mv sample2/binning/DASTool/sample2_DASTool_contig2bin.tsv sample2/binning/DASTool/cluster_attribution.tsv &>> sample2/logs/binning/DASTool.log
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Note the path to the log file for debugging.
Documentation is available at: https://metagenome-atlas.readthedocs.io
Issues can be raised at: https://github.com/metagenome-atlas/atlas/issues
Complete log: .snakemake/log/2023-06-12T095030.927163.snakemake.log
[Atlas] CRITICAL: Command 'snakemake --snakefile /home/projects/cu_10168/people/xiczha/atlas/atlas/workflow/Snakefile --directory /home/projects/cu_10168/people/xiczha/test_reads --rerun-triggers mtime --jobs 16 --rerun-incomplete --configfile '/home/projects/cu_10168/people/xiczha/test_reads/config.yaml' --nolock --use-conda --conda-prefix /home/projects/cu_10168/people/xiczha/databases/conda_envs --resources mem=1435 mem_mb=1470457 java_mem=1220 --scheduler greedy all --conda-prefix databases/conda_envs --conda-base-path /home/projects/cu_10168/people/xiczha/bin ' returned non-zero exit status 1.

- [x] I checked the log files indicated in the error message (and the cluster logs if submitted to a cluster)

Here is the relevant log output:

DAS Tool 1.1.6
Analyzing assembly
Skipping gene prediction using protein fasta file: sample2/annotation/predicted_genes/sample2.faa
Annotating single copy genes using diamond
Warning: No SCGs detected for SCG set: bacteria
Warning: No SCGs detected for SCG set: archaea
Error: No single copy genes predicted
Execution halted

Atlas version: atlas, version 2.16.3

Additional context: I am new to snakemake, and I hope you can guide me in finding the problem.

Best, Xichuan

SilasK commented 1 year ago

The pipeline works as expected, but I don't know why you don't get any bins for the test data. Can you confirm that you get paired-end files? Which test samples did you use exactly?

Could you run atlas run assemble and look at the assembly report?

XC-Zhai commented 1 year ago

I used the test data from Example Data: https://metagenome-atlas.readthedocs.io/en/latest/usage/getting_started.html#example-data

The assembly finished as expected, and I also have the bins from maxbin and metabat. The error starts at DAS Tool. When I check the DASTool log, it shows the following:

Analyzing assembly
Assembly stats:
contigs  size     median  min   max     N50
293      3467154  5046    1007  303598  25285

Skipping gene prediction using protein fasta file: sample2/annotation/predicted_genes/sample2.faa
Annotating single copy genes using diamond
No SCGs detected for SCG set: bacteria
No SCGs detected for SCG set: archaea
No single copy genes predicted

It should continue to the DAS Tool db, but it stopped at the point shown above. I can run atlas on a single machine, but I am trying to run it on an HPC machine with iqsub. I have no idea where to start debugging.

Best, Xichuan
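One way to narrow down a "No single copy genes predicted" failure is to confirm that the protein FASTA handed to DAS_Tool actually contains sequences and to pull the warning and error lines out of the log. The following is only a minimal sketch, not part of atlas or DAS_Tool: the helper names `count_fasta_records` and `find_log_problems` are hypothetical, and the two paths are the ones shown in the error message above, assumed to be relative to the working directory.

```python
from pathlib import Path


def count_fasta_records(path):
    """Count sequences in a FASTA file by counting header lines."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def find_log_problems(log_text):
    """Return the log lines that mention a warning, an error, or missing SCGs."""
    keywords = ("warning", "error", "no scgs detected", "no single copy genes")
    return [
        line.strip()
        for line in log_text.splitlines()
        if any(keyword in line.lower() for keyword in keywords)
    ]


if __name__ == "__main__":
    # Paths copied from the error message; adjust them to your working directory.
    faa = Path("sample2/annotation/predicted_genes/sample2.faa")
    log = Path("sample2/logs/binning/DASTool.log")
    if faa.exists():
        print(f"{faa}: {count_fasta_records(faa)} protein sequences")
    if log.exists():
        for line in find_log_problems(log.read_text()):
            print(line)
```

If the protein file is non-empty and the log still reports no SCGs, the input data is probably fine and the problem is more likely in the tool's environment.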

SilasK commented 1 year ago

I suggest you use metabat as final_binner for the test set. Try running it on the cluster using, e.g., the cluster profile.

Then try it on a real dataset and use vamb as the final binner (the two samples of the test data are not enough for vamb).

I generally recommend vamb over DASTool with < 150 samples.
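The binner switch suggested here is a one-line change in the project's config.yaml. As a rough sketch, the helper below rewrites that line in the config text. It assumes `final_binner` is a top-level key written on a single line; the function name `set_final_binner` is hypothetical and not part of atlas. Editing the file by hand works just as well.

```python
import re


def set_final_binner(config_text, binner):
    """Return the config text with the top-level `final_binner` key set to `binner`.

    Any trailing comment on the existing line is dropped. If the key is
    missing, a new line is appended at the end.
    """
    new_line = f"final_binner: {binner}"
    pattern = re.compile(r"^final_binner\s*:.*$", flags=re.MULTILINE)
    if pattern.search(config_text):
        # Replace only the first occurrence of the key.
        return pattern.sub(lambda _match: new_line, config_text, count=1)
    separator = "" if (not config_text or config_text.endswith("\n")) else "\n"
    return f"{config_text}{separator}{new_line}\n"


if __name__ == "__main__":
    example = "threads: 8\nfinal_binner: DASTool\n"
    print(set_final_binner(example, "metabat"))
```

After changing the value, rerunning the same atlas command should pick up the new binner from the config file.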

XC-Zhai commented 1 year ago

Removing ruby from the conda env where DAS_Tool is installed can solve the problem when running atlas on the public server, as mentioned in #13. On a private server, there is no problem.

SilasK commented 1 year ago

Great. So you are using the Ruby installed on your server?
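To see which Ruby a shell session would actually pick up, one can check where the `ruby` executable on PATH lives relative to the active conda environment. This is only an illustrative sketch: it assumes the DAS_Tool environment has been activated so that conda has set `CONDA_PREFIX`, and the helper `is_inside` is a hypothetical name introduced here.

```python
import os
import shutil
from pathlib import PurePath


def is_inside(path, prefix):
    """Return True if `path` is located under the directory `prefix`."""
    try:
        PurePath(os.path.abspath(path)).relative_to(os.path.abspath(prefix))
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    ruby = shutil.which("ruby")                   # first ruby found on PATH
    env_prefix = os.environ.get("CONDA_PREFIX")   # set when a conda env is active
    if ruby is None:
        print("No ruby found on PATH")
    elif env_prefix and is_inside(ruby, env_prefix):
        print(f"ruby comes from the active conda env: {ruby}")
    else:
        print(f"ruby comes from outside the active conda env: {ruby}")
```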