metagenome-atlas / atlas

ATLAS - Three commands to start analyzing your metagenome data
https://metagenome-atlas.github.io/
BSD 3-Clause "New" or "Revised" License

Error in rule run_vamb #693

Closed by AroArz 11 months ago

AroArz commented 11 months ago

Snakemake log


Error in rule run_vamb:
    jobid: 3313
    input: Intermediate/cobinning/All/coverage.tsv, Intermediate/cobinning/All/combined_contigs.fasta.gz
    output: Intermediate/cobinning/All/vamb_output
    log: logs/cobinning/run_vamb/All.log (check log file(s) for error details)
    conda-env: /crex/proj/snic2020-6-233/projects/06_JJ/06_JJ_metagenomics/metagenome-atlas/conda_envs/5cd82fcde6b7796fbe3744a91a0eea5d_
    shell:
        vamb --outdir Intermediate/cobinning/All/vamb_output  -m 2000  --minfasta 200000  -o ':'  --jgi Intermediate/cobinning/All/coverage.tsv  --fasta Intermediate/cobinning/All/combined_contigs.fasta.gz 2> logs/cobinning/run_vamb/All.log
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
--
Error executing rule run_vamb on cluster (jobid: 3313, external: 40659053, jobscript: /crex/proj/snic2020-6-233/projects/06_JJ/06_JJ_metagenomics/metagenome-atlas/.snakemake/tmp.2c914620/snakejob.run_vamb.3313.sh). For error details see the cluster log and the log files of the involved rule(s).
Terminating processes on user request, this might take some time.

logs/cobinning/run_vamb/All.log

Traceback (most recent call last):
  File "/crex/proj/snic2020-6-233/projects/06_JJ/06_JJ_metagenomics/metagenome-atlas/conda_envs/5cd82fcde6b7796fbe3744a91a0eea5d_/bin/vamb", line 11, in <module>
    sys.exit(main())
  File "/crex/proj/snic2020-6-233/projects/06_JJ/06_JJ_metagenomics/metagenome-atlas/conda_envs/5cd82fcde6b7796fbe3744a91a0eea5d_/lib/python3.7/site-packages/vamb/__main__.py", line 528, in main
    logfile=logfile)
  File "/crex/proj/snic2020-6-233/projects/06_JJ/06_JJ_metagenomics/metagenome-atlas/conda_envs/5cd82fcde6b7796fbe3744a91a0eea5d_/lib/python3.7/site-packages/vamb/__main__.py", line 247, in run
    len(tnfs), minalignscore, minid, subprocesses, logfile)
  File "/crex/proj/snic2020-6-233/projects/06_JJ/06_JJ_metagenomics/metagenome-atlas/conda_envs/5cd82fcde6b7796fbe3744a91a0eea5d_/lib/python3.7/site-packages/vamb/__main__.py", line 121, in calc_rpkm
    raise ValueError("Length of TNFs and length of RPKM does not match. Verify the inputs")
ValueError: Length of TNFs and length of RPKM does not match. Verify the inputs

Atlas version 2.18.0

Based on https://github.com/RasmussenLab/vamb/issues/65, this seems to be caused by a mismatch between the number of RPKM rows in Intermediate/cobinning/All/coverage.tsv and the number of contigs in Intermediate/cobinning/All/combined_contigs.fasta.gz.

$: wc -l Intermediate/cobinning/All/coverage.tsv
165077 Intermediate/cobinning/All/coverage.tsv
$: zcat Intermediate/cobinning/All/combined_contigs.fasta.gz | grep ">" | wc -l
242098

They suggested some manual handling of this, but I decided to post here instead. Any input is appreciated.
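For anyone hitting the same error, the two counts above only show that the files disagree, not which contigs are affected. Below is a minimal Python sketch that lists the contigs present in one file but not the other. It rests on assumptions that are not confirmed in this thread: that coverage.tsv is tab-separated, has a single header row, and carries the contig name in its first column. If the file does have a header row, `wc -l` also counts it, so the number of data rows is one less than the line count. The function names are made up for illustration and are not part of ATLAS or VAMB.

```python
import csv
import gzip


def fasta_ids(path):
    """Return contig IDs (text after '>' up to the first whitespace) in file order."""
    opener = gzip.open if str(path).endswith(".gz") else open
    ids = []
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith(">"):
                parts = line[1:].split()
                ids.append(parts[0] if parts else "")
    return ids


def coverage_ids(path):
    """Return the first column of a tab-separated table, skipping one header row.

    Assumption: the first row is a header and column 1 holds the contig name.
    """
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        next(reader, None)  # skip the assumed header row
        return [row[0] for row in reader if row]


def compare_contig_sets(fasta_path, coverage_path):
    """Count the contigs in both files and list those found in only one of them."""
    fasta = fasta_ids(fasta_path)
    cov = coverage_ids(coverage_path)
    fasta_set, cov_set = set(fasta), set(cov)
    return {
        "n_fasta": len(fasta),
        "n_coverage": len(cov),
        "missing_from_coverage": [c for c in fasta if c not in cov_set],
        "missing_from_fasta": [c for c in cov if c not in fasta_set],
    }
```

Calling `compare_contig_sets("Intermediate/cobinning/All/combined_contigs.fasta.gz", "Intermediate/cobinning/All/coverage.tsv")` from the working directory would return both counts plus the two lists of unmatched contig names, which may help show whether the coverage table is stale or truncated.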

AroArz commented 11 months ago

I removed Intermediate/cobinning/All/coverage.tsv and resumed the pipeline to regenerate it, and the problem seems to have resolved itself. The pipeline is still chugging along, but run_vamb executed without errors and the new log file is empty. I suspect that output from earlier crashes (caused by tweaking the cluster settings) was not fully cleaned up in between and somehow interfered with this run. Feel free to close this issue if you consider it resolved, Silas.

SilasK commented 11 months ago

Thank you. Feel free to write back if you have other issues.