ablab / spades

SPAdes Genome Assembler
http://ablab.github.io/spades/

OS return value: 21 #1379

Closed lyisrae1 closed 1 week ago

lyisrae1 commented 1 week ago

Description of bug

Hello,

I am trying to use SPAdes for a metagenomic project. I have forward and reverse reads that I feed into SPAdes after trimming them with Trimmomatic. Sadly, SPAdes keeps failing on the project because the forward and reverse FASTQ files do not contain the same number of reads. Is there anything that can be done to bypass this and still be able to use SPAdes?

P.S. I have plenty of RAM to use, and plenty of storage space for my files (50TB).

spades.log

== Warning == No assembly mode was specified! If you intend to assemble high-coverage multi-cell/isolate data, use '--isolate' option.

Command line: /data/apps/extern/SPAdes/3.15.5/bin/spades.py -t 68 -k 73,87,99,115 --careful -1 /vast/agnanad1/Leone/88_trim_R1.fastq -2 /vast/agnanad1/Leone/88_trim_R2.fastq -o /vast/agnanad1/Leone/88_2_assembly

System information:
  SPAdes version: 3.15.5
  Python version: 3.6.8
  OS: Linux-4.18.0-477.21.1.el8_8.x86_64-x86_64-with-centos-8.8-Green_Obsidian

Output dir: /vast/agnanad1/Leone/88_2_assembly
Mode: read error correction and assembling
Debug mode is turned OFF

Dataset parameters:
  Standard mode
  For multi-cell/isolate data we recommend to use '--isolate' option; for single-cell MDA data use '--sc'; for metagenomic data use '--meta'; for RNA-Seq use '--rna'.
  Reads:
    Library number: 1, library type: paired-end
      orientation: fr
      left reads: ['/vast/agnanad1/Leone/88_trim_R1.fastq']
      right reads: ['/vast/agnanad1/Leone/88_trim_R2.fastq']
      interlaced reads: not specified
      single reads: not specified
      merged reads: not specified
Read error correction parameters:
  Iterations: 1
  PHRED offset will be auto-detected
  Corrected reads will be compressed
Assembly parameters:
  k: [73, 87, 99, 115]
  Repeat resolution is enabled
  Mismatch careful mode is turned ON
  MismatchCorrector will be used
  Coverage cutoff is turned OFF
Other parameters:
  Dir for temp files: /vast/agnanad1/Leone/88_2_assembly/tmp
  Threads: 68
  Memory limit (in Gb): 188

======= SPAdes pipeline started. Log can be found here: /vast/agnanad1/Leone/88_2_assembly/spades.log

/vast/agnanad1/Leone/88_trim_R1.fastq: max reads length: 120
/vast/agnanad1/Leone/88_trim_R2.fastq: max reads length: 120

Reads length: 120

===== Before start started.

===== Read error correction started.

===== Read error correction started.

== Running: /data/apps/extern/SPAdes/3.15.5/bin/spades-hammer /vast/agnanad1/Leone/88_2_assembly/corrected/configs/config.info

0:00:00.000 1M / 12M INFO General (main.cpp : 75) Starting BayesHammer, built from N/A, git revision N/A
0:00:00.004 1M / 12M INFO General (main.cpp : 76) Loading config from /vast/agnanad1/Leone/88_2_assembly/corrected/configs/config.info
0:00:00.005 1M / 12M INFO General (main.cpp : 78) Maximum # of threads to use (adjusted due to OMP capabilities): 48
0:00:00.006 1M / 12M INFO General (memory_limit.cpp : 54) Memory limit set to 188 Gb
0:00:00.006 1M / 12M INFO General (main.cpp : 86) Trying to determine PHRED offset
0:00:00.006 1M / 12M INFO General (main.cpp : 92) Determined value is 33
0:00:00.006 1M / 12M INFO General (hammer_tools.cpp : 38) Hamming graph threshold tau=1, k=21, subkmer positions = [ 0 10 ]
0:00:00.006 1M / 12M INFO General (main.cpp : 113) Size of aux. kmer data 24 bytes
=== ITERATION 0 begins ===
0:00:00.011 1M / 12M INFO General (kmer_index_builder.hpp : 243) Splitting kmer instances into 16 files using 48 threads. This might take a while.
0:00:00.014 1M / 12M INFO General (file_limit.hpp : 42) Open file limit set to 131072
0:00:00.014 1M / 12M INFO General (kmer_splitter.hpp : 93) Memory available for splitting buffers: 1.30555 Gb
0:00:00.014 1M / 12M INFO General (kmer_splitter.hpp : 101) Using cell size of 4194304
0:00:00.291 28G / 28G INFO K-mer Splitting (kmer_data.cpp : 97) Processing /vast/agnanad1/Leone/88_trim_R1.fastq
0:01:07.608 28G / 40G INFO K-mer Splitting (kmer_data.cpp : 107) Processed 14486375 reads
0:01:20.811 28G / 40G INFO K-mer Splitting (kmer_data.cpp : 107) Processed 16935397 reads
0:01:20.815 28G / 40G INFO K-mer Splitting (kmer_data.cpp : 97) Processing /vast/agnanad1/Leone/88_trim_R2.fastq
0:02:25.113 28G / 40G INFO K-mer Splitting (kmer_data.cpp : 107) Processed 31055778 reads
0:02:35.852 28G / 40G INFO K-mer Splitting (kmer_data.cpp : 107) Processed 33063706 reads
0:02:35.860 28G / 40G INFO K-mer Splitting (kmer_data.cpp : 112) Total 33063706 reads processed
0:02:35.868 1M / 40G INFO General (kmer_index_builder.hpp : 249) Starting k-mer counting.
0:02:55.984 1M / 40G INFO General (kmer_index_builder.hpp : 260) K-mer counting done. There are 1885247910 kmers in total.
0:02:55.995 1M / 40G INFO K-mer Index Building (kmer_index_builder.hpp : 395) Building perfect hash indices
0:03:49.704 1377M / 40G INFO K-mer Index Building (kmer_index_builder.hpp : 431) Index built. Total 1885247910 kmers, 1361658720 bytes occupied (5.77816 bits per kmer).
0:03:49.707 1377M / 40G INFO K-mer Counting (kmer_data.cpp : 354) Arranging kmers in hash map order
0:04:49.671 30G / 40G INFO General (main.cpp : 148) Clustering Hamming graph.
0:12:01.353 30G / 40G INFO General (main.cpp : 155) Extracting clusters:
0:12:01.353 30G / 40G INFO General (concurrent_dsu.cpp : 18) Connecting to root
0:12:02.338 30G / 40G INFO General (concurrent_dsu.cpp : 34) Calculating counts
0:15:42.402 64G / 65G INFO General (concurrent_dsu.cpp : 63) Writing down entries
0:22:13.464 30G / 88G INFO General (main.cpp : 167) Clustering done. Total clusters: 1122289993
0:22:13.501 16G / 88G INFO K-mer Counting (kmer_data.cpp : 371) Collecting K-mer information, this takes a while.
0:22:24.754 58G / 88G INFO K-mer Counting (kmer_data.cpp : 377) Processing /vast/agnanad1/Leone/88_trim_R1.fastq
0:23:03.428 58G / 88G INFO K-mer Counting (kmer_data.cpp : 377) Processing /vast/agnanad1/Leone/88_trim_R2.fastq
0:23:39.929 58G / 88G INFO K-mer Counting (kmer_data.cpp : 384) Collection done, postprocessing.
0:23:46.304 58G / 88G INFO K-mer Counting (kmer_data.cpp : 398) There are 1885247910 kmers in total. Among them 834879414 (44.2849%) are singletons.
0:23:46.304 58G / 88G INFO General (main.cpp : 173) Subclustering Hamming graph
0:35:15.814 58G / 88G INFO Hamming Subclustering (kmer_cluster.cpp : 650) Subclustering done. Total 234841 non-read kmers were generated.
0:35:15.815 58G / 88G INFO Hamming Subclustering (kmer_cluster.cpp : 651) Subclustering statistics:
0:35:15.815 58G / 88G INFO Hamming Subclustering (kmer_cluster.cpp : 652) Total singleton hamming clusters: 1012414150. Among them 893994840 (88.3033%) are good
0:35:15.815 58G / 88G INFO Hamming Subclustering (kmer_cluster.cpp : 653) Total singleton subclusters: 4405881. Among them 4399268 (99.8499%) are good
0:35:15.815 58G / 88G INFO Hamming Subclustering (kmer_cluster.cpp : 654) Total non-singleton subcluster centers: 128812179. Among them 119610759 (92.8567%) are good
0:35:15.815 58G / 88G INFO Hamming Subclustering (kmer_cluster.cpp : 655) Average size of non-trivial subcluster: 6.78496 kmers
0:35:15.815 58G / 88G INFO Hamming Subclustering (kmer_cluster.cpp : 656) Average number of sub-clusters per non-singleton cluster: 1.21244
0:35:15.815 58G / 88G INFO Hamming Subclustering (kmer_cluster.cpp : 657) Total solid k-mers: 1018004867
0:35:15.815 58G / 88G INFO Hamming Subclustering (kmer_cluster.cpp : 658) Substitution probabilities: 4,4
0:35:16.462 58G / 88G INFO General (main.cpp : 178) Finished clustering.
0:35:16.462 58G / 88G INFO General (main.cpp : 197) Starting solid k-mers expansion in 48 threads.
0:37:05.225 58G / 88G INFO General (main.cpp : 218) Solid k-mers iteration 0 produced 246058124 new k-mers.
0:38:03.725 58G / 88G INFO General (main.cpp : 218) Solid k-mers iteration 1 produced 37733796 new k-mers.
0:38:56.272 58G / 88G INFO General (main.cpp : 218) Solid k-mers iteration 2 produced 3415820 new k-mers.
0:39:48.468 58G / 88G INFO General (main.cpp : 218) Solid k-mers iteration 3 produced 298686 new k-mers.
0:40:40.084 58G / 88G INFO General (main.cpp : 218) Solid k-mers iteration 4 produced 24741 new k-mers.
0:41:30.864 58G / 88G INFO General (main.cpp : 218) Solid k-mers iteration 5 produced 2643 new k-mers.
0:42:19.555 58G / 88G INFO General (main.cpp : 218) Solid k-mers iteration 6 produced 164 new k-mers.
0:43:09.246 58G / 88G INFO General (main.cpp : 218) Solid k-mers iteration 7 produced 15 new k-mers.
0:44:00.260 58G / 88G INFO General (main.cpp : 218) Solid k-mers iteration 8 produced 0 new k-mers.
0:44:00.260 58G / 88G INFO General (main.cpp : 222) Solid k-mers finalized
0:44:00.261 58G / 88G INFO General (hammer_tools.cpp : 222) Starting read correction in 48 threads.
0:44:00.261 58G / 88G INFO General (hammer_tools.cpp : 235) Correcting pair of reads: /vast/agnanad1/Leone/88_trim_R1.fastq and /vast/agnanad1/Leone/88_trim_R2.fastq
0:44:09.006 62G / 88G INFO General (hammer_tools.cpp : 170) Prepared batch 0 of 4800000 reads.
0:44:22.620 63G / 88G INFO General (hammer_tools.cpp : 177) Processed batch 0
0:44:27.665 63G / 88G INFO General (hammer_tools.cpp : 187) Written batch 0
0:44:36.060 63G / 88G INFO General (hammer_tools.cpp : 170) Prepared batch 1 of 4800000 reads.
0:44:49.772 63G / 88G INFO General (hammer_tools.cpp : 177) Processed batch 1
0:44:54.630 63G / 88G INFO General (hammer_tools.cpp : 187) Written batch 1
0:45:02.717 63G / 88G INFO General (hammer_tools.cpp : 170) Prepared batch 2 of 4800000 reads.
0:45:17.317 63G / 88G INFO General (hammer_tools.cpp : 177) Processed batch 2
0:45:22.559 63G / 88G INFO General (hammer_tools.cpp : 187) Written batch 2
0:45:25.378 63G / 88G INFO General (hammer_tools.cpp : 170) Prepared batch 3 of 1728309 reads.
0:45:30.302 63G / 88G INFO General (hammer_tools.cpp : 177) Processed batch 3
0:45:32.098 63G / 88G INFO General (hammer_tools.cpp : 187) Written batch 3
0:45:32.102 63G / 88G ERROR General (hammer_tools.cpp : 191) Pair of read files /vast/agnanad1/Leone/88_trim_R1.fastq and /vast/agnanad1/Leone/88_trim_R2.fastq contain unequal amount of reads

== Error == system call for: "['/data/apps/extern/SPAdes/3.15.5/bin/spades-hammer', '/vast/agnanad1/Leone/88_2_assembly/corrected/configs/config.info']" finished abnormally, OS return value: 21 None

In case you have troubles running SPAdes, you can write to spades.support@cab.spbu.ru or report an issue on our GitHub repository github.com/ablab/spades Please provide us with params.txt and spades.log files from the output directory.

SPAdes log can be found here: /vast/agnanad1/Leone/88_2_assembly/spades.log

Thank you for using SPAdes!
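
The size of the mismatch can be read off the log above without touching the FASTQ files. The following is a minimal sketch, assuming the "Processed N reads" counter in the K-mer Splitting stage is cumulative across input files (which the "Total 33063706 reads processed" line suggests). The function name `reads_per_file` is made up for illustration and is not part of SPAdes.

```python
import re

# Excerpt of the K-mer Splitting lines copied from the spades.log above.
LOG_EXCERPT = """\
0:00:00.291 28G / 28G INFO K-mer Splitting (kmer_data.cpp : 97) Processing /vast/agnanad1/Leone/88_trim_R1.fastq
0:01:07.608 28G / 40G INFO K-mer Splitting (kmer_data.cpp : 107) Processed 14486375 reads
0:01:20.811 28G / 40G INFO K-mer Splitting (kmer_data.cpp : 107) Processed 16935397 reads
0:01:20.815 28G / 40G INFO K-mer Splitting (kmer_data.cpp : 97) Processing /vast/agnanad1/Leone/88_trim_R2.fastq
0:02:25.113 28G / 40G INFO K-mer Splitting (kmer_data.cpp : 107) Processed 31055778 reads
0:02:35.852 28G / 40G INFO K-mer Splitting (kmer_data.cpp : 107) Processed 33063706 reads
"""


def reads_per_file(log_text):
    """Return {path: read count}, assuming the 'Processed N' counter is cumulative."""
    counts = {}
    current = None      # file currently being processed
    start_total = 0     # cumulative count when the current file started
    last_total = 0      # most recent cumulative count seen
    for line in log_text.splitlines():
        started = re.search(r"K-mer Splitting .*Processing (\S+)", line)
        if started:
            current = started.group(1)
            start_total = last_total
            counts[current] = 0
            continue
        progressed = re.search(r"K-mer Splitting .*Processed (\d+) reads", line)
        if progressed and current is not None:
            last_total = int(progressed.group(1))
            counts[current] = last_total - start_total
    return counts


for path, n_reads in reads_per_file(LOG_EXCERPT).items():
    print(path, n_reads)
```

Under that assumption, R1 holds 16935397 reads and R2 holds 16128309, so R1 has 807088 reads without a mate. The R2 figure also equals the sum of the four correction batches (3 x 4800000 + 1728309), which is consistent with BayesHammer stopping once the shorter file ran out.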

params.txt

Included in logfile data

SPAdes version

SPAdes version: 3.15.5

Operating System

OS: Linux-4.18.0-477.21.1.el8_8.x86_64-x86_64-with-centos-8.8-Green_Obsidian

Python Version

Python version: 3.6.8

Method of SPAdes installation

Already installed on my university's cluster

No errors reported in spades.log

asl commented 1 week ago

The log clearly reads:

0:45:32.102 63G / 88G ERROR General (hammer_tools.cpp : 191) Pair of read files /vast/agnanad1/Leone/88_trim_R1.fastq and /vast/agnanad1/Leone/88_trim_R2.fastq contain unequal amount of reads

So, your input reads are corrupted. If you are using any read pre-processing, ensure it is paired-end aware.
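
The cleanest fix is usually to redo the trimming in the tool's paired-end mode (Trimmomatic has a PE mode that writes separate paired and unpaired outputs). If that is not possible, the existing files can be re-synchronised. Below is a rough sketch, not a SPAdes or Trimmomatic feature. It assumes uncompressed 4-line FASTQ records, read names that match between mates once a trailing `/1` or `/2` and any header comment are stripped, unique read names, and mates appearing in the same relative order in both files. The function names and the file names in the usage comment are hypothetical.

```python
def read_id(header):
    """Normalise a FASTQ header: drop '@', any comment, and a trailing /1 or /2."""
    name = header[1:].split()[0]
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def fastq_records(path):
    """Yield (read id, full 4-line record text) for each record in a FASTQ file."""
    with open(path) as handle:
        while True:
            lines = [handle.readline() for _ in range(4)]
            if not lines[0].strip():
                return  # end of file (or trailing blank line)
            if not lines[3]:
                raise ValueError(f"truncated FASTQ record in {path}")
            yield read_id(lines[0]), "".join(lines)


def repair_pairs(r1_path, r2_path, out1_path, out2_path):
    """Write only reads present in both inputs; return (kept in R1, kept in R2)."""
    # Pass 1: find the read names shared by both files.
    ids_r1 = {rid for rid, _ in fastq_records(r1_path)}
    common = {rid for rid, _ in fastq_records(r2_path) if rid in ids_r1}

    # Pass 2: copy shared reads, preserving each file's original order.
    kept = []
    for src, dst in ((r1_path, out1_path), (r2_path, out2_path)):
        n_written = 0
        with open(dst, "w") as out:
            for rid, record in fastq_records(src):
                if rid in common:
                    out.write(record)
                    n_written += 1
        kept.append(n_written)
    return tuple(kept)


# Example call (output file names are placeholders):
# repair_pairs("88_trim_R1.fastq", "88_trim_R2.fastq",
#              "88_paired_R1.fastq", "88_paired_R2.fastq")
```

The two returned counts should be equal; if they differ, the unique-name assumption does not hold and the output should not be trusted. Holding all read names in memory is affordable here given the roughly 17 million reads per file, but a dedicated re-pairing tool is likely to be faster on larger data sets. The orphaned reads dropped by this sketch could in principle be collected separately and supplied to SPAdes as single reads with `-s`.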