Closed marinachen closed 2 months ago
My short reads are merged into a single fastq after QC pipeline, consisting of both paired and orphan reads.
Don't do that. You need to provide a proper paired-end dataset: either interleaved or in two separate files.
Sorry, the current version of metaSPAdes can work either with a single library (paired-end only) or in hybrid paired-end + (TSLR or PacBio or Nanopore) mode.
This is expected. You need to have a paired-end library, not a single-end one (-s).
See https://ablab.github.io/spades/input.html#paired-read-libraries for more information
Hi, thank you for the quick reply! Do you mean I have to use either --pe1-1 R1.fastq --pe1-2 R2.fastq or --interleaved? With regard to the former, would both files have to contain exactly the same number of reads and be exactly paired? I ask because some of my reads lost their mates to quality or contamination filtering during QC. Thank you again!
Yes, you need to have a proper paired-end dataset (left reads correspond to right ones). SPAdes has no idea how to figure out which reads lost their mates. In general, you need to use a paired-end-aware QC procedure.
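For readers who want to verify the "left reads correspond to right ones" requirement before launching an assembly, here is a minimal Python sketch. It is not part of SPAdes. It assumes plain 4-line FASTQ records (no wrapped sequences) and that mates share the same read name apart from an optional /1 or /2 suffix; the function names are made up for illustration.

```python
import gzip
from itertools import zip_longest


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path, "rt")


def base_name(header):
    """Read name from a FASTQ header line, without a trailing /1 or /2."""
    fields = header[1:].split()
    name = fields[0] if fields else ""
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def read_names(handle):
    """Yield the base name of every record (assumes 4 lines per record)."""
    for i, line in enumerate(handle):
        if i % 4 == 0:
            yield base_name(line)


def check_pairing(r1_path, r2_path):
    """Return (ok, n_pairs_checked, message) for two FASTQ files."""
    n = 0
    with open_fastq(r1_path) as h1, open_fastq(r2_path) as h2:
        for left, right in zip_longest(read_names(h1), read_names(h2)):
            if left is None or right is None:
                return False, n, "files contain different numbers of reads"
            if left != right:
                return False, n, f"record {n + 1}: {left!r} does not match {right!r}"
            n += 1
    return True, n, "all reads are paired"
```

Calling check_pairing("R1.fastq", "R2.fastq") (placeholder file names) stops at the first mismatch, so it is cheap to run on large files that are already correctly paired.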
Thank you very much! Would you recommend just filtering out reads to retain only paired ones in this case?
Up to you, if you know how to do this reliably.
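One possible way to do this filtering is sketched below in Python: it reads a merged FASTQ, keeps only reads whose mate is also present, and writes them to two files in matching order. This is an illustration under stated assumptions, not a SPAdes feature or a recommendation from the developers. It assumes 4-line FASTQ records, and that the mate number can be read from the header either as a /1 or /2 suffix or as a second field starting with "1:" or "2:" (the common Illumina style). Reads waiting for a mate are held in memory, so very large files may need a different approach. All function and file names are hypothetical.

```python
def parse_header(header):
    """Return (base_name, mate) where mate is 1, 2, or None if undetermined."""
    fields = header[1:].split()
    name = fields[0] if fields else ""
    if name.endswith("/1"):
        return name[:-2], 1
    if name.endswith("/2"):
        return name[:-2], 2
    if len(fields) > 1 and fields[1][:2] in ("1:", "2:"):
        return name, int(fields[1][0])
    return name, None


def split_pairs(merged_path, r1_path, r2_path):
    """Write properly paired reads to r1_path/r2_path; drop orphans.

    Returns (n_pairs, n_dropped_reads).
    """
    pending = {}  # base name -> (mate, record) waiting for its partner
    n_records = 0
    n_pairs = 0
    with open(merged_path) as src, \
            open(r1_path, "w") as out1, open(r2_path, "w") as out2:
        while True:
            header = src.readline()
            if not header:
                break
            record = [header] + [src.readline() for _ in range(3)]
            # Normalize line endings so every output line ends with "\n".
            record = [line.rstrip("\r\n") + "\n" for line in record]
            n_records += 1
            name, mate = parse_header(record[0])
            if mate is None:
                continue  # cannot tell which mate this is: treat as orphan
            waiting = pending.get(name)
            if waiting is not None and waiting[0] != mate:
                del pending[name]
                left, right = (record, waiting[1]) if mate == 1 else (waiting[1], record)
                out1.writelines(left)
                out2.writelines(right)
                n_pairs += 1
            else:
                pending[name] = (mate, record)
    return n_pairs, n_records - 2 * n_pairs
```

The two output files could then be given to SPAdes as a paired-end library, for example with -1 R1.fastq -2 R2.fastq. Whether header-based matching is reliable depends on how the QC pipeline names reads, so it is worth spot-checking the result.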
Okay thank you so much for your help!
Description of bug
Hi, I was running a hybrid assembly with Illumina short reads + PacBio CCS reads. My short reads are merged into a single fastq after QC pipeline, consisting of both paired and orphan reads, so I specified -s because that was the only way it could run. It ran fine until it hit the error below, when SPAdes started the first assembly iteration (K21).
The command I was running was:
/n/home13/marinachen/.conda/envs/spades/bin/spades.py --meta --pacbio /n/holystore01/LABS/huttenhower_lab/Users/mchen/data/PB_MGX/use/Soil_pool.hifi_reads.fastq.gz -s /n/holystore01/LABS/huttenhower_lab/Users/mchen/data/Illumina_MGX_forPB/Clean_data/Soil_Pool_S52_L001.fastq -o /n/holystore01/LABS/huttenhower_lab/Users/mchen/outputs/PB_MGX/hybrid_assembly/Soil_hybrid
And the error message was:
It looked like there was an issue with recognizing the input files for hybrid paired-end + PacBio mode? Thank you for any help!
spades.log
params.txt
SPAdes version
4.0.0
Operating System
Cannon HPC (linux)
Python Version
3.10.9
Method of SPAdes installation
conda
No errors reported in spades.log