Closed DLuong79 closed 6 months ago
Have you checked that the number of lines in your read files is the same?
zcat read1.fastq.gz | wc -l
zcat read2.fastq.gz | wc -l
If you don't mind letting me know which dataset it is, I can try to download it and see whether I can identify or reproduce the issue.
I downloaded the fastqs from this page: https://www.ebi.ac.uk/biostudies/arrayexpress/studies/E-MTAB-9543/sdrf
I only downloaded files from Patient 386C and Patient 432C (sort by "Patient").
Is there a specific sample from those patients that is giving you the issue, or are they all failing?
All of them except for ERR9924270 (transverse colon) fail to reach the end of the pipeline.
I will download one and try it. In the meantime, have you verified that your fastq files are complete?
zcat r1.fastq.gz | wc -l
zcat r2.fastq.gz | wc -l
Both files should have the same number of lines.
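If it helps, the two commands can be wrapped into a small helper that compares the counts for you. This is only a sketch: check_pair is a made-up name (not part of scsnv), and the multiple-of-4 test assumes standard FASTQ with four lines per record and no wrapped sequence lines.

```shell
#!/bin/sh
# check_pair R1.fastq.gz R2.fastq.gz
# Hypothetical helper: compares the line counts of a read pair.
# The body runs in a subshell so its variables do not leak.
check_pair() (
    # gzip -cd does the same job as zcat for .gz files
    n1=$(gzip -cd "$1" | wc -l)
    n2=$(gzip -cd "$2" | wc -l)
    # Normalize the numbers (some wc implementations pad with spaces)
    n1=$((n1))
    n2=$((n2))
    if [ "$n1" -ne "$n2" ]; then
        echo "MISMATCH: $1 has $n1 lines, $2 has $n2 lines"
        exit 1
    fi
    # Assumption: 4 lines per FASTQ record
    if [ $((n1 % 4)) -ne 0 ]; then
        echo "SUSPECT: $n1 lines is not a multiple of 4"
        exit 1
    fi
    echo "OK: $((n1 / 4)) read pairs"
)
```

After sourcing the function, run it as: check_pair r1.fastq.gz r2.fastq.gz. A non-zero exit status means the pair should be downloaded again before running the map step.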
You probably have incomplete or corrupted fastq files somewhere. I successfully downloaded and processed this sample:
34G May 9 15:42 Human_colon_16S8117828_S1_L001_R1_001.fastq.gz
82G May 9 19:49 Human_colon_16S8117828_S1_L001_R2_001.fastq.gz
[18:35:09] Done writing unmapped reads
[18:35:09] Done writing 1606475407 reads and 9802001 marked as discarded
[18:35:09] Deleting the temporary bam files
[18:35:13] Done
scsnv count -k ~/scsnv/data/737K-august-2016.txt -l V2_5P -o colon_test/barcode colon_test/run1
scsnv map -l V2_5P -i ~/ref/scsnv/scsnv -g ~/ref/genome.fa -b ./colon_test/barcode -t 24 --bam-write 4 -q 4 -c ~/ref/gene_groups.txt -o colon_test/ colon_test/run1
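Since the same sample runs through cleanly here, it may be worth testing the compressed files themselves before counting lines. A rough sketch using gzip -t, which decompresses each file and checks its integrity without writing anything; check_gzip is a hypothetical helper name, not an scsnv command.

```shell
#!/bin/sh
# check_gzip FILE.gz [FILE.gz ...]
# Hypothetical helper: reports which .gz files fail gzip's integrity test.
# The body runs in a subshell so "status" does not leak into the caller.
check_gzip() (
    status=0
    for f in "$@"; do
        # gzip -t exits non-zero on truncated or corrupted archives
        if gzip -t "$f" 2>/dev/null; then
            echo "ok       $f"
        else
            echo "CORRUPT  $f"
            status=1
        fi
    done
    exit "$status"
)
```

For example: check_gzip colon_test/run1/*.fastq.gz (assuming that is where the fastq files live). Any file reported as CORRUPT was most likely cut off during download.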
When I try to run the map step of my pipeline, I keep getting errors that prevent the merged.bam file from being created. The error seems to be different for each run, e.g.,
Data length wrong dl = 606 compared to 605
Error! Inconsistent number of reads between the read1 and read 2 files
error writing sam
I'm running this on a public dataset from the European Nucleotide Archive.