alexdobin / STAR

RNA-seq aligner

Corrupt Aligned.out.bam (2.7.6a) #2147

Open btsherid opened 4 months ago

btsherid commented 4 months ago

Hi,

We are running STAR (version 2.7.6a), and it sometimes writes long runs of zero (0x00) bytes into Aligned.out.bam. The resulting BAM file is corrupt, which causes downstream processes to crash. I saw another GitHub issue that mentioned STAR writing zero bytes, so I am starting my troubleshooting here.

This is the STAR command that generated the corrupt bam file:

STAR --runThreadN 8 \
     --genomeDir STAR_2.7.6a \
     --readFilesIn MMTVPYVTUNTREAT7dDR208.merged.R1.fastq.gz MMTVPYVTUNTREAT7dDR208.merged.R2.fastq.gz \
     --readFilesCommand zcat \
     --limitIObufferSize 150000000 \
     --limitOutSJcollapsed 1000000 \
     --outSAMtype BAM Unsorted \
     --outSAMunmapped Within \
     --quantMode TranscriptomeSAM

Here is a portion of a corrupt Aligned.out.bam showing the zero bytes:

hexdump Aligned.out.bam | grep -C 2 " 0000 0000 0000 "

87fbffe0 1ac7 118d 8617 050a 1942 1dfb 3bcf 56b8
87fbfff0 f33b 73f6 d8d3 e400 1db2 6f1a 639e 06be
87fc0000 0000 0000 0000 0000 0000 0000 0000 0000
*
87fda000 fb24 67e8 5db6 4b13 1b45 85a7 fd32 6c6c
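Reading the dump above, the zeros appear to start at offset 0x87fc0000 and end just before 0x87fda000, which is 0x1a000 (106,496) bytes. The grep only shows an excerpt, so there may be other regions. A minimal sketch for listing every long zero run with its offset and length is given below. `find_zero_runs`, `min_len` and `chunk_size` are hypothetical names chosen for this example, and the 4096-byte threshold is an assumption meant to skip the short zero runs that occur naturally in compressed data.

```python
import re
from typing import List, Tuple

# One or more consecutive 0x00 bytes.
ZERO_RUN = re.compile(rb"\x00+")


def find_zero_runs(path: str, min_len: int = 4096,
                   chunk_size: int = 1 << 20) -> List[Tuple[int, int]]:
    """Return (offset, length) for each run of zero bytes >= min_len.

    Hypothetical helper: the file is read in chunks, and a run that
    crosses a chunk boundary is merged into a single entry.
    """
    runs: List[Tuple[int, int]] = []
    run_start = None  # file offset where the pending run began
    run_len = 0       # length of the pending run so far
    offset = 0        # file offset of the current chunk
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            for m in ZERO_RUN.finditer(chunk):
                start = offset + m.start()
                length = m.end() - m.start()
                if run_start is not None and run_start + run_len == start:
                    # Continues the run that ended the previous chunk.
                    run_len += length
                else:
                    # A new run begins: record the pending one if long enough.
                    if run_start is not None and run_len >= min_len:
                        runs.append((run_start, run_len))
                    run_start, run_len = start, length
            offset += len(chunk)
    # Record the final pending run, if any.
    if run_start is not None and run_len >= min_len:
        runs.append((run_start, run_len))
    return runs
```

Printing each result as `hex(offset)` and `length` makes it easy to see whether the damaged regions start and end on round boundaries such as multiples of 4096 bytes, which may help narrow down whether the problem lies in STAR or in the storage layer.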

Any help in figuring out why these files end up corrupt would be appreciated.

Thank you,
Brendan Sheridan