alexdobin / STAR

RNA-seq aligner
MIT License
Corrupt Aligned.out.bam (2.7.6a) #2147

Open btsherid opened 4 months ago

btsherid commented 4 months ago


We are running STAR (version 2.7.6a), and it sometimes writes runs of zero bytes (0x00) into Aligned.out.bam. The resulting BAM file is corrupt, which causes downstream processes to crash. I saw another GitHub issue that mentioned STAR writing 0-byte strings, so I am starting my troubleshooting here.

This is the STAR command that generated the corrupt bam file:

STAR --runThreadN 8 --genomeDir STAR_2.7.6a --readFilesIn MMTVPYVTUNTREAT7dDR208.merged.R1.fastq.gz MMTVPYVTUNTREAT7dDR208.merged.R2.fastq.gz --readFilesCommand zcat --limitIObufferSize 150000000 --limitOutSJcollapsed 1000000 --outSAMtype BAM Unsorted --outSAMunmapped Within --quantMode TranscriptomeSAM

Here is a portion of a corrupt Aligned.out.bam containing the zero bytes:

hexdump Aligned.out.bam | grep -C 2 " 0000 0000 0000 "

87fbffe0 1ac7 118d 8617 050a 1942 1dfb 3bcf 56b8
87fbfff0 f33b 73f6 d8d3 e400 1db2 6f1a 639e 06be
87fc0000 0000 0000 0000 0000 0000 0000 0000 0000
87fda000 fb24 67e8 5db6 4b13 1b45 85a7 fd32 6c6c
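
For screening other output files before they reach downstream tools, the same check can be scripted. The following is a minimal Python sketch, not part of STAR: the function name `find_zero_runs` and the 4096-byte threshold are assumptions chosen for illustration. It relies on the observation that BAM data is compressed, so a long run of zero bytes is very unlikely to occur in a healthy file.

```python
import re
import sys

# Matches one maximal run of zero bytes inside a chunk.
_ZERO_RUN = re.compile(rb"\x00+")


def find_zero_runs(path, min_run=4096, chunk_size=1 << 20):
    """Return a list of (offset, length) for zero-byte runs of at least min_run bytes.

    min_run is an assumed threshold, not a value taken from STAR or the BAM spec.
    The file is read in chunks so large BAM files do not need to fit in memory.
    """
    runs = []
    cur_start, cur_len = None, 0  # most recent run, not yet recorded
    offset = 0  # file offset of the first byte of the current chunk
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            for m in _ZERO_RUN.finditer(chunk):
                start = offset + m.start()
                length = m.end() - m.start()
                if cur_start is not None and cur_start + cur_len == start:
                    # Run continues across a chunk boundary: extend it.
                    cur_len += length
                else:
                    # Previous run is finished: record it if long enough.
                    if cur_start is not None and cur_len >= min_run:
                        runs.append((cur_start, cur_len))
                    cur_start, cur_len = start, length
            offset += len(chunk)
    if cur_start is not None and cur_len >= min_run:
        runs.append((cur_start, cur_len))
    return runs


if __name__ == "__main__" and len(sys.argv) > 1:
    for start, length in find_zero_runs(sys.argv[1]):
        print(f"zero run at 0x{start:08x}, {length} bytes")
```

Saved under a name of your choosing and run with the path to Aligned.out.bam as its only argument, this prints one line per suspicious region. An empty result does not prove the file is valid; it only rules out this particular symptom.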

Any help in figuring out why these files end up corrupt would be appreciated.

Thank you,
Brendan Sheridan