alexdobin / STAR

RNA-seq aligner

Problem with generating BAM file #1292

Open Gotumbtai opened 3 years ago

Gotumbtai commented 3 years ago

Hi Alex,

I ran into a problem while generating BAM files with STAR 2.7.9a. The error message was as follows:

EXITING because of FATAL ERROR: number of bytes expected from the BAM bin does not agree with the actual size on disk: Expected bin size=9792429096 ; size on disk=0 ; bin number=46

The commands I used are as follows:

for ID in $(cat list.txt)
do
  mkdir star_${ID}
  star --runThreadN 4 \
    --limitBAMsortRAM 31000000000 \
    --genomeDir genome \
    --outFileNamePrefix ${ID}_ \
    --quantMode TranscriptomeSAM \
    --outSAMtype BAM SortedByCoordinate \
    --readFilesIn ${ID}_trim_pair_1.fastq.gz ${ID}_trim_pair_2.fastq.gz \
    --readFilesCommand gunzip -c
done

There was actually enough free disk space for the run (>400 GB), and I also ran ulimit -n 10000. Earlier, I successfully completed the mapping with small test fastq.gz files. Could you please help me resolve the error? Thank you so much!

Gotumbtai commented 3 years ago

I increased the number of genome bins for BAM sorting with the option --outBAMsortingBinsN 60, and STAR completed mapping for the first set of fastq.gz files. However, the same error occurred when running the second set of files. So I increased the bin number further, but then a different error occurred:

BAMoutput.cpp:27:BAMoutput: exiting because of *OUTPUT FILE* error: could not create output file XXXXXX_STARtmp//BAMsort/3/52
SOLUTION: check that the path exists and you have write permission for this file. Also check ulimit -n and increase it to allow more open files.

Thank you again for your help.

alexdobin commented 3 years ago

Hi @Gotumbtai

I think you are running out of space; 400GB may not be enough. The bin is ~10GB, and you have at least 46 bins, so the total size is >400GB.

Cheers Alex
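
The estimate above can be reproduced with a few lines of shell. This is only a rough sketch: the function name estimate_bam_sort_bytes is made up for illustration, and it assumes every bin is about as large as the one reported in the error message, which real runs will not match exactly.

```shell
# Hypothetical helper: multiply the reported bin size (bytes) by the bin count.
# Assumption: all bins are roughly the size of the bin named in the error.
estimate_bam_sort_bytes() {
    echo $(( $1 * $2 ))
}

# Values taken from the error message: bin size 9792429096 bytes, at least 46 bins.
bytes=$(estimate_bam_sort_bytes 9792429096 46)
echo "$bytes bytes, about $(( bytes / 1000000000 )) GB"

# Compare against the free space on the filesystem holding the output directory.
df -h .
```

With the numbers from the first error message this comes to roughly 450 GB, which is why 400 GB of free space could plausibly be too little.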

Gotumbtai commented 3 years ago

Hi Alex,

Thank you for your response. I tried running the same thing with much more space available (>5TB), but the same error occurred:

EXITING because of FATAL ERROR: number of bytes expected from the BAM bin does not agree with the actual size on disk: Expected bin size=12270787284 ; size on disk=0 ; bin number=57

Do you have any other ideas for handling this issue?

Also, I am wondering if the Aligned.toTranscriptome.out.bam file is the same even if I chose --outSAMtype BAM unsorted instead of --outSAMtype BAM SortedByCoordinate.

Thank you again for your help.

alexdobin commented 3 years ago

Hi @Gotumbtai

the next thing to try is to raise the open-file limit with ulimit -n 5000 or even ulimit -n 10000.

Aligned.toTranscriptome.out.bam does not depend on the normal BAM sorting, so you can output unsorted BAM (and sort it later with samtools if needed), or even switch off normal BAM output.

Cheers Alex
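
For anyone checking the open-file limit, a minimal sketch follows. It assumes that coordinate sorting keeps roughly one temporary file open per thread per bin, which is only a guess based on the BAMsort/3/52 path in the error above; estimate_open_files is a hypothetical helper, not part of STAR. Note that ulimit only affects the current shell and the processes started from it, so it has to be run in the same shell or script that launches STAR.

```shell
# Hypothetical helper: rough count of sorting temp files held open at once.
# Assumption (not confirmed in this thread): one file per thread per bin.
estimate_open_files() {
    echo $(( $1 * $2 ))
}

# Example values from this thread: --runThreadN 4, --outBAMsortingBinsN 60.
needed=$(estimate_open_files 4 60)

# Show the current soft and hard limits next to the estimate.
echo "soft=$(ulimit -Sn) hard=$(ulimit -Hn) estimated_needed=$needed"

# Try to raise the limit for this shell; this fails if the hard limit is lower.
ulimit -n 10000 2>/dev/null || echo "could not raise the limit above the hard limit"

# Confirm the value that STAR would inherit if started from this shell.
ulimit -n
```

Under the same assumption, a run with many threads (for example 150) and a few dozen bins would need several thousand open files, so the default limit on many systems would be too low.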

Gotumbtai commented 3 years ago

Thank you for your advice. I tried ulimit -n 10000, but the issue was not resolved. I will use the unsorted option and samtools for now. I would appreciate any further ideas for solving the issue when you have time.

Best regards, Gotumbtai

alexdobin commented 3 years ago

Hi @Gotumbtai

please send me the Log.out file of the failed run.

Cheers Alex

Gotumbtai commented 3 years ago

Hi Alex, I just sent the file to you. Thank you so much.

Gotumbtai commented 3 years ago

Hi Alex, I am attaching the file here too. Thank you for your help. Log.out.txt

alexdobin commented 3 years ago

Hi @Gotumbtai

nothing suspicious in the Log.out file. It seems like the FASTQ files are already pre-sorted (generated from a sorted BAM file?), which requires more RAM for sorting, but it should still work fine. What's the output of ulimit -n? It would probably be easier to output unsorted BAM and then sort it with samtools sort.

Cheers Alex
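
The workaround suggested above can be sketched as follows. This assumes STAR was run with --outSAMtype BAM Unsorted, which writes <prefix>Aligned.out.bam; the helper name sort_star_bam, the DRY_RUN switch, the sampleA_ prefix, and the choice of output file name are all illustrative and not from the thread.

```shell
# Hypothetical helper: coordinate-sort STAR's unsorted BAM with samtools.
# With DRY_RUN=1 it only prints the command instead of running it.
sort_star_bam() {
    prefix=$1
    threads=$2
    in_bam="${prefix}Aligned.out.bam"                 # written by --outSAMtype BAM Unsorted
    out_bam="${prefix}Aligned.sortedByCoord.out.bam"  # output name chosen for this sketch
    set -- samtools sort -@ "$threads" -o "$out_bam" "$in_bam"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Print the command for a sample prefix "sampleA_" using 4 extra sorting threads.
DRY_RUN=1 sort_star_bam sampleA_ 4
```

Sorting with samtools avoids STAR's per-bin temporary files entirely; samtools sort accepts -m to cap memory per thread if RAM is tight.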

Gotumbtai commented 3 years ago

Hi Alex, I think the fastq was not sorted before running STAR. I only processed it using Trimmomatic for adaptor and low quality trimming. I tried ulimit -n 10000 but the same issue occurred.

I will do it with samtools as you suggested at this time.

Thanks a lot.

njbowen commented 2 years ago

did the unsorted BAM option work? i'm having the same issue with plenty of space, and i'm wondering if it's a problem with my external drive on mac being formatted as APFS. thanks. btw, it's saying size on disk=0:

EXITING because of FATAL ERROR: number of bytes expected from the BAM bin does not agree with the actual size on disk: Expected bin size=74576650820 ; size on disk=0 ; bin number=25

njbowen commented 2 years ago

i rolled back to 2.7.9a from 2.7.10a_alpha_XX and got the same issue. here are my Log.out files:
SRR8618305_RNAseq_of_VCAP_PROSTATE_Log copy.out.txt
SRR8618305_RNAseq_of_VCAP_PROSTATE_Log.progress copy.out.txt
i have 4.7 TB of disk space.

alexdobin commented 2 years ago

It could be a problem with writing on the drive.
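
One quick way to test whether a drive accepts writes is to create a small probe file in the output directory and confirm its size on disk. This is a generic sketch, not a STAR feature: check_writable is a hypothetical helper, and the directory passed to it should be replaced with the one STAR writes to.

```shell
# Hypothetical helper: write a 1 MiB probe file and verify its size on disk.
check_writable() {
    dir=$1
    probe="$dir/.write_probe_$$"
    # Write 1 MiB of zeros; report failure if the write itself errors out.
    dd if=/dev/zero of="$probe" bs=1048576 count=1 2>/dev/null || {
        echo "write failed"
        return 1
    }
    size=$(wc -c < "$probe" | tr -d ' ')
    rm -f "$probe"
    if [ "$size" -eq 1048576 ]; then
        echo "ok"
    else
        echo "size mismatch: $size"
        return 1
    fi
}

# Example: probe the system temp directory; substitute the STAR output directory.
check_writable "${TMPDIR:-/tmp}"
```

If the probe fails or reports a size mismatch on the external drive, STAR's --outTmpDir parameter can place the temporary directory on a different disk; see the STAR manual for its exact requirements.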

DawnEve commented 2 years ago

I am having the same trouble:

Oct 23 19:55:22 ..... started sorting BAM

EXITING because of FATAL ERROR: number of bytes expected from the BAM bin does not agree with the actual size on disk: 18871744046   0   13

Oct 23 19:55:23 ...... FATAL ERROR, exiting
EXITING because of FATAL ERROR: number of bytes expected from the BAM bin does not agree with the actual size on disk: 18874425027   0   16

Oct 23 19:55:23 ...... FATAL ERROR, exiting
*** Error in `STAR': double free or corruption (!prev): 0x0000000001a5e290 ***

I have 16T Avail disk space

alexdobin commented 1 year ago

Hi @DawnEve

Since you are seeing multiple ERROR/EXIT messages, it seems you are running several STAR jobs from the same directory at the same time.

DawnEve commented 1 year ago

No, I am only running one STAR command, from a shell under my own username:

$ STAR --runThreadN 150  \
--outSAMtype BAM SortedByCoordinate  \
--genomeDir /home/wangjl/data/ref/hg19_mm10_transgenes/starIndex  \
--readFilesIn /data/jinwf/wangjl/ref/293T/v3.1/fastq/20k_hgmm_3p_HT_nextgem_Chromium_X_fastqs/R2.fastq.gz  \
--readFilesCommand zcat \
--outFileNamePrefix  /data/jinwf/wangjl/ref/293T/v3.1/bam/hg19_mm10_

Then

...
Oct 23 19:55:22 ..... started sorting BAM

EXITING because of FATAL ERROR: number of bytes expected from the BAM bin does not agree with the actual size on disk: 18871744046   0   13

Oct 23 19:55:23 ...... FATAL ERROR, exiting
EXITING because of FATAL ERROR: number of bytes expected from the BAM bin does not agree with the actual size on disk: 18874425027   0   16

Oct 23 19:55:23 ...... FATAL ERROR, exiting
*** Error in `STAR': double free or corruption (!prev): 0x0000000001a5e290 ***

alexdobin commented 1 year ago

Please try the latest version of STAR (2.7.10b). If this does not help, you can run STAR without the BAM sorting option and then run the samtools sort command.