alexdobin / STAR

RNA-seq aligner
MIT License

STARsolo: Segmentation fault (core dumped) after "Started mapping" #2144

Open reliscu opened 1 month ago

reliscu commented 1 month ago

My OS/architecture info: Linux compute2 4.15.0-141-generic #145-Ubuntu SMP Wed Mar 24 18:08:07 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

I've tried running the following command with STAR 2.7.11b and 2.7.11a.

STAR \
  --runThreadN 1 \
  --genomeDir UCSC/panTro6/STAR_50_index \
  --readFilesIn SM-GE4WN_E1-50_CTCTTCAGCA-GTAGAGACAA.R1_trimmed_polyA.fastq SM-GE4WN_E1-50_CTCTTCAGCA-GTAGAGACAA.R2_trimmed_polyA.fastq \
  --soloType SmartSeq \
  --soloStrand Unstranded \
  --readMapNumber 100000 \
  --soloUMIdedup Exact 

Here is the error I get:

anaconda3/envs/star/bin/STAR: line 8: 31602 Segmentation fault (core dumped) "${cmd}" "$@"

I don't think it's a memory issue, as I also tried running it on a server with more than 500 GB of memory.

Here is the end of the Log.out file, up to the point where it crashes:

Genome: size given as a parameter = 4041016906
SA: size given as a parameter = 11773204867
SAindex: size given as a parameter = 1
Read from SAindex: pGe.gSAindexNbases=14  nSAi=357913940
nGenome=4041016906;  nSAbyte=11773204867
GstrandBit=32   SA number of indices=2854110270
Shared memory is not used for genomes. Allocated a private copy of the genome.
Genome file size: 4041016906 bytes; state: good=1 eof=0 fail=0 bad=0
Loading Genome ... done! state: good=1 eof=0 fail=0 bad=0; loaded 4041016906 bytes
SA file size: 11773204867 bytes; state: good=1 eof=0 fail=0 bad=0
Loading SA ... done! state: good=1 eof=0 fail=0 bad=0; loaded 11773204867 bytes
Loading SAindex ... done: 1565873619 bytes
Finished loading the genome: Tue May 21 15:11:09 2024

Processing splice junctions database sjdbN=270766,   pGe.sjdbOverhang=49 
alignIntronMax=alignMatesGapMax=0, the max intron size will be approximately determined by (2^winBinNbits)*winAnchorDistNbins=589824
Loaded transcript database, nTr=102471
Loaded exon database, nEx=1259838

Notably, the command works when I remove the STARsolo parameters (--soloType, --soloStrand, --soloUMIdedup). However, my ultimate goal is to run this command with hundreds of fastq files (each representing a cell) to generate a single output.
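
For the multi-cell run, my reading of the STARsolo documentation is that Smart-seq input is passed as a tab-separated manifest via --readFilesManifest (columns: read 1 file, read 2 file, cell ID) rather than as one --readFilesIn pair. Below is a minimal sketch of how such a manifest could be generated. It assumes every cell follows the <cell>.R1_trimmed_polyA.fastq / <cell>.R2_trimmed_polyA.fastq naming used above and that cell IDs contain no spaces; the helper name make_manifest and the directory path are hypothetical, and I have not confirmed whether this has any bearing on the crash.

```shell
#!/usr/bin/env bash
# Sketch: print a STARsolo Smart-seq manifest, one line per cell:
#   read1 <TAB> read2 <TAB> cell ID
# Assumption: mates are named <cell>.R1_trimmed_polyA.fastq and
# <cell>.R2_trimmed_polyA.fastq and live in the same directory.

make_manifest() {  # hypothetical helper, not part of STAR
  local dir="$1" r1 r2 cell
  for r1 in "$dir"/*.R1_trimmed_polyA.fastq; do
    [ -e "$r1" ] || continue   # skip the literal pattern when nothing matches
    r2="${r1%.R1_trimmed_polyA.fastq}.R2_trimmed_polyA.fastq"
    if [ ! -e "$r2" ]; then
      echo "missing mate for $r1, skipping" >&2
      continue
    fi
    # Use the shared file-name prefix as the cell ID.
    cell="$(basename "$r1" .R1_trimmed_polyA.fastq)"
    printf '%s\t%s\t%s\n' "$r1" "$r2" "$cell"
  done
}
```

Intended usage would be something like `make_manifest /path/to/fastq_dir > manifest.tsv`, followed by the same STAR command with `--readFilesManifest manifest.tsv` in place of `--readFilesIn`. Cells with a missing R2 file are reported on stderr and left out of the manifest.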

Any insight would be greatly appreciated!