alexdobin / STAR

RNA-seq aligner
MIT License

segfault on mapping start [STAR-2.7.10b] #1940

Closed by ThaliaChaiZhang 11 months ago

ThaliaChaiZhang commented 1 year ago

Hello Alex,

I am getting a segmentation fault when mapping FASTQ files, as shown below.

$ STAR --genomeDir /home/tchai/tools/star_index/GRCm38_chrFasta_wGTF --runThreadN 16 --readFilesCommand zcat --outFilterScoreMinOverLread 0.25 --outFilterMatchNminOverLread 0.25 --outFilterMultimapNmax 5 --readFilesIn /home/tchai/tools/joji_vessel_rnaseq/02_processed_fq/01_deduped_fq/EC1-1_RefAneTher_rnaseq_joji_R1_deduped.fq.gz /home/tchai/tools/joji_vessel_rnaseq/02_processed_fq/01_deduped_fq/EC1-1_RefAneTher_rnaseq_joji_R2_deduped.fq.gz --outFileNamePrefix /home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/EC1-1_RefAneTher_rnaseq_joji_R1_dedu --outSAMtype BAM SortedByCoordinate 2> /home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/logs/EC1-1_RefAneTher_rnaseq_joji_R1_dedu_align.err
STAR --genomeDir /home/tchai/tools/star_index/GRCm38_chrFasta_wGTF --runThreadN 16 --readFilesCommand zcat --outFilterScoreMinOverLread 0.25 --outFilterMatchNminOverLread 0.25 --outFilterMultimapNmax 5 --readFilesIn /home/tchai/tools/joji_vessel_rnaseq/02_processed_fq/01_deduped_fq/EC1-1_RefAneTher_rnaseq_joji_R1_deduped.fq.gz /home/tchai/tools/joji_vessel_rnaseq/02_processed_fq/01_deduped_fq/EC1-1_RefAneTher_rnaseq_joji_R2_deduped.fq.gz --outFileNamePrefix /home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/EC1-1_RefAneTher_rnaseq_joji_R1_dedu --outSAMtype BAM SortedByCoordinate
STAR version: 2.7.10b compiled: 2022-11-01T09:53:26-04:00 :/home/dobin/data/STAR/STARcode/STAR.master/source
Sep 07 03:52:15 ..... started STAR run
Sep 07 03:52:15 ..... loading genome
Sep 07 03:53:49 ..... started mapping
Segmentation fault (core dumped)

The same error persists with the compiled 2.7.11a Linux version, and with compiled versions 2.7.10a, 2.7.9a and 2.7.4a in my environment. ulimit is set to unlimited, and 32 GB of RAM is available. I have also tried thread counts ranging from 1 to 20.

Logfile reads:

STAR version=2.7.10b
STAR compilation time,server,dir=2022-11-01T09:53:26-04:00 :/home/dobin/data/STAR/STARcode/STAR.master/source
STAR git: On branch master ; commit c6f8efc2c7043ef83bf8b0d9bed36bbb6b9b1133 ; diff files: CHANGES.md 
##### Command Line:
STAR --genomeDir /home/tchai/tools/star_index/GRCm38_chrFasta_wGTF --runThreadN 16 --readFilesCommand zcat --outFilterScoreMinOverLread 0.25 --outFilterMatchNminOverLread 0.25 --outFilterMultimapNmax 5 --readFilesIn /home/tchai/tools/joji_vessel_rnaseq/02_processed_fq/01_deduped_fq/EC1-1_RefAneTher_rnaseq_joji_R1_deduped.fq.gz /home/tchai/tools/joji_vessel_rnaseq/02_processed_fq/01_deduped_fq/EC1-1_RefAneTher_rnaseq_joji_R2_deduped.fq.gz --outFileNamePrefix /home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/EC1-1_RefAneTher_rnaseq_joji_R1_dedu --outSAMtype BAM SortedByCoordinate
##### Initial USER parameters from Command Line:
outFileNamePrefix                 /home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/EC1-1_RefAneTher_rnaseq_joji_R1_dedu
###### All USER parameters from Command Line:
genomeDir                     /home/tchai/tools/star_index/GRCm38_chrFasta_wGTF     ~RE-DEFINED
runThreadN                    16     ~RE-DEFINED
readFilesCommand              zcat        ~RE-DEFINED
outFilterScoreMinOverLread    0.25     ~RE-DEFINED
outFilterMatchNminOverLread   0.25     ~RE-DEFINED
outFilterMultimapNmax         5     ~RE-DEFINED
readFilesIn                   /home/tchai/tools/joji_vessel_rnaseq/02_processed_fq/01_deduped_fq/EC1-1_RefAneTher_rnaseq_joji_R1_deduped.fq.gz   /home/tchai/tools/joji_vessel_rnaseq/02_processed_fq/01_deduped_fq/EC1-1_RefAneTher_rnaseq_joji_R2_deduped.fq.gz        ~RE-DEFINED
outFileNamePrefix             /home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/EC1-1_RefAneTher_rnaseq_joji_R1_dedu     ~RE-DEFINED
outSAMtype                    BAM   SortedByCoordinate        ~RE-DEFINED
##### Finished reading parameters from all sources

##### Final user re-defined parameters-----------------:
runThreadN                        16
genomeDir                         /home/tchai/tools/star_index/GRCm38_chrFasta_wGTF
readFilesIn                       /home/tchai/tools/joji_vessel_rnaseq/02_processed_fq/01_deduped_fq/EC1-1_RefAneTher_rnaseq_joji_R1_deduped.fq.gz   /home/tchai/tools/joji_vessel_rnaseq/02_processed_fq/01_deduped_fq/EC1-1_RefAneTher_rnaseq_joji_R2_deduped.fq.gz   
readFilesCommand                  zcat   
outFileNamePrefix                 /home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/EC1-1_RefAneTher_rnaseq_joji_R1_dedu
outSAMtype                        BAM   SortedByCoordinate   
outFilterMultimapNmax             5
outFilterScoreMinOverLread        0.25
outFilterMatchNminOverLread       0.25

-------------------------------
##### Final effective command line:
STAR   --runThreadN 16   --genomeDir /home/tchai/tools/star_index/GRCm38_chrFasta_wGTF   --readFilesIn /home/tchai/tools/joji_vessel_rnaseq/02_processed_fq/01_deduped_fq/EC1-1_RefAneTher_rnaseq_joji_R1_deduped.fq.gz   /home/tchai/tools/joji_vessel_rnaseq/02_processed_fq/01_deduped_fq/EC1-1_RefAneTher_rnaseq_joji_R2_deduped.fq.gz      --readFilesCommand zcat      --outFileNamePrefix /home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/EC1-1_RefAneTher_rnaseq_joji_R1_dedu   --outSAMtype BAM   SortedByCoordinate      --outFilterMultimapNmax 5   --outFilterScoreMinOverLread 0.25   --outFilterMatchNminOverLread 0.25
----------------------------------------

Number of fastq files for each mate = 1

   Input read files for mate 1 :
-rw-r--r-- 1 tchai tchai 1386624406 Sep  4 03:43 /home/tchai/tools/joji_vessel_rnaseq/02_processed_fq/01_deduped_fq/EC1-1_RefAneTher_rnaseq_joji_R1_deduped.fq.gz

   readsCommandsFile:
exec > "/home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/EC1-1_RefAneTher_rnaseq_joji_R1_dedu_STARtmp/tmp.fifo.read1"
echo FILE 0
zcat      "/home/tchai/tools/joji_vessel_rnaseq/02_processed_fq/01_deduped_fq/EC1-1_RefAneTher_rnaseq_joji_R1_deduped.fq.gz"

   Input read files for mate 2 :
-rw-r--r-- 1 tchai tchai 1516756414 Sep  4 03:43 /home/tchai/tools/joji_vessel_rnaseq/02_processed_fq/01_deduped_fq/EC1-1_RefAneTher_rnaseq_joji_R2_deduped.fq.gz

   readsCommandsFile:
exec > "/home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/EC1-1_RefAneTher_rnaseq_joji_R1_dedu_STARtmp/tmp.fifo.read2"
echo FILE 0
zcat      "/home/tchai/tools/joji_vessel_rnaseq/02_processed_fq/01_deduped_fq/EC1-1_RefAneTher_rnaseq_joji_R2_deduped.fq.gz"

ParametersSolo: --soloCellFilterType CellRanger2.2 filtering parameters:  3000 0.99 10
WARNING: --limitBAMsortRAM=0, will use genome size as RAM limit for BAM sorting
Finished loading and checking parameters
Reading genome generation parameters:
### STAR   --runMode genomeGenerate      --runThreadN 39   --genomeDir ./GRCm38_chrFasta_wGTF   --genomeFastaFiles /home/mbassal/genomes/rna/cellranger_references/refdata-cellranger-mm10-3.0.0/fasta/genome_chr.fa      --sjdbGTFfile /home/mbassal/genomes/rna/cellranger_references/refdata-cellranger-mm10-3.0.0/genes/genes_chr.gtf   --sjdbOverhang 149
### GstrandBit=32
versionGenome                 2.7.4a     ~RE-DEFINED
genomeType                    Full     ~RE-DEFINED
genomeFastaFiles              /home/mbassal/genomes/rna/cellranger_references/refdata-cellranger-mm10-3.0.0/fasta/genome_chr.fa        ~RE-DEFINED
genomeSAindexNbases           14     ~RE-DEFINED
genomeChrBinNbits             18     ~RE-DEFINED
genomeSAsparseD               1     ~RE-DEFINED
genomeTransformType           None     ~RE-DEFINED
genomeTransformVCF            -     ~RE-DEFINED
sjdbOverhang                  149     ~RE-DEFINED
sjdbFileChrStartEnd           -        ~RE-DEFINED
sjdbGTFfile                   /home/mbassal/genomes/rna/cellranger_references/refdata-cellranger-mm10-3.0.0/genes/genes_chr.gtf     ~RE-DEFINED
sjdbGTFchrPrefix              -     ~RE-DEFINED
sjdbGTFfeatureExon            exon     ~RE-DEFINED
sjdbGTFtagExonParentTranscripttranscript_id     ~RE-DEFINED
sjdbGTFtagExonParentGene      gene_id     ~RE-DEFINED
sjdbInsertSave                Basic     ~RE-DEFINED
genomeFileSizes               2821362240   22544256458        ~RE-DEFINED
Genome version is compatible with current STAR
Number of real (reference) chromosomes= 66
1   chr1    195471971   0
2   chr10   130694993   195559424
3   chr11   122082543   326369280
4   chr12   120129022   448528384
5   chr13   120421639   568852480
6   chr14   124902244   689438720
7   chr15   104043685   814481408
8   chr16   98207768    918552576
9   chr17   94987271    1016856576
10  chr18   90702639    1112014848
11  chr19   61431566    1202978816
12  chr2    182113224   1264582656
13  chr3    160039680   1446772736
14  chr4    156508116   1606942720
15  chr5    151834684   1763704832
16  chr6    149736546   1915748352
17  chr7    145441459   2065694720
18  chr8    129401213   2211184640
19  chr9    124595110   2340683776
20  chrMT   16299   2465464320
21  chrX    171031299   2465726464
22  chrY    91744698    2636906496
23  chrJH584299.1   953012  2728656896
24  chrGL456233.1   336933  2729705472
25  chrJH584301.1   259875  2730229760
26  chrGL456211.1   241735  2730491904
27  chrGL456350.1   227966  2730754048
28  chrJH584293.1   207968  2731016192
29  chrGL456221.1   206961  2731278336
30  chrJH584297.1   205776  2731540480
31  chrJH584296.1   199368  2731802624
32  chrGL456354.1   195993  2732064768
33  chrJH584294.1   191905  2732326912
34  chrJH584298.1   184189  2732589056
35  chrJH584300.1   182347  2732851200
36  chrGL456219.1   175968  2733113344
37  chrGL456210.1   169725  2733375488
38  chrJH584303.1   158099  2733637632
39  chrJH584302.1   155838  2733899776
40  chrGL456212.1   153618  2734161920
41  chrJH584304.1   114452  2734424064
42  chrGL456379.1   72385   2734686208
43  chrGL456216.1   66673   2734948352
44  chrGL456393.1   55711   2735210496
45  chrGL456366.1   47073   2735472640
46  chrGL456367.1   42057   2735734784
47  chrGL456239.1   40056   2735996928
48  chrGL456213.1   39340   2736259072
49  chrGL456383.1   38659   2736521216
50  chrGL456385.1   35240   2736783360
51  chrGL456360.1   31704   2737045504
52  chrGL456378.1   31602   2737307648
53  chrGL456389.1   28772   2737569792
54  chrGL456372.1   28664   2737831936
55  chrGL456370.1   26764   2738094080
56  chrGL456381.1   25871   2738356224
57  chrGL456387.1   24685   2738618368
58  chrGL456390.1   24668   2738880512
59  chrGL456394.1   24323   2739142656
60  chrGL456392.1   23629   2739404800
61  chrGL456382.1   23158   2739666944
62  chrGL456359.1   22974   2739929088
63  chrGL456396.1   21240   2740191232
64  chrGL456368.1   20208   2740453376
65  chrJH584292.1   14945   2740715520
66  chrJH584295.1   1976    2740977664
--sjdbOverhang = 149 taken from the generated genome
Started loading the genome: Thu Sep  7 03:52:15 2023

Genome: size given as a parameter = 2821362240
SA: size given as a parameter = 22544256458
SAindex: size given as a parameter = 1
Read from SAindex: pGe.gSAindexNbases=14  nSAi=357913940
nGenome=2821362240;  nSAbyte=22544256458
GstrandBit=32   SA number of indices=5465274292
Shared memory is not used for genomes. Allocated a private copy of the genome.
Genome file size: 2821362240 bytes; state: good=1 eof=0 fail=0 bad=0
Loading Genome ... done! state: good=1 eof=0 fail=0 bad=0; loaded 2821362240 bytes
SA file size: 22544256458 bytes; state: good=1 eof=0 fail=0 bad=0
Loading SA ... done! state: good=1 eof=0 fail=0 bad=0; loaded 22544256458 bytes
Loading SAindex ... done: 1565873619 bytes
Finished loading the genome: Thu Sep  7 03:53:49 2023

Processing splice junctions database sjdbN=267968,   pGe.sjdbOverhang=149 
alignIntronMax=alignMatesGapMax=0, the max intron size will be approximately determined by (2^winBinNbits)*winAnchorDistNbins=589824
Created thread # 1
Created thread # 2
Created thread # 3
Created thread # 4
Created thread # 5
Created thread # 6
Created thread # 7
Created thread # 8
Created thread # 9
Created thread # 10
Starting to map file # 0
mate 1:   /home/tchai/tools/joji_vessel_rnaseq/02_processed_fq/01_deduped_fq/EC1-1_RefAneTher_rnaseq_joji_R1_deduped.fq.gz
mate 2:   /home/tchai/tools/joji_vessel_rnaseq/02_processed_fq/01_deduped_fq/EC1-1_RefAneTher_rnaseq_joji_R2_deduped.fq.gz
Created thread # 11
Created thread # 12
Created thread # 13
Created thread # 14
Created thread # 15

Any thoughts? Thanks.

alexdobin commented 1 year ago

Hi @ThaliaChaiZhang

There is nothing suspicious in the Log.out file. If you pre-processed the files before mapping, I would recommend mapping the unprocessed files, in case there is an issue with the pre-processing.

ThaliaChaiZhang commented 1 year ago

Hi Alex, I tried mapping the unprocessed files, and so far I am getting the exact same error. A copy of the log is provided below:

STAR version=2.7.10b
STAR compilation time,server,dir=2022-11-01T09:53:26-04:00 :/home/dobin/data/STAR/STARcode/STAR.master/source
STAR git: On branch master ; commit c6f8efc2c7043ef83bf8b0d9bed36bbb6b9b1133 ; diff files: CHANGES.md 
##### Command Line:
STAR --genomeDir /home/tchai/tools/star_index/GRCm38_chrFasta_wGTF --runThreadN 16 --readFilesCommand zcat --outFilterScoreMinOverLread 0.25 --outFilterMatchNminOverLread 0.25 --outFilterMultimapNmax 5 --readFilesIn /home/tchai/tools/joji_vessel_rnaseq/01_raw_fq/EC1-1_R1_001.fastq.gz /home/tchai/tools/joji_vessel_rnaseq/01_raw_fq/EC1-1_R2_001.fastq.gz --outFileNamePrefix /home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/EC1-1_RefAneTher_rnaseq_joji_R1_dedu --outSAMtype BAM SortedByCoordinate
##### Initial USER parameters from Command Line:
outFileNamePrefix                 /home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/EC1-1_RefAneTher_rnaseq_joji_R1_dedu
###### All USER parameters from Command Line:
genomeDir                     /home/tchai/tools/star_index/GRCm38_chrFasta_wGTF     ~RE-DEFINED
runThreadN                    16     ~RE-DEFINED
readFilesCommand              zcat        ~RE-DEFINED
outFilterScoreMinOverLread    0.25     ~RE-DEFINED
outFilterMatchNminOverLread   0.25     ~RE-DEFINED
outFilterMultimapNmax         5     ~RE-DEFINED
readFilesIn                   /home/tchai/tools/joji_vessel_rnaseq/01_raw_fq/EC1-1_R1_001.fastq.gz   /home/tchai/tools/joji_vessel_rnaseq/01_raw_fq/EC1-1_R2_001.fastq.gz        ~RE-DEFINED
outFileNamePrefix             /home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/EC1-1_RefAneTher_rnaseq_joji_R1_dedu     ~RE-DEFINED
outSAMtype                    BAM   SortedByCoordinate        ~RE-DEFINED
##### Finished reading parameters from all sources

##### Final user re-defined parameters-----------------:
runThreadN                        16
genomeDir                         /home/tchai/tools/star_index/GRCm38_chrFasta_wGTF
readFilesIn                       /home/tchai/tools/joji_vessel_rnaseq/01_raw_fq/EC1-1_R1_001.fastq.gz   /home/tchai/tools/joji_vessel_rnaseq/01_raw_fq/EC1-1_R2_001.fastq.gz   
readFilesCommand                  zcat   
outFileNamePrefix                 /home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/EC1-1_RefAneTher_rnaseq_joji_R1_dedu
outSAMtype                        BAM   SortedByCoordinate   
outFilterMultimapNmax             5
outFilterScoreMinOverLread        0.25
outFilterMatchNminOverLread       0.25

-------------------------------
##### Final effective command line:
STAR   --runThreadN 16   --genomeDir /home/tchai/tools/star_index/GRCm38_chrFasta_wGTF   --readFilesIn /home/tchai/tools/joji_vessel_rnaseq/01_raw_fq/EC1-1_R1_001.fastq.gz   /home/tchai/tools/joji_vessel_rnaseq/01_raw_fq/EC1-1_R2_001.fastq.gz      --readFilesCommand zcat      --outFileNamePrefix /home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/EC1-1_RefAneTher_rnaseq_joji_R1_dedu   --outSAMtype BAM   SortedByCoordinate      --outFilterMultimapNmax 5   --outFilterScoreMinOverLread 0.25   --outFilterMatchNminOverLread 0.25
----------------------------------------

Number of fastq files for each mate = 1

   Input read files for mate 1 :
-rw-r--r-- 1 tchai tchai 3540342657 Feb 25  2021 /home/tchai/tools/joji_vessel_rnaseq/01_raw_fq/EC1-1_R1_001.fastq.gz

   readsCommandsFile:
exec > "/home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/EC1-1_RefAneTher_rnaseq_joji_R1_dedu_STARtmp/tmp.fifo.read1"
echo FILE 0
zcat      "/home/tchai/tools/joji_vessel_rnaseq/01_raw_fq/EC1-1_R1_001.fastq.gz"

   Input read files for mate 2 :
-rw-r--r-- 1 tchai tchai 3535271982 Feb 25  2021 /home/tchai/tools/joji_vessel_rnaseq/01_raw_fq/EC1-1_R2_001.fastq.gz

   readsCommandsFile:
exec > "/home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/EC1-1_RefAneTher_rnaseq_joji_R1_dedu_STARtmp/tmp.fifo.read2"
echo FILE 0
zcat      "/home/tchai/tools/joji_vessel_rnaseq/01_raw_fq/EC1-1_R2_001.fastq.gz"

ParametersSolo: --soloCellFilterType CellRanger2.2 filtering parameters:  3000 0.99 10
WARNING: --limitBAMsortRAM=0, will use genome size as RAM limit for BAM sorting
Finished loading and checking parameters
Reading genome generation parameters:
### STAR   --runMode genomeGenerate      --runThreadN 39   --genomeDir ./GRCm38_chrFasta_wGTF   --genomeFastaFiles /home/mbassal/genomes/rna/cellranger_references/refdata-cellranger-mm10-3.0.0/fasta/genome_chr.fa      --sjdbGTFfile /home/mbassal/genomes/rna/cellranger_references/refdata-cellranger-mm10-3.0.0/genes/genes_chr.gtf   --sjdbOverhang 149
### GstrandBit=32
versionGenome                 2.7.4a     ~RE-DEFINED
genomeType                    Full     ~RE-DEFINED
genomeFastaFiles              /home/mbassal/genomes/rna/cellranger_references/refdata-cellranger-mm10-3.0.0/fasta/genome_chr.fa        ~RE-DEFINED
genomeSAindexNbases           14     ~RE-DEFINED
genomeChrBinNbits             18     ~RE-DEFINED
genomeSAsparseD               1     ~RE-DEFINED
genomeTransformType           None     ~RE-DEFINED
genomeTransformVCF            -     ~RE-DEFINED
sjdbOverhang                  149     ~RE-DEFINED
sjdbFileChrStartEnd           -        ~RE-DEFINED
sjdbGTFfile                   /home/mbassal/genomes/rna/cellranger_references/refdata-cellranger-mm10-3.0.0/genes/genes_chr.gtf     ~RE-DEFINED
sjdbGTFchrPrefix              -     ~RE-DEFINED
sjdbGTFfeatureExon            exon     ~RE-DEFINED
sjdbGTFtagExonParentTranscripttranscript_id     ~RE-DEFINED
sjdbGTFtagExonParentGene      gene_id     ~RE-DEFINED
sjdbInsertSave                Basic     ~RE-DEFINED
genomeFileSizes               2821362240   22544256458        ~RE-DEFINED
Genome version is compatible with current STAR
Number of real (reference) chromosomes= 66
1   chr1    195471971   0
2   chr10   130694993   195559424
3   chr11   122082543   326369280
4   chr12   120129022   448528384
5   chr13   120421639   568852480
6   chr14   124902244   689438720
7   chr15   104043685   814481408
8   chr16   98207768    918552576
9   chr17   94987271    1016856576
10  chr18   90702639    1112014848
11  chr19   61431566    1202978816
12  chr2    182113224   1264582656
13  chr3    160039680   1446772736
14  chr4    156508116   1606942720
15  chr5    151834684   1763704832
16  chr6    149736546   1915748352
17  chr7    145441459   2065694720
18  chr8    129401213   2211184640
19  chr9    124595110   2340683776
20  chrMT   16299   2465464320
21  chrX    171031299   2465726464
22  chrY    91744698    2636906496
23  chrJH584299.1   953012  2728656896
24  chrGL456233.1   336933  2729705472
25  chrJH584301.1   259875  2730229760
26  chrGL456211.1   241735  2730491904
27  chrGL456350.1   227966  2730754048
28  chrJH584293.1   207968  2731016192
29  chrGL456221.1   206961  2731278336
30  chrJH584297.1   205776  2731540480
31  chrJH584296.1   199368  2731802624
32  chrGL456354.1   195993  2732064768
33  chrJH584294.1   191905  2732326912
34  chrJH584298.1   184189  2732589056
35  chrJH584300.1   182347  2732851200
36  chrGL456219.1   175968  2733113344
37  chrGL456210.1   169725  2733375488
38  chrJH584303.1   158099  2733637632
39  chrJH584302.1   155838  2733899776
40  chrGL456212.1   153618  2734161920
41  chrJH584304.1   114452  2734424064
42  chrGL456379.1   72385   2734686208
43  chrGL456216.1   66673   2734948352
44  chrGL456393.1   55711   2735210496
45  chrGL456366.1   47073   2735472640
46  chrGL456367.1   42057   2735734784
47  chrGL456239.1   40056   2735996928
48  chrGL456213.1   39340   2736259072
49  chrGL456383.1   38659   2736521216
50  chrGL456385.1   35240   2736783360
51  chrGL456360.1   31704   2737045504
52  chrGL456378.1   31602   2737307648
53  chrGL456389.1   28772   2737569792
54  chrGL456372.1   28664   2737831936
55  chrGL456370.1   26764   2738094080
56  chrGL456381.1   25871   2738356224
57  chrGL456387.1   24685   2738618368
58  chrGL456390.1   24668   2738880512
59  chrGL456394.1   24323   2739142656
60  chrGL456392.1   23629   2739404800
61  chrGL456382.1   23158   2739666944
62  chrGL456359.1   22974   2739929088
63  chrGL456396.1   21240   2740191232
64  chrGL456368.1   20208   2740453376
65  chrJH584292.1   14945   2740715520
66  chrJH584295.1   1976    2740977664
--sjdbOverhang = 149 taken from the generated genome
Started loading the genome: Tue Sep 12 16:14:02 2023

Genome: size given as a parameter = 2821362240
SA: size given as a parameter = 22544256458
SAindex: size given as a parameter = 1
Read from SAindex: pGe.gSAindexNbases=14  nSAi=357913940
nGenome=2821362240;  nSAbyte=22544256458
GstrandBit=32   SA number of indices=5465274292
Shared memory is not used for genomes. Allocated a private copy of the genome.
Genome file size: 2821362240 bytes; state: good=1 eof=0 fail=0 bad=0
Loading Genome ... done! state: good=1 eof=0 fail=0 bad=0; loaded 2821362240 bytes
SA file size: 22544256458 bytes; state: good=1 eof=0 fail=0 bad=0
Loading SA ... done! state: good=1 eof=0 fail=0 bad=0; loaded 22544256458 bytes
Loading SAindex ... done: 1565873619 bytes
Finished loading the genome: Tue Sep 12 16:16:06 2023

Processing splice junctions database sjdbN=267968,   pGe.sjdbOverhang=149 
alignIntronMax=alignMatesGapMax=0, the max intron size will be approximately determined by (2^winBinNbits)*winAnchorDistNbins=589824
Created thread # 1
Created thread # 2
Created thread # 3
Created thread # 4
Created thread # 5
Created thread # 6
Created thread # 7
Created thread # 8
Created thread # 9
Created thread # 10
Created thread # 11
Created thread # 12
Created thread # 13
Created thread # 14
Created thread # 15
Starting to map file # 0
mate 1:   /home/tchai/tools/joji_vessel_rnaseq/01_raw_fq/EC1-1_R1_001.fastq.gz
mate 2:   /home/tchai/tools/joji_vessel_rnaseq/01_raw_fq/EC1-1_R2_001.fastq.gz

Do you have any other suggestions for what could be the issue? Thank you again for your assistance; it is very much appreciated.

alexdobin commented 1 year ago

Hi @ThaliaChaiZhang

The next step is to figure out which read causes the segfault. You can do a binary search through the reads, using --readMapNumber to set the number of reads to map from the beginning of the file.
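
The binary search suggested here could be scripted roughly as follows. This is only a sketch under stated assumptions: the crash is deterministic, a crash on the first N reads implies a crash on any longer prefix, and the helper names `smallest_failing_count` and `star_crashes` are hypothetical (they are not part of STAR). Only `--readMapNumber` comes from the thread.

```python
import subprocess
from typing import Callable, Sequence


def smallest_failing_count(crashes: Callable[[int], bool], total_reads: int) -> int:
    """Return the smallest N for which mapping the first N reads crashes.

    Assumes crashes(total_reads) is True and that the crash is monotone:
    once a prefix contains the offending read, every longer prefix crashes.
    """
    lo, hi = 0, total_reads  # invariant: first `lo` reads are fine, first `hi` crash
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if crashes(mid):
            hi = mid
        else:
            lo = mid
    return hi  # 1-based position of the first offending read


def star_crashes(n_reads: int, base_cmd: Sequence[str]) -> bool:
    """Run STAR on the first n_reads reads; True if the run did not exit cleanly.

    base_cmd is the full mapping command as a list of arguments (hypothetical
    here); a segfault shows up as a non-zero (negative on POSIX) return code.
    """
    cmd = list(base_cmd) + ["--readMapNumber", str(n_reads)]
    return subprocess.run(cmd).returncode != 0


# Example wiring (not executed here; base_cmd and total_reads are placeholders):
# first_bad = smallest_failing_count(lambda n: star_crashes(n, base_cmd), total_reads)
```

Each probe reloads the genome, so the search costs roughly log2(total reads) full STAR start-ups; using a single thread for the probes may make the runs easier to compare.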

ThaliaChaiZhang commented 1 year ago

Hi Alex,

I've now identified the reads causing the segfault in several of the raw files, although FastQC does not show any identifiable issues with the files themselves. Any suggestions on how to proceed from here?

alexdobin commented 12 months ago

Hi @ThaliaChaiZhang

What happens if you extract just one read causing trouble and try to map it?
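
One way to pull a single read out of a FASTQ file is sketched below. It assumes a plain four-lines-per-record FASTQ (no wrapped sequence lines) and a 1-based read position, for example the one found by the binary search; the function name `extract_read` and the file names in the usage comment are hypothetical. For paired-end data the same position would be extracted from both mate files.

```python
import gzip
from itertools import islice
from typing import List


def extract_read(src_path: str, dst_path: str, read_index: int) -> List[str]:
    """Copy the read_index-th record (1-based) of a FASTQ file to dst_path.

    Files whose name ends in .gz are read through gzip; the output is
    written uncompressed.
    """
    opener = gzip.open if src_path.endswith(".gz") else open
    start = 4 * (read_index - 1)  # each record occupies exactly four lines
    with opener(src_path, "rt") as src:
        record = list(islice(src, start, start + 4))
    if len(record) != 4 or not record[0].startswith("@"):
        raise ValueError(f"no complete FASTQ record at index {read_index}")
    with open(dst_path, "w") as dst:
        dst.writelines(record)
    return record


# Example (placeholder file names, not executed here):
# extract_read("EC1-1_R1_001.fastq.gz", "test_R1.fastq", first_bad)
# extract_read("EC1-1_R2_001.fastq.gz", "test_R2.fastq", first_bad)
```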

ThaliaChaiZhang commented 12 months ago

Hi Alex,

I don't see a parameter under the input-parameters in the manual to set a specific read to map; is there one I'm missing or should I try to extract it to a separate file and map that?

ThaliaChaiZhang commented 12 months ago

Hi Alex,

I extracted the read manually from the file with vim and ran just that problematic one, which gave a segfault as well.

STAR version=2.7.11a
STAR compilation time,server,dir=2023-09-21T21:18:42-04:00 :/home/tchai/Tools/STAR-2.7.11a/source
STAR git: 
##### Command Line:
STAR --genomeDir /home/tchai/tools/star_index/GRCm38_chrFasta_wGTF --runThreadN 16 --outFilterScoreMinOverLread 0.25 --outFilterMatchNminOverLread 0.25 --outFilterMultimapNmax 5 --readFilesIn /home/tchai/tools/joji_vessel_rnaseq/TestFiles/test_R1.fastq /home/tchai/tools/joji_vessel_rnaseq/TestFiles/test_R2.fastq --outFileNamePrefix /home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/Test_R1_001_dedu --outSAMtype BAM SortedByCoordinate
##### Initial USER parameters from Command Line:
outFileNamePrefix                 /home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/Test_R1_001_dedu
###### All USER parameters from Command Line:
genomeDir                     /home/tchai/tools/star_index/GRCm38_chrFasta_wGTF     ~RE-DEFINED
runThreadN                    16     ~RE-DEFINED
outFilterScoreMinOverLread    0.25     ~RE-DEFINED
outFilterMatchNminOverLread   0.25     ~RE-DEFINED
outFilterMultimapNmax         5     ~RE-DEFINED
readFilesIn                   /home/tchai/tools/joji_vessel_rnaseq/TestFiles/test_R1.fastq   /home/tchai/tools/joji_vessel_rnaseq/TestFiles/test_R2.fastq        ~RE-DEFINED
outFileNamePrefix             /home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/Test_R1_001_dedu     ~RE-DEFINED
outSAMtype                    BAM   SortedByCoordinate        ~RE-DEFINED
##### Finished reading parameters from all sources

##### Final user re-defined parameters-----------------:
runThreadN                        16
genomeDir                         /home/tchai/tools/star_index/GRCm38_chrFasta_wGTF
readFilesIn                       /home/tchai/tools/joji_vessel_rnaseq/TestFiles/test_R1.fastq   /home/tchai/tools/joji_vessel_rnaseq/TestFiles/test_R2.fastq   
outFileNamePrefix                 /home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/Test_R1_001_dedu
outSAMtype                        BAM   SortedByCoordinate   
outFilterMultimapNmax             5
outFilterScoreMinOverLread        0.25
outFilterMatchNminOverLread       0.25

-------------------------------
##### Final effective command line:
STAR   --runThreadN 16   --genomeDir /home/tchai/tools/star_index/GRCm38_chrFasta_wGTF   --readFilesIn /home/tchai/tools/joji_vessel_rnaseq/TestFiles/test_R1.fastq   /home/tchai/tools/joji_vessel_rnaseq/TestFiles/test_R2.fastq      --outFileNamePrefix /home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/Test_R1_001_dedu   --outSAMtype BAM   SortedByCoordinate      --outFilterMultimapNmax 5   --outFilterScoreMinOverLread 0.25   --outFilterMatchNminOverLread 0.25
----------------------------------------

Number of fastq files for each mate = 1
ParametersSolo: --soloCellFilterType CellRanger2.2 filtering parameters:  3000 0.99 10
WARNING: --limitBAMsortRAM=0, will use genome size as RAM limit for BAM sorting
Finished loading and checking parameters
Reading genome generation parameters:
### STAR   --runMode genomeGenerate      --runThreadN 39   --genomeDir ./GRCm38_chrFasta_wGTF   --genomeFastaFiles /home/mbassal/genomes/rna/cellranger_references/refdata-cellranger-mm10-3.0.0/fasta/genome_chr.fa      --sjdbGTFfile /home/mbassal/genomes/rna/cellranger_references/refdata-cellranger-mm10-3.0.0/genes/genes_chr.gtf   --sjdbOverhang 149
### GstrandBit=32
versionGenome                 2.7.4a     ~RE-DEFINED
genomeType                    Full     ~RE-DEFINED
genomeFastaFiles              /home/mbassal/genomes/rna/cellranger_references/refdata-cellranger-mm10-3.0.0/fasta/genome_chr.fa        ~RE-DEFINED
genomeSAindexNbases           14     ~RE-DEFINED
genomeChrBinNbits             18     ~RE-DEFINED
genomeSAsparseD               1     ~RE-DEFINED
genomeTransformType           None     ~RE-DEFINED
genomeTransformVCF            -     ~RE-DEFINED
sjdbOverhang                  149     ~RE-DEFINED
sjdbFileChrStartEnd           -        ~RE-DEFINED
sjdbGTFfile                   /home/mbassal/genomes/rna/cellranger_references/refdata-cellranger-mm10-3.0.0/genes/genes_chr.gtf     ~RE-DEFINED
sjdbGTFchrPrefix              -     ~RE-DEFINED
sjdbGTFfeatureExon            exon     ~RE-DEFINED
sjdbGTFtagExonParentTranscripttranscript_id     ~RE-DEFINED
sjdbGTFtagExonParentGene      gene_id     ~RE-DEFINED
sjdbInsertSave                Basic     ~RE-DEFINED
genomeFileSizes               2821362240   22544256458        ~RE-DEFINED
Genome version is compatible with current STAR
Number of real (reference) chromosomes= 66
1   chr1    195471971   0
2   chr10   130694993   195559424
3   chr11   122082543   326369280
4   chr12   120129022   448528384
5   chr13   120421639   568852480
6   chr14   124902244   689438720
7   chr15   104043685   814481408
8   chr16   98207768    918552576
9   chr17   94987271    1016856576
10  chr18   90702639    1112014848
11  chr19   61431566    1202978816
12  chr2    182113224   1264582656
13  chr3    160039680   1446772736
14  chr4    156508116   1606942720
15  chr5    151834684   1763704832
16  chr6    149736546   1915748352
17  chr7    145441459   2065694720
18  chr8    129401213   2211184640
19  chr9    124595110   2340683776
20  chrMT   16299   2465464320
21  chrX    171031299   2465726464
22  chrY    91744698    2636906496
23  chrJH584299.1   953012  2728656896
24  chrGL456233.1   336933  2729705472
25  chrJH584301.1   259875  2730229760
26  chrGL456211.1   241735  2730491904
27  chrGL456350.1   227966  2730754048
28  chrJH584293.1   207968  2731016192
29  chrGL456221.1   206961  2731278336
30  chrJH584297.1   205776  2731540480
31  chrJH584296.1   199368  2731802624
32  chrGL456354.1   195993  2732064768
33  chrJH584294.1   191905  2732326912
34  chrJH584298.1   184189  2732589056
35  chrJH584300.1   182347  2732851200
36  chrGL456219.1   175968  2733113344
37  chrGL456210.1   169725  2733375488
38  chrJH584303.1   158099  2733637632
39  chrJH584302.1   155838  2733899776
40  chrGL456212.1   153618  2734161920
41  chrJH584304.1   114452  2734424064
42  chrGL456379.1   72385   2734686208
43  chrGL456216.1   66673   2734948352
44  chrGL456393.1   55711   2735210496
45  chrGL456366.1   47073   2735472640
46  chrGL456367.1   42057   2735734784
47  chrGL456239.1   40056   2735996928
48  chrGL456213.1   39340   2736259072
49  chrGL456383.1   38659   2736521216
50  chrGL456385.1   35240   2736783360
51  chrGL456360.1   31704   2737045504
52  chrGL456378.1   31602   2737307648
53  chrGL456389.1   28772   2737569792
54  chrGL456372.1   28664   2737831936
55  chrGL456370.1   26764   2738094080
56  chrGL456381.1   25871   2738356224
57  chrGL456387.1   24685   2738618368
58  chrGL456390.1   24668   2738880512
59  chrGL456394.1   24323   2739142656
60  chrGL456392.1   23629   2739404800
61  chrGL456382.1   23158   2739666944
62  chrGL456359.1   22974   2739929088
63  chrGL456396.1   21240   2740191232
64  chrGL456368.1   20208   2740453376
65  chrJH584292.1   14945   2740715520
66  chrJH584295.1   1976    2740977664
--sjdbOverhang = 149 taken from the generated genome
Started loading the genome: Mon Sep 25 17:25:47 2023

Genome: size given as a parameter = 2821362240
SA: size given as a parameter = 22544256458
SAindex: size given as a parameter = 1
Read from SAindex: pGe.gSAindexNbases=14  nSAi=357913940
nGenome=2821362240;  nSAbyte=22544256458
GstrandBit=32   SA number of indices=5465274292
Shared memory is not used for genomes. Allocated a private copy of the genome.
Genome file size: 2821362240 bytes; state: good=1 eof=0 fail=0 bad=0
Loading Genome ... done! state: good=1 eof=0 fail=0 bad=0; loaded 2821362240 bytes
SA file size: 22544256458 bytes; state: good=1 eof=0 fail=0 bad=0
Loading SA ... done! state: good=1 eof=0 fail=0 bad=0; loaded 22544256458 bytes
Loading SAindex ... done: 1565873619 bytes
Finished loading the genome: Mon Sep 25 17:27:20 2023

Processing splice junctions database sjdbN=267968,   pGe.sjdbOverhang=149 
alignIntronMax=alignMatesGapMax=0, the max intron size will be approximately determined by (2^winBinNbits)*winAnchorDistNbins=589824
Created thread # 1
Created thread # 2
Thread #1 end of input stream, nextChar=-1
Created thread # 3
Created thread # 4
Completed: thread #2
Completed: thread #3
Completed: thread #4
Created thread # 5
Completed: thread #5
Created thread # 6
Completed: thread #6

Any thoughts?

Thank you again for your help as well.

ThaliaChaiZhang commented 11 months ago

Hi, we were able to resolve the issue by regenerating the genome index from the GTF files.
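
The thread does not show the command used to regenerate the index. A minimal sketch, assuming the same inputs and `--sjdbOverhang 149` as the genomeGenerate call recorded in the Log.out above, might build the command like this. The helper name `genome_generate_cmd`, the output directory, and the file paths are placeholders, not the poster's actual values.

```python
import shlex
from typing import List


def genome_generate_cmd(genome_dir: str, fasta: str, gtf: str,
                        overhang: int = 149, threads: int = 16) -> List[str]:
    """Build a STAR genomeGenerate command line as an argument list.

    The flags mirror the generation command shown in the Log.out of this
    thread; every path passed in is a placeholder chosen by the caller.
    """
    return [
        "STAR",
        "--runMode", "genomeGenerate",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--genomeFastaFiles", fasta,
        "--sjdbGTFfile", gtf,
        "--sjdbOverhang", str(overhang),
    ]


# Placeholder paths; print the command for inspection rather than running it.
cmd = genome_generate_cmd("./GRCm38_reindexed", "genome_chr.fa", "genes_chr.gtf")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`, and the mapping step would then point `--genomeDir` at the new directory.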