Closed ThaliaChaiZhang closed 11 months ago
Hi @ThaliaChaiZhang
There is nothing suspicious in the Log.out file. If you pre-processed the files before mapping, I would recommend mapping the unprocessed files, in case there is an issue with the pre-processing.
Hi Alex, I tried mapping the unprocessed files, and so far I seem to be getting the exact same error. A copy of the log is provided below:
STAR version=2.7.10b
STAR compilation time,server,dir=2022-11-01T09:53:26-04:00 :/home/dobin/data/STAR/STARcode/STAR.master/source
STAR git: On branch master ; commit c6f8efc2c7043ef83bf8b0d9bed36bbb6b9b1133 ; diff files: CHANGES.md
##### Command Line:
STAR --genomeDir /home/tchai/tools/star_index/GRCm38_chrFasta_wGTF --runThreadN 16 --readFilesCommand zcat --outFilterScoreMinOverLread 0.25 --outFilterMatchNminOverLread 0.25 --outFilterMultimapNmax 5 --readFilesIn /home/tchai/tools/joji_vessel_rnaseq/01_raw_fq/EC1-1_R1_001.fastq.gz /home/tchai/tools/joji_vessel_rnaseq/01_raw_fq/EC1-1_R2_001.fastq.gz --outFileNamePrefix /home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/EC1-1_RefAneTher_rnaseq_joji_R1_dedu --outSAMtype BAM SortedByCoordinate
##### Initial USER parameters from Command Line:
outFileNamePrefix /home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/EC1-1_RefAneTher_rnaseq_joji_R1_dedu
###### All USER parameters from Command Line:
genomeDir /home/tchai/tools/star_index/GRCm38_chrFasta_wGTF ~RE-DEFINED
runThreadN 16 ~RE-DEFINED
readFilesCommand zcat ~RE-DEFINED
outFilterScoreMinOverLread 0.25 ~RE-DEFINED
outFilterMatchNminOverLread 0.25 ~RE-DEFINED
outFilterMultimapNmax 5 ~RE-DEFINED
readFilesIn /home/tchai/tools/joji_vessel_rnaseq/01_raw_fq/EC1-1_R1_001.fastq.gz /home/tchai/tools/joji_vessel_rnaseq/01_raw_fq/EC1-1_R2_001.fastq.gz ~RE-DEFINED
outFileNamePrefix /home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/EC1-1_RefAneTher_rnaseq_joji_R1_dedu ~RE-DEFINED
outSAMtype BAM SortedByCoordinate ~RE-DEFINED
##### Finished reading parameters from all sources
##### Final user re-defined parameters-----------------:
runThreadN 16
genomeDir /home/tchai/tools/star_index/GRCm38_chrFasta_wGTF
readFilesIn /home/tchai/tools/joji_vessel_rnaseq/01_raw_fq/EC1-1_R1_001.fastq.gz /home/tchai/tools/joji_vessel_rnaseq/01_raw_fq/EC1-1_R2_001.fastq.gz
readFilesCommand zcat
outFileNamePrefix /home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/EC1-1_RefAneTher_rnaseq_joji_R1_dedu
outSAMtype BAM SortedByCoordinate
outFilterMultimapNmax 5
outFilterScoreMinOverLread 0.25
outFilterMatchNminOverLread 0.25
-------------------------------
##### Final effective command line:
STAR --runThreadN 16 --genomeDir /home/tchai/tools/star_index/GRCm38_chrFasta_wGTF --readFilesIn /home/tchai/tools/joji_vessel_rnaseq/01_raw_fq/EC1-1_R1_001.fastq.gz /home/tchai/tools/joji_vessel_rnaseq/01_raw_fq/EC1-1_R2_001.fastq.gz --readFilesCommand zcat --outFileNamePrefix /home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/EC1-1_RefAneTher_rnaseq_joji_R1_dedu --outSAMtype BAM SortedByCoordinate --outFilterMultimapNmax 5 --outFilterScoreMinOverLread 0.25 --outFilterMatchNminOverLread 0.25
----------------------------------------
Number of fastq files for each mate = 1
Input read files for mate 1 :
-rw-r--r-- 1 tchai tchai 3540342657 Feb 25 2021 /home/tchai/tools/joji_vessel_rnaseq/01_raw_fq/EC1-1_R1_001.fastq.gz
readsCommandsFile:
exec > "/home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/EC1-1_RefAneTher_rnaseq_joji_R1_dedu_STARtmp/tmp.fifo.read1"
echo FILE 0
zcat "/home/tchai/tools/joji_vessel_rnaseq/01_raw_fq/EC1-1_R1_001.fastq.gz"
Input read files for mate 2 :
-rw-r--r-- 1 tchai tchai 3535271982 Feb 25 2021 /home/tchai/tools/joji_vessel_rnaseq/01_raw_fq/EC1-1_R2_001.fastq.gz
readsCommandsFile:
exec > "/home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/EC1-1_RefAneTher_rnaseq_joji_R1_dedu_STARtmp/tmp.fifo.read2"
echo FILE 0
zcat "/home/tchai/tools/joji_vessel_rnaseq/01_raw_fq/EC1-1_R2_001.fastq.gz"
ParametersSolo: --soloCellFilterType CellRanger2.2 filtering parameters: 3000 0.99 10
WARNING: --limitBAMsortRAM=0, will use genome size as RAM limit for BAM sorting
Finished loading and checking parameters
Reading genome generation parameters:
### STAR --runMode genomeGenerate --runThreadN 39 --genomeDir ./GRCm38_chrFasta_wGTF --genomeFastaFiles /home/mbassal/genomes/rna/cellranger_references/refdata-cellranger-mm10-3.0.0/fasta/genome_chr.fa --sjdbGTFfile /home/mbassal/genomes/rna/cellranger_references/refdata-cellranger-mm10-3.0.0/genes/genes_chr.gtf --sjdbOverhang 149
### GstrandBit=32
versionGenome 2.7.4a ~RE-DEFINED
genomeType Full ~RE-DEFINED
genomeFastaFiles /home/mbassal/genomes/rna/cellranger_references/refdata-cellranger-mm10-3.0.0/fasta/genome_chr.fa ~RE-DEFINED
genomeSAindexNbases 14 ~RE-DEFINED
genomeChrBinNbits 18 ~RE-DEFINED
genomeSAsparseD 1 ~RE-DEFINED
genomeTransformType None ~RE-DEFINED
genomeTransformVCF - ~RE-DEFINED
sjdbOverhang 149 ~RE-DEFINED
sjdbFileChrStartEnd - ~RE-DEFINED
sjdbGTFfile /home/mbassal/genomes/rna/cellranger_references/refdata-cellranger-mm10-3.0.0/genes/genes_chr.gtf ~RE-DEFINED
sjdbGTFchrPrefix - ~RE-DEFINED
sjdbGTFfeatureExon exon ~RE-DEFINED
sjdbGTFtagExonParentTranscripttranscript_id ~RE-DEFINED
sjdbGTFtagExonParentGene gene_id ~RE-DEFINED
sjdbInsertSave Basic ~RE-DEFINED
genomeFileSizes 2821362240 22544256458 ~RE-DEFINED
Genome version is compatible with current STAR
Number of real (reference) chromosomes= 66
1 chr1 195471971 0
2 chr10 130694993 195559424
3 chr11 122082543 326369280
4 chr12 120129022 448528384
5 chr13 120421639 568852480
6 chr14 124902244 689438720
7 chr15 104043685 814481408
8 chr16 98207768 918552576
9 chr17 94987271 1016856576
10 chr18 90702639 1112014848
11 chr19 61431566 1202978816
12 chr2 182113224 1264582656
13 chr3 160039680 1446772736
14 chr4 156508116 1606942720
15 chr5 151834684 1763704832
16 chr6 149736546 1915748352
17 chr7 145441459 2065694720
18 chr8 129401213 2211184640
19 chr9 124595110 2340683776
20 chrMT 16299 2465464320
21 chrX 171031299 2465726464
22 chrY 91744698 2636906496
23 chrJH584299.1 953012 2728656896
24 chrGL456233.1 336933 2729705472
25 chrJH584301.1 259875 2730229760
26 chrGL456211.1 241735 2730491904
27 chrGL456350.1 227966 2730754048
28 chrJH584293.1 207968 2731016192
29 chrGL456221.1 206961 2731278336
30 chrJH584297.1 205776 2731540480
31 chrJH584296.1 199368 2731802624
32 chrGL456354.1 195993 2732064768
33 chrJH584294.1 191905 2732326912
34 chrJH584298.1 184189 2732589056
35 chrJH584300.1 182347 2732851200
36 chrGL456219.1 175968 2733113344
37 chrGL456210.1 169725 2733375488
38 chrJH584303.1 158099 2733637632
39 chrJH584302.1 155838 2733899776
40 chrGL456212.1 153618 2734161920
41 chrJH584304.1 114452 2734424064
42 chrGL456379.1 72385 2734686208
43 chrGL456216.1 66673 2734948352
44 chrGL456393.1 55711 2735210496
45 chrGL456366.1 47073 2735472640
46 chrGL456367.1 42057 2735734784
47 chrGL456239.1 40056 2735996928
48 chrGL456213.1 39340 2736259072
49 chrGL456383.1 38659 2736521216
50 chrGL456385.1 35240 2736783360
51 chrGL456360.1 31704 2737045504
52 chrGL456378.1 31602 2737307648
53 chrGL456389.1 28772 2737569792
54 chrGL456372.1 28664 2737831936
55 chrGL456370.1 26764 2738094080
56 chrGL456381.1 25871 2738356224
57 chrGL456387.1 24685 2738618368
58 chrGL456390.1 24668 2738880512
59 chrGL456394.1 24323 2739142656
60 chrGL456392.1 23629 2739404800
61 chrGL456382.1 23158 2739666944
62 chrGL456359.1 22974 2739929088
63 chrGL456396.1 21240 2740191232
64 chrGL456368.1 20208 2740453376
65 chrJH584292.1 14945 2740715520
66 chrJH584295.1 1976 2740977664
--sjdbOverhang = 149 taken from the generated genome
Started loading the genome: Tue Sep 12 16:14:02 2023
Genome: size given as a parameter = 2821362240
SA: size given as a parameter = 22544256458
SAindex: size given as a parameter = 1
Read from SAindex: pGe.gSAindexNbases=14 nSAi=357913940
nGenome=2821362240; nSAbyte=22544256458
GstrandBit=32 SA number of indices=5465274292
Shared memory is not used for genomes. Allocated a private copy of the genome.
Genome file size: 2821362240 bytes; state: good=1 eof=0 fail=0 bad=0
Loading Genome ... done! state: good=1 eof=0 fail=0 bad=0; loaded 2821362240 bytes
SA file size: 22544256458 bytes; state: good=1 eof=0 fail=0 bad=0
Loading SA ... done! state: good=1 eof=0 fail=0 bad=0; loaded 22544256458 bytes
Loading SAindex ... done: 1565873619 bytes
Finished loading the genome: Tue Sep 12 16:16:06 2023
Processing splice junctions database sjdbN=267968, pGe.sjdbOverhang=149
alignIntronMax=alignMatesGapMax=0, the max intron size will be approximately determined by (2^winBinNbits)*winAnchorDistNbins=589824
Created thread # 1
Created thread # 2
Created thread # 3
Created thread # 4
Created thread # 5
Created thread # 6
Created thread # 7
Created thread # 8
Created thread # 9
Created thread # 10
Created thread # 11
Created thread # 12
Created thread # 13
Created thread # 14
Created thread # 15
Starting to map file # 0
mate 1: /home/tchai/tools/joji_vessel_rnaseq/01_raw_fq/EC1-1_R1_001.fastq.gz
mate 2: /home/tchai/tools/joji_vessel_rnaseq/01_raw_fq/EC1-1_R2_001.fastq.gz
Do you have any other suggestions for what could be the issue? Thank you again for your assistance; it is very much appreciated.
Hi @ThaliaChaiZhang
The next step is to figure out which read causes the segfault. You can do a binary search through all reads with --readMapNumber, which specifies the number of reads to map.
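The binary search suggested above could be sketched roughly as follows. This is only an illustration, not a script from this thread: `first_failing_count` and `star_crashes` are hypothetical helper names, the paths are placeholders, and the search assumes that once the first n reads include the offending read, every larger n crashes as well.

```python
import subprocess
from typing import Callable


def first_failing_count(total_reads: int, crashes: Callable[[int], bool]) -> int:
    """Return the smallest n in 1..total_reads for which crashes(n) is True.

    Assumes crashes(total_reads) is True and that crashes is monotonic:
    if mapping the first n reads crashes, mapping more reads crashes too.
    """
    lo, hi = 1, total_reads
    while lo < hi:
        mid = (lo + hi) // 2
        if crashes(mid):
            hi = mid        # the offending read is among the first `mid` reads
        else:
            lo = mid + 1    # the first `mid` reads map cleanly
    return lo


def star_crashes(n_reads: int) -> bool:
    """Hypothetical helper: map only the first n_reads reads and report a crash."""
    cmd = [
        "STAR",
        "--genomeDir", "/path/to/star_index",            # placeholder path
        "--readFilesIn", "R1.fastq.gz", "R2.fastq.gz",   # placeholder file names
        "--readFilesCommand", "zcat",
        "--readMapNumber", str(n_reads),
        "--outFileNamePrefix", f"bisect_{n_reads}_",
    ]
    # On POSIX, a negative return code means the process was killed by a
    # signal (a segfault shows up as -11, i.e. SIGSEGV).
    return subprocess.run(cmd).returncode < 0
```

Calling `first_failing_count(total_reads, star_crashes)` would then return the 1-based position of the first read whose inclusion triggers the crash, at the cost of roughly log2(total_reads) STAR runs.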
Hi Alex,
I've now identified the reads causing issues in several of the raw files, although there don't seem to be any identifiable issues in the files themselves, as far as I can see in FastQC. Any suggestions on how to proceed from here?
Hi @ThaliaChaiZhang
What happens if you extract just one read causing trouble and try to map it?
Hi Alex,
I don't see a parameter under the input-parameters in the manual to set a specific read to map; is there one I'm missing or should I try to extract it to a separate file and map that?
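Extracting the read to a separate file is one way to do this. A minimal sketch, assuming standard 4-line FASTQ records and a known 1-based read position (for example from the binary search with --readMapNumber); `extract_read` is a hypothetical helper name, not part of STAR:

```python
import gzip
from itertools import islice


def extract_read(fastq_path: str, read_index: int, out_path: str) -> None:
    """Copy the read_index-th record (1-based) of a FASTQ file to out_path.

    Assumes each record spans exactly four lines (no wrapped sequences).
    Files ending in .gz are read through gzip; the output is plain text.
    """
    opener = gzip.open if fastq_path.endswith(".gz") else open
    start = 4 * (read_index - 1)
    with opener(fastq_path, "rt") as src:
        record = list(islice(src, start, start + 4))
    if len(record) != 4:
        raise ValueError(f"read {read_index} not found in {fastq_path}")
    with open(out_path, "w") as dst:
        dst.writelines(record)
```

For paired-end data the same position would be extracted from both mate files, so that the two output files stay in sync.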
Hi Alex,
I extracted the read manually from the file with vim and ran just that problematic one, which gave a segfault as well.
STAR version=2.7.11a
STAR compilation time,server,dir=2023-09-21T21:18:42-04:00 :/home/tchai/Tools/STAR-2.7.11a/source
STAR git:
##### Command Line:
STAR --genomeDir /home/tchai/tools/star_index/GRCm38_chrFasta_wGTF --runThreadN 16 --outFilterScoreMinOverLread 0.25 --outFilterMatchNminOverLread 0.25 --outFilterMultimapNmax 5 --readFilesIn /home/tchai/tools/joji_vessel_rnaseq/TestFiles/test_R1.fastq /home/tchai/tools/joji_vessel_rnaseq/TestFiles/test_R2.fastq --outFileNamePrefix /home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/Test_R1_001_dedu --outSAMtype BAM SortedByCoordinate
##### Initial USER parameters from Command Line:
outFileNamePrefix /home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/Test_R1_001_dedu
###### All USER parameters from Command Line:
genomeDir /home/tchai/tools/star_index/GRCm38_chrFasta_wGTF ~RE-DEFINED
runThreadN 16 ~RE-DEFINED
outFilterScoreMinOverLread 0.25 ~RE-DEFINED
outFilterMatchNminOverLread 0.25 ~RE-DEFINED
outFilterMultimapNmax 5 ~RE-DEFINED
readFilesIn /home/tchai/tools/joji_vessel_rnaseq/TestFiles/test_R1.fastq /home/tchai/tools/joji_vessel_rnaseq/TestFiles/test_R2.fastq ~RE-DEFINED
outFileNamePrefix /home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/Test_R1_001_dedu ~RE-DEFINED
outSAMtype BAM SortedByCoordinate ~RE-DEFINED
##### Finished reading parameters from all sources
##### Final user re-defined parameters-----------------:
runThreadN 16
genomeDir /home/tchai/tools/star_index/GRCm38_chrFasta_wGTF
readFilesIn /home/tchai/tools/joji_vessel_rnaseq/TestFiles/test_R1.fastq /home/tchai/tools/joji_vessel_rnaseq/TestFiles/test_R2.fastq
outFileNamePrefix /home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/Test_R1_001_dedu
outSAMtype BAM SortedByCoordinate
outFilterMultimapNmax 5
outFilterScoreMinOverLread 0.25
outFilterMatchNminOverLread 0.25
-------------------------------
##### Final effective command line:
STAR --runThreadN 16 --genomeDir /home/tchai/tools/star_index/GRCm38_chrFasta_wGTF --readFilesIn /home/tchai/tools/joji_vessel_rnaseq/TestFiles/test_R1.fastq /home/tchai/tools/joji_vessel_rnaseq/TestFiles/test_R2.fastq --outFileNamePrefix /home/tchai/tools/joji_vessel_rnaseq/03_aligned_bams/Test_R1_001_dedu --outSAMtype BAM SortedByCoordinate --outFilterMultimapNmax 5 --outFilterScoreMinOverLread 0.25 --outFilterMatchNminOverLread 0.25
----------------------------------------
Number of fastq files for each mate = 1
ParametersSolo: --soloCellFilterType CellRanger2.2 filtering parameters: 3000 0.99 10
WARNING: --limitBAMsortRAM=0, will use genome size as RAM limit for BAM sorting
Finished loading and checking parameters
Reading genome generation parameters:
### STAR --runMode genomeGenerate --runThreadN 39 --genomeDir ./GRCm38_chrFasta_wGTF --genomeFastaFiles /home/mbassal/genomes/rna/cellranger_references/refdata-cellranger-mm10-3.0.0/fasta/genome_chr.fa --sjdbGTFfile /home/mbassal/genomes/rna/cellranger_references/refdata-cellranger-mm10-3.0.0/genes/genes_chr.gtf --sjdbOverhang 149
### GstrandBit=32
versionGenome 2.7.4a ~RE-DEFINED
genomeType Full ~RE-DEFINED
genomeFastaFiles /home/mbassal/genomes/rna/cellranger_references/refdata-cellranger-mm10-3.0.0/fasta/genome_chr.fa ~RE-DEFINED
genomeSAindexNbases 14 ~RE-DEFINED
genomeChrBinNbits 18 ~RE-DEFINED
genomeSAsparseD 1 ~RE-DEFINED
genomeTransformType None ~RE-DEFINED
genomeTransformVCF - ~RE-DEFINED
sjdbOverhang 149 ~RE-DEFINED
sjdbFileChrStartEnd - ~RE-DEFINED
sjdbGTFfile /home/mbassal/genomes/rna/cellranger_references/refdata-cellranger-mm10-3.0.0/genes/genes_chr.gtf ~RE-DEFINED
sjdbGTFchrPrefix - ~RE-DEFINED
sjdbGTFfeatureExon exon ~RE-DEFINED
sjdbGTFtagExonParentTranscripttranscript_id ~RE-DEFINED
sjdbGTFtagExonParentGene gene_id ~RE-DEFINED
sjdbInsertSave Basic ~RE-DEFINED
genomeFileSizes 2821362240 22544256458 ~RE-DEFINED
Genome version is compatible with current STAR
Number of real (reference) chromosomes= 66
1 chr1 195471971 0
2 chr10 130694993 195559424
3 chr11 122082543 326369280
4 chr12 120129022 448528384
5 chr13 120421639 568852480
6 chr14 124902244 689438720
7 chr15 104043685 814481408
8 chr16 98207768 918552576
9 chr17 94987271 1016856576
10 chr18 90702639 1112014848
11 chr19 61431566 1202978816
12 chr2 182113224 1264582656
13 chr3 160039680 1446772736
14 chr4 156508116 1606942720
15 chr5 151834684 1763704832
16 chr6 149736546 1915748352
17 chr7 145441459 2065694720
18 chr8 129401213 2211184640
19 chr9 124595110 2340683776
20 chrMT 16299 2465464320
21 chrX 171031299 2465726464
22 chrY 91744698 2636906496
23 chrJH584299.1 953012 2728656896
24 chrGL456233.1 336933 2729705472
25 chrJH584301.1 259875 2730229760
26 chrGL456211.1 241735 2730491904
27 chrGL456350.1 227966 2730754048
28 chrJH584293.1 207968 2731016192
29 chrGL456221.1 206961 2731278336
30 chrJH584297.1 205776 2731540480
31 chrJH584296.1 199368 2731802624
32 chrGL456354.1 195993 2732064768
33 chrJH584294.1 191905 2732326912
34 chrJH584298.1 184189 2732589056
35 chrJH584300.1 182347 2732851200
36 chrGL456219.1 175968 2733113344
37 chrGL456210.1 169725 2733375488
38 chrJH584303.1 158099 2733637632
39 chrJH584302.1 155838 2733899776
40 chrGL456212.1 153618 2734161920
41 chrJH584304.1 114452 2734424064
42 chrGL456379.1 72385 2734686208
43 chrGL456216.1 66673 2734948352
44 chrGL456393.1 55711 2735210496
45 chrGL456366.1 47073 2735472640
46 chrGL456367.1 42057 2735734784
47 chrGL456239.1 40056 2735996928
48 chrGL456213.1 39340 2736259072
49 chrGL456383.1 38659 2736521216
50 chrGL456385.1 35240 2736783360
51 chrGL456360.1 31704 2737045504
52 chrGL456378.1 31602 2737307648
53 chrGL456389.1 28772 2737569792
54 chrGL456372.1 28664 2737831936
55 chrGL456370.1 26764 2738094080
56 chrGL456381.1 25871 2738356224
57 chrGL456387.1 24685 2738618368
58 chrGL456390.1 24668 2738880512
59 chrGL456394.1 24323 2739142656
60 chrGL456392.1 23629 2739404800
61 chrGL456382.1 23158 2739666944
62 chrGL456359.1 22974 2739929088
63 chrGL456396.1 21240 2740191232
64 chrGL456368.1 20208 2740453376
65 chrJH584292.1 14945 2740715520
66 chrJH584295.1 1976 2740977664
--sjdbOverhang = 149 taken from the generated genome
Started loading the genome: Mon Sep 25 17:25:47 2023
Genome: size given as a parameter = 2821362240
SA: size given as a parameter = 22544256458
SAindex: size given as a parameter = 1
Read from SAindex: pGe.gSAindexNbases=14 nSAi=357913940
nGenome=2821362240; nSAbyte=22544256458
GstrandBit=32 SA number of indices=5465274292
Shared memory is not used for genomes. Allocated a private copy of the genome.
Genome file size: 2821362240 bytes; state: good=1 eof=0 fail=0 bad=0
Loading Genome ... done! state: good=1 eof=0 fail=0 bad=0; loaded 2821362240 bytes
SA file size: 22544256458 bytes; state: good=1 eof=0 fail=0 bad=0
Loading SA ... done! state: good=1 eof=0 fail=0 bad=0; loaded 22544256458 bytes
Loading SAindex ... done: 1565873619 bytes
Finished loading the genome: Mon Sep 25 17:27:20 2023
Processing splice junctions database sjdbN=267968, pGe.sjdbOverhang=149
alignIntronMax=alignMatesGapMax=0, the max intron size will be approximately determined by (2^winBinNbits)*winAnchorDistNbins=589824
Created thread # 1
Created thread # 2
Thread #1 end of input stream, nextChar=-1
Created thread # 3
Created thread # 4
Completed: thread #2
Completed: thread #3
Completed: thread #4
Created thread # 5
Completed: thread #5
Created thread # 6
Completed: thread #6
Any thoughts?
Thank you again for your help as well.
Hi, we were able to resolve the issue by regenerating the genome index from the GTF files.
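The exact command used for the re-indexing is not given in this thread. As a rough sketch, an index could be regenerated with the same options that the log lists under "Reading genome generation parameters"; the helper name `genome_generate_cmd`, the default thread count, and all paths below are placeholders rather than the poster's actual setup:

```python
from typing import List


def genome_generate_cmd(genome_dir: str, fasta: str, gtf: str,
                        overhang: int = 149, threads: int = 16) -> List[str]:
    """Build a STAR genomeGenerate command line as an argument list.

    The options mirror those recorded in the Log.out above; the default
    overhang of 149 is the value shown there for --sjdbOverhang.
    """
    return [
        "STAR",
        "--runMode", "genomeGenerate",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--genomeFastaFiles", fasta,
        "--sjdbGTFfile", gtf,
        "--sjdbOverhang", str(overhang),
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`, ideally with the same STAR binary that will later be used for mapping.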
Hello Alex,
I am experiencing a segfault when mapping FASTQ files, as shown below.
The same error persists even when using the compiled 2.7.11a Linux version, and when trying compiled versions 2.7.10a, 2.7.9a, and 2.7.4a in my environment. ulimit has been set to unlimited, 32 GB of RAM is available, and I have also tried thread counts ranging from 1 to 20.
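As a quick sanity check on the memory figure, the index sizes reported in the Log.out files in this thread can simply be added up. This is plain arithmetic on the logged byte counts and a lower bound only (mapping and BAM sorting need additional memory); it does not by itself point to a cause. `index_footprint_gib` is a hypothetical helper name.

```python
GENOME_BYTES = 2_821_362_240     # "Genome file size" in the log
SA_BYTES = 22_544_256_458        # "SA file size" in the log
SAINDEX_BYTES = 1_565_873_619    # "Loading SAindex ... done" in the log


def index_footprint_gib(*sizes_in_bytes: int) -> float:
    """Sum the given sizes and convert from bytes to GiB (2**30 bytes)."""
    return sum(sizes_in_bytes) / 2**30


footprint = index_footprint_gib(GENOME_BYTES, SA_BYTES, SAINDEX_BYTES)
print(f"index footprint: {footprint:.2f} GiB")
```

The three files come to about 25 GiB, so the loaded index alone fits within 32 GB of RAM, though without a large margin.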
Logfile reads:
Any thoughts? Thanks.