alexdobin / STAR

RNA-seq aligner
MIT License

bacteria genome mapping issue #451

Closed lelelililele closed 5 years ago

lelelililele commented 6 years ago

Dear Alex, I used STAR to map RNA-seq reads to a bacterial genome. When I built the index, I used only a FASTA file, without a GTF file, and set "--genomeSAindexNbases" to 10. When I started the mapping step, it returned "/opt/gridview//pbs/dispatcher/mom_priv/jobs/64349.admin.SC: line 10: 46529 Segmentation fault (core dumped)". My command is "/home/software/STAR-2.5.2a/bin/Linux_x86_64_static/STAR --runMode alignReads --genomeDir /home/zhangll/Anno/STAR_index/Mycobacterium_tuberculosis_H37Rv --outFileNamePrefix $work_dir/starResult/${sample} --outSAMattributes All --outSAMtype BAM SortedByCoordinate --readFilesIn ${work_dir}/trimmomatic/${sample}_1P.fq ${work_dir}/trimmomatic/${sample}_2P.fq --runThreadN 40 --alignIntronMax 1"


Here is my log file:

STAR version=STAR_2.5.2a
STAR compilation time,server,dir=Wed May 25 13:49:21 EDT 2016 florence.cshl.edu:/sonas-hs/gingeras/nlsas_norepl/user/dobin/STAR/STAR.sandbox/source

DEFAULT parameters:

versionSTAR 20201 versionGenome 20101 20200
parametersFiles -
sysShell - runMode alignReads runThreadN 1 runDirPerm User_RWX runRNGseed 777 genomeDir ./GenomeDir/ genomeLoad NoSharedMemory genomeFastaFiles -
genomeSAindexNbases 14 genomeChrBinNbits 18 genomeSAsparseD 1 genomeSuffixLengthMax 18446744073709551615 readFilesIn Read1 Read2
readFilesCommand -
readMatesLengthsIn NotEqual readMapNumber 18446744073709551615 readNameSeparator /
inputBAMfile - bamRemoveDuplicatesType - bamRemoveDuplicatesMate2basesN 0 limitGenomeGenerateRAM 31000000000 limitIObufferSize 150000000 limitOutSAMoneReadBytes 100000 limitOutSJcollapsed 1000000 limitOutSJoneRead 1000 limitBAMsortRAM 0 limitSjdbInsertNsj 1000000 outTmpDir - outTmpKeep None outStd Log outReadsUnmapped None outQSconversionAdd 0 outMultimapperOrder Old_2.4 outSAMtype SAM
outSAMmode Full outSAMstrandField None outSAMattributes Standard
outSAMunmapped None
outSAMorder Paired outSAMprimaryFlag OneBestScore outSAMreadID Standard outSAMmapqUnique 255 outSAMflagOR 0 outSAMflagAND 65535 outSAMattrRGline -
outSAMheaderHD -
outSAMheaderPG -
outSAMheaderCommentFile - outBAMcompression 1 outBAMsortingThreadN 0 outSAMfilter None
outSAMmultNmax 18446744073709551615 outSAMattrIHstart 1 outSJfilterReads All outSJfilterCountUniqueMin 3 1 1 1
outSJfilterCountTotalMin 3 1 1 1
outSJfilterOverhangMin 30 12 12 12
outSJfilterDistToOtherSJmin 10 0 5 10
outSJfilterIntronMaxVsReadN 50000 100000 200000
outWigType None
outWigStrand Stranded
outWigReferencesPrefix - outWigNorm RPM
outFilterType Normal outFilterMultimapNmax 10 outFilterMultimapScoreRange 1 outFilterScoreMin 0 outFilterScoreMinOverLread 0.66 outFilterMatchNmin 0 outFilterMatchNminOverLread 0.66 outFilterMismatchNmax 10 outFilterMismatchNoverLmax 0.3 outFilterMismatchNoverReadLmax 1 outFilterIntronMotifs None clip5pNbases 0
clip3pNbases 0
clip3pAfterAdapterNbases 0
clip3pAdapterSeq -
clip3pAdapterMMp 0.1
winBinNbits 16 winAnchorDistNbins 9 winFlankNbins 4 winAnchorMultimapNmax 50 winReadCoverageRelativeMin 0.5 winReadCoverageBasesMin 0 scoreGap 0 scoreGapNoncan -8 scoreGapGCAG -4 scoreGapATAC -8 scoreStitchSJshift 1 scoreGenomicLengthLog2scale -0.25 scoreDelBase -2 scoreDelOpen -2 scoreInsOpen -2 scoreInsBase -2 seedSearchLmax 0 seedSearchStartLmax 50 seedSearchStartLmaxOverLread 1 seedPerReadNmax 1000 seedPerWindowNmax 50 seedNoneLociPerWindow 10 seedMultimapNmax 10000 alignIntronMin 21 alignIntronMax 0 alignMatesGapMax 0 alignTranscriptsPerReadNmax 10000 alignSJoverhangMin 5 alignSJDBoverhangMin 3 alignSJstitchMismatchNmax 0 -1 0 0
alignSplicedMateMapLmin 0 alignSplicedMateMapLminOverLmate 0.66 alignWindowsPerReadNmax 10000 alignTranscriptsPerWindowNmax 100 alignEndsType Local alignSoftClipAtReferenceEnds Yes alignEndsProtrude 0 ConcordantPair
chimSegmentMin 0 chimScoreMin 0 chimScoreDropMax 20 chimScoreSeparation 10 chimScoreJunctionNonGTAG -1 chimJunctionOverhangMin 20 chimOutType SeparateSAMold chimFilter banGenomicN
chimSegmentReadGapMax 0 sjdbFileChrStartEnd -
sjdbGTFfile - sjdbGTFchrPrefix - sjdbGTFfeatureExon exon sjdbGTFtagExonParentTranscript transcript_id sjdbGTFtagExonParentGene gene_id sjdbOverhang 100 sjdbScore 2 sjdbInsertSave Basic quantMode -
quantTranscriptomeBAMcompression 1 quantTranscriptomeBan IndelSoftclipSingleend twopass1readsN 18446744073709551615 twopassMode None

Command Line:

/home/software/STAR-2.5.2a/bin/Linux_x86_64_static/STAR --runMode alignReads --genomeDir /home/zhangll/Anno/STAR_index/Mycobacterium_tuberculosis_H37Rv --outFileNamePrefix /home/zhangll/Tasks/WanLi/RNA-seq/starResult/BJCDC-R01-1 --outSAMattributes All --outSAMtype BAM SortedByCoordinate --readFilesIn /home/zhangll/Tasks/WanLi/RNA-seq/trimmomatic/BJCDC-R01-1_1P.fq /home/zhangll/Tasks/WanLi/RNA-seq/trimmomatic/BJCDC-R01-1_2P.fq --runThreadN 40 --alignIntronMax 1 --alignIntronMin 1

Initial USER parameters from Command Line:

outFileNamePrefix /home/zhangll/Tasks/WanLi/RNA-seq/starResult/BJCDC-R01-1

All USER parameters from Command Line:

runMode alignReads ~RE-DEFINED
genomeDir /home/zhangll/Anno/STAR_index/Mycobacterium_tuberculosis_H37Rv ~RE-DEFINED
outFileNamePrefix /home/zhangll/Tasks/WanLi/RNA-seq/starResult/BJCDC-R01-1 ~RE-DEFINED
outSAMattributes All ~RE-DEFINED
outSAMtype BAM SortedByCoordinate ~RE-DEFINED
readFilesIn /home/zhangll/Tasks/WanLi/RNA-seq/trimmomatic/BJCDC-R01-1_1P.fq /home/zhangll/Tasks/WanLi/RNA-seq/trimmomatic/BJCDC-R01-1_2P.fq ~RE-DEFINED
runThreadN 40 ~RE-DEFINED
alignIntronMax 1 ~RE-DEFINED
alignIntronMin 1 ~RE-DEFINED

Finished reading parameters from all sources
Final user re-defined parameters-----------------:

runMode alignReads
runThreadN 40
genomeDir /home/zhangll/Anno/STAR_index/Mycobacterium_tuberculosis_H37Rv
readFilesIn /home/zhangll/Tasks/WanLi/RNA-seq/trimmomatic/BJCDC-R01-1_1P.fq /home/zhangll/Tasks/WanLi/RNA-seq/trimmomatic/BJCDC-R01-1_2P.fq
outFileNamePrefix /home/zhangll/Tasks/WanLi/RNA-seq/starResult/BJCDC-R01-1
outSAMtype BAM SortedByCoordinate
outSAMattributes All
alignIntronMin 1
alignIntronMax 1


Final effective command line:

/home/software/STAR-2.5.2a/bin/Linux_x86_64_static/STAR --runMode alignReads --runThreadN 40 --genomeDir /home/zhangll/Anno/STAR_index/Mycobacterium_tuberculosis_H37Rv --readFilesIn /home/zhangll/Tasks/WanLi/RNA-seq/trimmomatic/BJCDC-R01-1_1P.fq /home/zhangll/Tasks/WanLi/RNA-seq/trimmomatic/BJCDC-R01-1_2P.fq --outFileNamePrefix /home/zhangll/Tasks/WanLi/RNA-seq/starResult/BJCDC-R01-1 --outSAMtype BAM SortedByCoordinate --outSAMattributes All --alignIntronMin 1 --alignIntronMax 1

Final parameters after user input--------------------------------:

versionSTAR 20201 versionGenome 20101 20200
parametersFiles -
sysShell - runMode alignReads runThreadN 40 runDirPerm User_RWX runRNGseed 777 genomeDir /home/zhangll/Anno/STAR_index/Mycobacterium_tuberculosis_H37Rv genomeLoad NoSharedMemory genomeFastaFiles -
genomeSAindexNbases 14 genomeChrBinNbits 18 genomeSAsparseD 1 genomeSuffixLengthMax 18446744073709551615 readFilesIn /home/zhangll/Tasks/WanLi/RNA-seq/trimmomatic/BJCDC-R01-1_1P.fq /home/zhangll/Tasks/WanLi/RNA-seq/trimmomatic/BJCDC-R01-1_2P.fq
readFilesCommand -
readMatesLengthsIn NotEqual readMapNumber 18446744073709551615 readNameSeparator /
inputBAMfile - bamRemoveDuplicatesType - bamRemoveDuplicatesMate2basesN 0 limitGenomeGenerateRAM 31000000000 limitIObufferSize 150000000 limitOutSAMoneReadBytes 100000 limitOutSJcollapsed 1000000 limitOutSJoneRead 1000 limitBAMsortRAM 0 limitSjdbInsertNsj 1000000 outFileNamePrefix /home/zhangll/Tasks/WanLi/RNA-seq/starResult/BJCDC-R01-1 outTmpDir - outTmpKeep None outStd Log outReadsUnmapped None outQSconversionAdd 0 outMultimapperOrder Old_2.4 outSAMtype BAM SortedByCoordinate
outSAMmode Full outSAMstrandField None outSAMattributes All
outSAMunmapped None
outSAMorder Paired outSAMprimaryFlag OneBestScore outSAMreadID Standard outSAMmapqUnique 255 outSAMflagOR 0 outSAMflagAND 65535 outSAMattrRGline -
outSAMheaderHD -
outSAMheaderPG -
outSAMheaderCommentFile - outBAMcompression 1 outBAMsortingThreadN 0 outSAMfilter None
outSAMmultNmax 18446744073709551615 outSAMattrIHstart 1 outSJfilterReads All outSJfilterCountUniqueMin 3 1 1 1
outSJfilterCountTotalMin 3 1 1 1
outSJfilterOverhangMin 30 12 12 12
outSJfilterDistToOtherSJmin 10 0 5 10
outSJfilterIntronMaxVsReadN 50000 100000 200000
outWigType None
outWigStrand Stranded
outWigReferencesPrefix - outWigNorm RPM
outFilterType Normal outFilterMultimapNmax 10 outFilterMultimapScoreRange 1 outFilterScoreMin 0 outFilterScoreMinOverLread 0.66 outFilterMatchNmin 0 outFilterMatchNminOverLread 0.66 outFilterMismatchNmax 10 outFilterMismatchNoverLmax 0.3 outFilterMismatchNoverReadLmax 1 outFilterIntronMotifs None clip5pNbases 0
clip3pNbases 0
clip3pAfterAdapterNbases 0
clip3pAdapterSeq -
clip3pAdapterMMp 0.1
winBinNbits 16 winAnchorDistNbins 9 winFlankNbins 4 winAnchorMultimapNmax 50 winReadCoverageRelativeMin 0.5 winReadCoverageBasesMin 0 scoreGap 0 scoreGapNoncan -8 scoreGapGCAG -4 scoreGapATAC -8 scoreStitchSJshift 1 scoreGenomicLengthLog2scale -0.25 scoreDelBase -2 scoreDelOpen -2 scoreInsOpen -2 scoreInsBase -2 seedSearchLmax 0 seedSearchStartLmax 50 seedSearchStartLmaxOverLread 1 seedPerReadNmax 1000 seedPerWindowNmax 50 seedNoneLociPerWindow 10 seedMultimapNmax 10000 alignIntronMin 1 alignIntronMax 1 alignMatesGapMax 0 alignTranscriptsPerReadNmax 10000 alignSJoverhangMin 5 alignSJDBoverhangMin 3 alignSJstitchMismatchNmax 0 -1 0 0
alignSplicedMateMapLmin 0 alignSplicedMateMapLminOverLmate 0.66 alignWindowsPerReadNmax 10000 alignTranscriptsPerWindowNmax 100 alignEndsType Local alignSoftClipAtReferenceEnds Yes alignEndsProtrude 0 ConcordantPair
chimSegmentMin 0 chimScoreMin 0 chimScoreDropMax 20 chimScoreSeparation 10 chimScoreJunctionNonGTAG -1 chimJunctionOverhangMin 20 chimOutType SeparateSAMold chimFilter banGenomicN
chimSegmentReadGapMax 0 sjdbFileChrStartEnd -
sjdbGTFfile - sjdbGTFchrPrefix - sjdbGTFfeatureExon exon sjdbGTFtagExonParentTranscript transcript_id sjdbGTFtagExonParentGene gene_id sjdbOverhang 100 sjdbScore 2 sjdbInsertSave Basic quantMode -
quantTranscriptomeBAMcompression 1 quantTranscriptomeBan IndelSoftclipSingleend twopass1readsN 18446744073709551615 twopassMode None

WARNING: --limitBAMsortRAM=0, will use genome size as RAM limit for BAM sorting
Finished loading and checking parameters
Reading genome generation parameters:
versionGenome 20201 ~RE-DEFINED
genomeFastaFiles /home/zhangll/Anno/STAR_index/Mycobacterium_tuberculosis_H37Rv/Mycobacterium_tuberculosis_H37Rv_genome_v3.fasta ~RE-DEFINED
genomeSAindexNbases 10 ~RE-DEFINED
genomeChrBinNbits 18 ~RE-DEFINED
genomeSAsparseD 1 ~RE-DEFINED
sjdbOverhang 0 ~RE-DEFINED
sjdbFileChrStartEnd - ~RE-DEFINED
sjdbGTFfile - ~RE-DEFINED
sjdbGTFchrPrefix - ~RE-DEFINED
sjdbGTFfeatureExon exon ~RE-DEFINED
sjdbGTFtagExonParentTranscripttranscript_id ~RE-DEFINED
sjdbGTFtagExonParentGene gene_id ~RE-DEFINED
sjdbInsertSave Basic ~RE-DEFINED
Genome version is compatible with current STAR version
Number of real (reference) chromosomes= 1
1 NC_000962.3 4411532 0
Started loading the genome: Thu Jul 5 15:40:05 2018

checking Genome sizefile size: 4456448 bytes; state: good=1 eof=0 fail=0 bad=0
checking SA sizefile size: 36395142 bytes; state: good=1 eof=0 fail=0 bad=0
checking /SAindex sizefile size: 6116787 bytes; state: good=1 eof=0 fail=0 bad=0
Read from SAindex: genomeSAindexNbases=10 nSAi=1398100
nGenome=4456448; nSAbyte=36395142
GstrandBit=32 SA number of indices=8823064
Shared memory is not used for genomes. Allocated a private copy of the genome.
Genome file size: 4456448 bytes; state: good=1 eof=0 fail=0 bad=0
Loading Genome ... done! state: good=1 eof=0 fail=0 bad=0; loaded 4456448 bytes
SA file size: 36395142 bytes; state: good=1 eof=0 fail=0 bad=0
Loading SA ... done! state: good=1 eof=0 fail=0 bad=0; loaded 36395142 bytes
Loading SAindex ... done: 6116787 bytes
Finished loading the genome: Thu Jul 5 15:40:05 2018

To accomodate alignIntronMax=1 redefined winBinNbits=7
To accomodate alignIntronMax=1 and alignMatesGapMax=0, redefined winFlankNbins=1 and winAnchorDistNbins=2
Created thread # 1 Created thread # 2 Created thread # 3 Created thread # 4 Created thread # 5 Created thread # 6 Created thread # 7 Created thread # 8 Created thread # 9 Created thread # 10 Created thread # 11 Created thread # 12 Created thread # 13 Created thread # 14 Created thread # 15 Created thread # 16 Created thread # 17 Created thread # 18 Created thread # 19 Created thread # 20 Created thread # 21 Created thread # 22 Created thread # 23 Created thread # 24 Created thread # 25 Created thread # 26 Created thread # 27 Created thread # 28 Created thread # 29 Created thread # 30 Created thread # 31 Created thread # 32 Created thread # 33 Created thread # 34 Created thread # 35 Created thread # 36 Created thread # 37 Created thread # 38 Created thread # 39


I hope you can help me solve this problem.

Yours sincerely, Zhang Lili

alexdobin commented 6 years ago

Hi Zhang Lili,

Please try reducing --genomeSAindexNbases to an even smaller number, e.g. 8 or even 6. The recommended scaling of this parameter with genome size is only approximate, so a smaller value is sometimes needed.

Cheers Alex
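
For context, the value 10 in the original post matches the rule of thumb given in the STAR manual for small genomes, min(14, log2(GenomeLength)/2 - 1). The sketch below only evaluates that formula for the genome length reported in the log (4411532 bases); the function name `suggested_sa_index_nbases` is made up for illustration and is not part of STAR. As noted above, treat the result as a starting point, since a smaller value may still be needed.

```python
import math


def suggested_sa_index_nbases(genome_length: int, cap: int = 14) -> int:
    """Rule of thumb from the STAR manual: min(14, log2(GenomeLength)/2 - 1).

    Hypothetical helper for illustration only; not part of STAR.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    # Round down so the suggestion never exceeds the formula's value.
    return min(cap, math.floor(math.log2(genome_length) / 2 - 1))


# Genome length of NC_000962.3 as reported in the log above.
print(suggested_sa_index_nbases(4411532))  # prints 10
```

If mapping still crashes with the computed value, stepping down by 2 (here to 8, then 6) follows the advice in this thread.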

apredeus commented 5 years ago

Yep, I just hit the same problem with E. coli and --genomeSAindexNbases 10, and changing it to 8 fixed it.
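
Because --genomeSAindexNbases is a genome-generation parameter (the mapping log reads it back from the index), changing it means rebuilding the index and then rerunning the alignment against the new index. Below is a minimal sketch that assembles such a genomeGenerate command. The paths `STAR_index/ecoli` and `ecoli.fasta`, the thread count, and the helper name are hypothetical, and the command is only printed, not executed.

```python
import shlex


def star_genome_generate_cmd(genome_dir, fasta, sa_index_nbases,
                             threads=1, star="STAR"):
    """Build the argument list for a STAR index rebuild (hypothetical helper)."""
    return [
        star,
        "--runMode", "genomeGenerate",
        "--genomeDir", genome_dir,
        "--genomeFastaFiles", fasta,
        "--genomeSAindexNbases", str(sa_index_nbases),
        "--runThreadN", str(threads),
    ]


# Placeholder paths; substitute your own index directory and FASTA file.
cmd = star_genome_generate_cmd("STAR_index/ecoli", "ecoli.fasta", 8, threads=4)
print(shlex.join(cmd))
```

Passing the resulting list to `subprocess.run(cmd, check=True)` would run it, assuming STAR is on the PATH.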