Closed yz201906 closed 5 years ago
I forgot to post the Log.final.out:
Started job on | Jul 19 19:50:39
Started mapping on | Jul 19 19:51:50
Finished on | Jul 19 19:52:11
Mapping speed, Million of reads per hour | 136.59
Number of input reads | 796759
Average input read length | 77
UNIQUE READS:
Uniquely mapped reads number | 409013
Uniquely mapped reads % | 51.33%
Average mapped length | 69.80
Number of splices: Total | 81889
Number of splices: Annotated (sjdb) | 81352
Number of splices: GT/AG | 61535
Number of splices: GC/AG | 18931
Number of splices: AT/AC | 160
Number of splices: Non-canonical | 1263
Mismatch rate per base, % | 0.60%
Deletion rate per base | 0.00%
Deletion average length | 1.77
Insertion rate per base | 0.02%
Insertion average length | 1.12
MULTI-MAPPING READS:
Number of reads mapped to multiple loci | 347387
% of reads mapped to multiple loci | 43.60%
Number of reads mapped to too many loci | 19388
% of reads mapped to too many loci | 2.43%
UNMAPPED READS:
Number of reads unmapped: too many mismatches | 7948
% of reads unmapped: too many mismatches | 1.00%
Number of reads unmapped: too short | 10742
% of reads unmapped: too short | 1.35%
Number of reads unmapped: other | 2281
% of reads unmapped: other | 0.29%
CHIMERIC READS:
Number of chimeric reads | 0
% of chimeric reads | 0.00%
Also here is the flagstats before filtering out improperly paired reads using samtools view:
2800788 + 0 in total (QC-passed reads + QC-failed reads)
1424653 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
2800788 + 0 mapped (100.00% : N/A)
1376135 + 0 paired in sequencing
638157 + 0 read1
737978 + 0 read2
1239470 + 0 properly paired (90.07% : N/A)
1239470 + 0 with itself and mate mapped
136665 + 0 singletons (9.93% : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
and after:
2420388 + 0 in total (QC-passed reads + QC-failed reads)
1180918 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
2420388 + 0 mapped (100.00% : N/A)
1239470 + 0 paired in sequencing
619735 + 0 read1
619735 + 0 read2
1239470 + 0 properly paired (100.00% : N/A)
1239470 + 0 with itself and mate mapped
0 + 0 singletons (0.00% : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
Although it is showing 100% properly paired reads, it seems like STAR marks those pairs aligned to the same strand as concordant pairs.
Hi @yz201906
if you use --outFilterScoreMinOverLread 0.3 --outFilterMatchNminOverLread 0.3 (or any values that are <0.5), STAR will output single-end alignments, which may result in the output of two discordant alignments for each mate. The parameters above control the min mapped length/score normalized to the total read length (sum of two mates), so to prevent one-mate alignments you need to set them >0.5 . Note, that it may not be enough to prevent one-mate alignments if one of the mates is trimmed too much. To make sure that no one-mate alignments are output, use --outFilterMatchNmin <UntrimmedMateLength+1>.
Cheers Alex
Ah, I see, thanks for the clarification.
I have a set of trimmed PE reads (originally 2 by 75) aligned to mm10 reference genome. Command used:
STAR --runThreadN 12 --outMultimapperOrder Random --outFilterScoreMinOverLread 0.3 --outFilterMatchNminOverLread 0.3 --outFilterMismatchNoverLmax 0.05 --limitBAMsortRAM 1006613967 --peOverlapNbasesMin 50 --genomeDir $genome_path --readFilesIn $R1 $R2 --outFileNamePrefix ./$sample/ --outSAMtype BAM SortedByCoordinate
As I understand,STAR by default should only output concordant alignments, however, when I convert the output bam to bedpe using bamtobed, I see a lot of this:These pairs are definitely discordant. I understand the fields are flagged with 3,0,1 etc not 255, but not every tool utilizes that field. Is this behavior intended? We rely on bedtools genomecov to visualize our data as this seems to be the only tool that we can easily calculate base coverage of only 3'-end of every alignment. And unfortunately, it also doesn't care about the flags. Is there a way to completely not output these alignments to the bam?