alexdobin / STAR

RNA-seq aligner
MIT License

De novo RNA seq assembly #505

Closed AlveenaZulfiqar closed 5 years ago

AlveenaZulfiqar commented 6 years ago

Hi, I am trying to align reads to a genome generated by STAR, but I am getting the following Log.progress output:

Time             Speed     Read    Read  Mapped  Mapped  Mapped  Mapped  Unmapped  Unmapped  Unmapped  Unmapped
                  M/hr   number  length  unique  length  MMrate   multi    multi+        MM     short     other
Oct 15 15:01:08    4.3    95413     263   22.4%   255.4    0.6%   53.6%      0.0%      0.0%     24.0%      0.0%
Oct 15 15:02:29    4.3   192067     256   22.5%   247.4    0.8%   55.0%      0.0%      0.0%     22.5%      0.0%
Oct 15 15:03:47    4.3   286663     259   22.5%   250.6    0.7%   54.5%      0.0%      0.0%     23.1%      0.0%
Oct 15 15:05:05    4.3   381456     260   22.5%   251.8    0.7%   54.2%      0.0%      0.0%     23.3%      0.0%
Oct 15 15:06:22    4.4   476298     261   22.5%   252.6    0.7%   54.0%      0.0%      0.0%     23.5%      0.0%
Oct 15 15:07:39    4.4   570877     262   22.5%   253.4    0.6%   53.9%      0.0%      0.0%     23.6%      0.0%
Oct 15 15:08:55    4.4   665459     262   22.5%   254.0    0.6%   53.8%      0.0%      0.0%     23.7%      0.0%
Oct 15 15:10:11    4.4   760240     262   22.4%   254.2    0.6%   53.8%      0.0%      0.0%     23.8%      0.0%
Oct 15 15:11:30    4.4   856774     261   22.5%   252.2    0.7%   54.0%      0.0%      0.0%     23.5%      0.0%
Oct 15 15:12:45    4.4   951557     261   22.5%   252.6    0.7%   53.9%      0.0%      0.0%     23.6%      0.0%
Oct 15 15:14:21    4.3  1046624     261   22.5%   252.7    0.7%   53.9%      0.0%      0.0%     23.6%      0.0%
Oct 15 15:16:23    4.1  1141782     261   22.5%   252.5    0.7%   53.9%      0.0%      0.0%     23.6%      0.0%
Oct 15 15:18:02    4.1  1236107     261   22.5%   252.9    0.7%   53.8%      0.0%      0.0%     23.7%      0.0%
Oct 15 15:19:18    4.1  1330660     261   22.5%   253.1    0.7%   53.8%      0.0%      0.0%     23.7%      0.0%
Oct 15 15:20:32    4.1  1425110     262   22.5%   253.4    0.6%   53.7%      0.0%      0.0%     23.8%      0.0%
Oct 15 15:21:50    4.1  1519502     262   22.5%   253.5    0.6%   53.7%      0.0%      0.0%     23.8%      0.0%
Oct 15 15:23:06    4.2  1613837     262   22.4%   253.8    0.6%   53.7%      0.0%      0.0%     23.9%      0.0%
Oct 15 15:24:22    4.2  1708200     262   22.4%   253.9    0.6%   53.7%      0.0%      0.0%     23.9%      0.0%

Uniquely mapped reads are far fewer than multi-mapped reads (about 22% versus about 54%). Can you guide me on how to adjust the parameters to get more uniquely mapped reads? I used the following command:

./STAR --runMode alignReads --outSAMtype BAM SortedByCoordinate --readFilesCommand zcat --genomeDir ../../genome_output --outFileNamePrefix 1S --readFilesIn ../../1_1_P.fq.gz ../../1_2_P.fq.gz --outFilterMultimapNmax 50

Thanks

alexdobin commented 6 years ago

Hi @AlveenaZulfiqar

the multimappers are not artifacts of mapping: these read sequences map equally well to several places in your assembly. This could indicate problems with the assembly (for example, redundant or duplicated contigs), but it may also reflect biological reality for the species/samples.

Cheers, Alex