alexdobin / STAR

RNA-seq aligner
MIT License

unmapped output of the STAR #1463

Open ryrl9703 opened 2 years ago

ryrl9703 commented 2 years ago

Hello, everyone. When I use STAR to align RNA-seq data to the human genome, the unmapped reads output (Unmapped.out.mate1) is always large, and the uniquely mapped rate is about 70%. Is this normal?

STAR --genomeDir $index \
  --genomeLoad NoSharedMemory \
  --runThreadN $4 \
  --sjdbOverhang 99 \
  --sjdbGTFfile $gtf \
  --alignIntronMin 20 \
  --alignSJoverhangMin 8 \
  --alignSJDBoverhangMin 1 \
  --alignIntronMax 1000000 \
  --alignMatesGapMax 1000000 \
  --twopassMode Basic \
  --quantMode GeneCounts \
  --runMode alignReads \
  --readFilesCommand zcat \
  --readFilesIn $2/$id.fastq.gz \
  --outSAMunmapped None \
  --outSAMattributes All \
  --outFilterType BySJout \
  --outReadsUnmapped Fastx \
  --outFilterMultimapNmax 1 \
  --outFilterMismatchNmax 999 \
  --outFileNamePrefix $3/$id \
  --outSAMstrandField intronMotif \
  --outFilterMismatchNoverLmax 0.04 \
  --outSAMtype BAM SortedByCoordinate

alexdobin commented 2 years ago

Hi @ryrl9703

A 70% unique mapping rate is not stellar, but it's not bad either. It depends on many factors, both biological (repeat expression, pseudogenes, gene families, etc.) and technical (library quality, read length, etc.). If you post the entire Log.final.out, I can point out other important quality metrics.

Cheers,
Alex
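
For anyone who wants to pull the mapping metrics out of Log.final.out before posting it, here is a minimal Python sketch. It is not part of STAR. It assumes the usual Log.final.out layout, where each metric line has the form `name | value` and section headers contain no `|`. The function names `parse_log_final` and `percent` are made up for this example.

```python
from pathlib import Path


def parse_log_final(text):
    """Return {metric name: value string} parsed from Log.final.out text.

    Assumption: metric lines look like "<name> |<tab><value>", and
    section headers (e.g. "UNIQUE READS:") contain no "|".
    """
    metrics = {}
    for line in text.splitlines():
        if "|" not in line:
            continue  # skip blank lines and section headers
        name, value = line.split("|", 1)
        metrics[name.strip()] = value.strip()
    return metrics


def percent(metrics, name):
    """Convert a value such as '70.00%' to a float; None if the metric is absent."""
    value = metrics.get(name)
    return float(value.rstrip("%")) if value is not None else None


def load_log_final(path):
    """Read a Log.final.out file from disk and parse it."""
    return parse_log_final(Path(path).read_text())
```

With this, `percent(load_log_final(path), "Uniquely mapped reads %")` gives the unique mapping rate as a number, where `path` points to your own Log.final.out. The "% of reads mapped to multiple loci", "% of reads mapped to too many loci", and "% of reads unmapped: too short" entries can be read the same way to see where the remaining reads went.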