alexdobin / STAR

RNA-seq aligner
MIT License
1.82k stars 503 forks

Chimeric.out.junction missed spanning reads #749

Open helloeoh001 opened 4 years ago

helloeoh001 commented 4 years ago

Hello Dr. Dobin,

I am optimizing the options of STAR to detect fusion genes, and have two questions.

  1. Unmapped reads in 'Chimeric.out.junction': Reads that STAR reports as unmapped still appear in the Chimeric.out.junction file. How were these unmapped reads mapped?

  2. --alignMatesGapMax and spanning reads supporting gene fusion: Paired reads that map perfectly to two different genes on the same chromosome are missing from the Chimeric.out.junction file if the gap between the mates is within the value of --alignMatesGapMax. What would be a good value for this option if the goal of the RNA analysis is to find fusion genes rather than to map as many reads as possible?

alexdobin commented 4 years ago

Hi @helloeoh001

  1. Reads are called "mapped" if they map in a non-chimeric way and pass certain filters, such as the minimum mapped length. The --outFilter* options control this filtering.

  2. At the mapping step, STAR does not check whether the mapped reads belong to genes, so it cannot make the decision about such fusions. Note that this is also true for intra-mate junctions that connect consecutive genes: such junctions are considered novel, but not chimeric, and are output into SJ.out.tab.
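Since such junctions end up in SJ.out.tab rather than Chimeric.out.junction, one way to look for them is to pull the unannotated junctions out of that file. The snippet below is only a rough sketch, not part of STAR: it assumes the tab-separated column layout described in the STAR manual (chromosome, intron start, intron end, strand, motif, annotated flag, uniquely mapping read count, ...), and the names `Junction`, `parse_sj_line`, `novel_junctions` and the `min_unique_reads` threshold are made up for illustration.

```python
from typing import Iterable, List, NamedTuple


class Junction(NamedTuple):
    chrom: str
    start: int          # first base of the intron, 1-based
    end: int            # last base of the intron, 1-based
    annotated: bool     # column 6: 1 = annotated, 0 = novel
    unique_reads: int   # column 7: uniquely mapping reads crossing the junction


def parse_sj_line(line: str) -> Junction:
    """Parse one tab-separated line of SJ.out.tab (assumed column layout)."""
    f = line.rstrip("\n").split("\t")
    return Junction(f[0], int(f[1]), int(f[2]), f[5] == "1", int(f[6]))


def novel_junctions(lines: Iterable[str], min_unique_reads: int = 1) -> List[Junction]:
    """Keep unannotated junctions with enough uniquely mapping reads.

    min_unique_reads is a hypothetical threshold, not a STAR parameter.
    """
    kept = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        junction = parse_sj_line(line)
        if not junction.annotated and junction.unique_reads >= min_unique_reads:
            kept.append(junction)
    return kept


# Two made-up lines: one annotated junction, one long novel junction.
example_lines = [
    "chr1\t1000\t2000\t1\t1\t1\t25\t0\t40\n",
    "chr1\t5000\t90000\t1\t1\t0\t3\t1\t30\n",
]
candidates = novel_junctions(example_lines)
```

With real data, `novel_junctions(open("SJ.out.tab"))` would return the candidates, which could then be compared against a gene annotation to see whether the two ends of a junction fall in different genes.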

In principle, these checks could be done by STAR after mapping - I will put it on my TODO list.
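Until such a check exists in STAR, it could be approximated outside of STAR after mapping. The following is a minimal sketch under stated assumptions, not STAR's method: gene intervals and mate alignments are given as simple 1-based inclusive coordinates, and `Gene`, `Mate`, `genes_overlapping` and `spans_two_genes` are hypothetical names. It flags properly mapped pairs on the same chromosome whose mates overlap two different genes.

```python
from typing import List, NamedTuple, Set


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


class Mate(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def genes_overlapping(mate: Mate, genes: List[Gene]) -> Set[str]:
    """Return the names of all genes that overlap the mate's alignment."""
    return {
        g.name
        for g in genes
        if g.chrom == mate.chrom and mate.start <= g.end and g.start <= mate.end
    }


def spans_two_genes(mate1: Mate, mate2: Mate, genes: List[Gene]) -> bool:
    """True if both mates hit genes and share no gene in common.

    Only same-chromosome pairs are considered here, since these are the
    pairs that can be reported as ordinary (non-chimeric) alignments.
    """
    if mate1.chrom != mate2.chrom:
        return False
    hits1 = genes_overlapping(mate1, genes)
    hits2 = genes_overlapping(mate2, genes)
    return bool(hits1) and bool(hits2) and hits1.isdisjoint(hits2)


# Made-up annotation and alignments for illustration.
genes = [Gene("GENE_A", "chr1", 1000, 5000), Gene("GENE_B", "chr1", 20000, 30000)]
pair_across = (Mate("chr1", 1200, 1300), Mate("chr1", 21000, 21100))
pair_within = (Mate("chr1", 1200, 1300), Mate("chr1", 4000, 4100))
pair_intergenic = (Mate("chr1", 1200, 1300), Mate("chr1", 10000, 10100))
```

A linear scan over all genes is used only to keep the sketch short; for a real annotation an interval index would be the better choice.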

Thanks for the good suggestion! Alex