inab / SmartRNASeqCaller

SmartRNASeqCaller is a post-processing pipeline to improve germline variant calling from RNA-Seq data
GNU Lesser General Public License v3.0

How many variants per cell should I expect after classification? #2

Closed ahy1221 closed 5 years ago

ahy1221 commented 5 years ago

After running this pipeline on my own Smart-seq2 data, I got ~10,000 variants per cell. The RF model filtered out only about 1,000 to 2,000 variants. Is that a reasonable number of variants for a single cell? My variant calling pipeline follows the GATK4 RNA-seq best practices.

mbosio85 commented 5 years ago

Hello, I guess you are working with single-cell data. In the tests we ran with bulk RNA, we got approximately 60-80k variants, and about 10k were filtered out. The filters also rely on coverage, which may already reduce the number of variants you get from GATK. I am not familiar with single-cell data, but the percentage of filtered variants seems in line with what we get with bulk RNA.

Cheers,
Mattia
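
As a rough check of the comparison in this thread, the filtered fractions can be worked out from the numbers quoted above. This is only a sketch: it assumes the quoted totals (~10,000 per cell, 60-80k for bulk) are counts before filtering, which the thread does not state explicitly, and the helper names `filtered_fraction_range` and `ranges_overlap` are hypothetical, not part of SmartRNASeqCaller.

```python
def filtered_fraction_range(total_range, filtered_range):
    """Return the (lowest, highest) fraction of variants filtered out.

    Both arguments are (low, high) tuples of variant counts.
    Assumption: totals are counts BEFORE filtering.
    """
    total_lo, total_hi = total_range
    filt_lo, filt_hi = filtered_range
    # Smallest fraction: fewest filtered out of the largest total, and vice versa.
    return filt_lo / total_hi, filt_hi / total_lo


def ranges_overlap(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Numbers quoted in the thread.
single_cell = filtered_fraction_range((10_000, 10_000), (1_000, 2_000))
bulk = filtered_fraction_range((60_000, 80_000), (10_000, 10_000))

print(f"single cell: {single_cell[0]:.1%} to {single_cell[1]:.1%}")
print(f"bulk RNA:    {bulk[0]:.1%} to {bulk[1]:.1%}")
print("ranges overlap:", ranges_overlap(single_cell, bulk))
```

Under that assumption, the single-cell fraction (10% to 20%) and the bulk fraction (12.5% to roughly 16.7%) overlap, which is consistent with the reply that the proportions look similar.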