sr320 / ceabigr

Workshop on genomic data integration with an emphasis on epigenetic data (FHL 2022)

Provide hard filter values for RNAseq SNPs #62

Closed: sr320 closed this issue 2 years ago

sr320 commented 2 years ago

I have updated the repo by redoing the analysis up to the hard filter step. Standing by for a recommendation on filtering. https://github.com/sr320/ceabigr/blob/main/code/11-RNAseq-snps.Rmd

ksil91 commented 2 years ago

Nice reference on what some of these filters are:
https://gatk.broadinstitute.org/hc/en-us/articles/360035890471-Hard-filtering-germline-short-variants

Ideally, you would plot the distribution of read depth across sites prior to filtering to estimate read depth parameters, but these should be fine unless you end up with very few SNPs passing filter.
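As a rough illustration of the depth check described above, here is a minimal Python sketch that pulls the site-level `DP` value out of the INFO column of an uncompressed VCF and summarizes its distribution. The function names `site_depths` and `depth_summary` and the toy records are hypothetical, and it assumes `DP` is present in INFO as an integer; it is not taken from the repo's Rmd.

```python
import statistics

def site_depths(vcf_lines):
    """Yield the INFO/DP value of every VCF record that carries one."""
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        for entry in fields[7].split(";"):
            if entry.startswith("DP="):
                yield int(entry[3:])
                break

def depth_summary(depths):
    """Return count, 5th percentile, median, and 95th percentile of depths."""
    depths = sorted(depths)
    # 19 cut points for n=20; the first is the 5th percentile, the last the 95th
    cuts = statistics.quantiles(depths, n=20, method="inclusive")
    return {
        "n": len(depths),
        "p05": cuts[0],
        "median": statistics.median(depths),
        "p95": cuts[-1],
    }

# Toy records for illustration only (not real data from this analysis)
toy_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\t.\tDP=12;AF=0.5",
    "chr1\t200\t.\tC\tT\t90\t.\tAF=0.25;DP=180",
    "chr1\t300\t.\tG\tA\t35\t.\tDP=7;AF=0.1",
]
summary = depth_summary(site_depths(toy_vcf))
```

With a real file, `site_depths(open(path))` could feed the same summary, and the percentiles would give a starting point for choosing the lower and upper `DP` cutoffs.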

Changed the read depth (DP) filters from DP < 15 and DP > 150 to DP < 10 and DP > 200.

Changed the minimum allele frequency (AF) filter from AF < 0.3 to AF < 0.05 (this is the most important change).

--filter-name "FS" \
--filter "FS > 60.0" \
--filter-name "QD" \
--filter "QD < 2.0" \
--filter-name "QUAL30" \
--filter "QUAL < 30.0" \
--filter-name "SOR3" \
--filter "SOR > 3.0" \
--filter-name "DP10" \
--filter "DP < 10" \
--filter-name "DP200" \
--filter "DP > 200" \
--filter-name "AF05" \
--filter "AF < 0.05"
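To make the direction of each threshold explicit: a site is flagged with a filter name when its expression is true, so `DP < 10` flags low-depth sites rather than keeping them. The Python sketch below mirrors the same thresholds for illustration. The table `HARD_FILTERS` and the function `failed_filters` are hypothetical names, and treating a missing annotation as not failing is an assumption of this sketch, not a statement about how GATK evaluates its expressions.

```python
import operator

# (filter name, annotation, comparison, threshold), copied from the flags above
HARD_FILTERS = [
    ("FS", "FS", operator.gt, 60.0),
    ("QD", "QD", operator.lt, 2.0),
    ("QUAL30", "QUAL", operator.lt, 30.0),
    ("SOR3", "SOR", operator.gt, 3.0),
    ("DP10", "DP", operator.lt, 10),
    ("DP200", "DP", operator.gt, 200),
    ("AF05", "AF", operator.lt, 0.05),
]

def failed_filters(site):
    """Return the names of the filters whose expression is true for this site."""
    failed = []
    for name, key, compare, threshold in HARD_FILTERS:
        value = site.get(key)
        # Assumption: a missing annotation does not trigger the filter
        if value is not None and compare(value, threshold):
            failed.append(name)
    return failed

# Hypothetical sites for illustration only
passing_site = {"FS": 1.2, "QD": 18.0, "QUAL": 250.0, "SOR": 0.8, "DP": 45, "AF": 0.5}
shallow_rare_site = {"FS": 1.2, "QD": 18.0, "QUAL": 250.0, "SOR": 0.8, "DP": 5, "AF": 0.01}

passing_flags = failed_filters(passing_site)
shallow_flags = failed_filters(shallow_rare_site)
```

A site with an empty list would pass all seven filters; any non-empty list names the filters that would be written to its FILTER field.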