Open nbartonicek opened 5 years ago
Your suggested steps look good to me. It might be a good idea to try both imputed and non-imputed genotypes. Non-imputed genotypes may work well if UMI counts are large enough, and checking that the two give consistent results would be useful.
Hyun Min Kang, Ph.D. Associate Professor of Biostatistics University of Michigan, Ann Arbor Email : hmkang@umich.edu
On Thu, Aug 29, 2019 at 11:43 PM nbartonicek notifications@github.com wrote:
We are trying to use genotyping arrays (Axiom) to get VCFs for demuxlet.
Are there any recommended steps / tutorials for SNP cleaning and filtering to optimise the demuxlet yield?
We are currently cleaning our VCF for (a) SNP missingness, (b) duplicate SNPs, and (c) SNP strand (flipping) before imputing on the Michigan Imputation Server. We then lift over to hg38 and keep only those variants that overlap exons of protein_coding genes and lncRNAs.
Any help greatly appreciated! :)
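For readers following along, the cleaning steps described in the question could be sketched roughly as below. This is only an illustration in plain Python, not a recommended or official demuxlet workflow. The function names, the 5% missingness cutoff, and the choice to drop strand-ambiguous (A/T, C/G) SNPs instead of flipping them are all assumptions made for the example; real strand flipping needs a reference panel and is not attempted here.

```python
from typing import Iterable, Iterator, Sequence

BASES = set("ACGT")
# A/T and C/G pairs look identical on both strands, so a flip cannot be
# detected from the alleles alone. This sketch simply drops them (assumption).
AMBIGUOUS = {frozenset("AT"), frozenset("CG")}


def missing_rate(sample_fields: Sequence[str]) -> float:
    """Fraction of samples whose GT sub-field contains a missing allele ('.')."""
    if not sample_fields:
        return 1.0
    missing = sum(1 for s in sample_fields if "." in s.split(":")[0])
    return missing / len(sample_fields)


def clean_vcf_lines(lines: Iterable[str], max_missing: float = 0.05) -> Iterator[str]:
    """Yield header lines plus records that pass the simple filters below.

    max_missing is a hypothetical threshold chosen for illustration.
    """
    seen = set()
    for line in lines:
        if line.startswith("#"):
            yield line  # keep meta-information and column header lines
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alt = fields[:5]
        # Keep biallelic SNPs only (drops indels and multi-allelic sites).
        if ref not in BASES or alt not in BASES:
            continue
        if frozenset((ref, alt)) in AMBIGUOUS:
            continue
        # Sample columns start at index 9, after the FORMAT column.
        if missing_rate(fields[9:]) > max_missing:
            continue
        # Duplicates: keep the first passing record at each position.
        if (chrom, pos) in seen:
            continue
        seen.add((chrom, pos))
        yield line
```

It would be used along the lines of `kept = list(clean_vcf_lines(open("array.vcf")))`, where `array.vcf` is a placeholder file name. Liftover and the exon overlap step are left out, since they depend on external tools and annotation files.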
Thank you very much for your quick reply! I will try both methods.
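If "consistency" is read as agreement between the demuxlet calls obtained with the two genotype sets, one simple check is to compare the per-barcode assignments from both runs. The sketch below assumes the calls have already been loaded into dictionaries mapping barcode to a label; the function name and the labels are hypothetical, and parsing demuxlet's output files is deliberately left out.

```python
from collections import Counter
from typing import Dict, Tuple


def compare_assignments(
    run_a: Dict[str, str], run_b: Dict[str, str]
) -> Tuple[float, Counter]:
    """Compare per-barcode labels from two runs.

    Returns the fraction of shared barcodes with identical labels and a
    Counter of (label_in_a, label_in_b) pairs over the shared barcodes.
    """
    shared = run_a.keys() & run_b.keys()
    pairs = Counter((run_a[bc], run_b[bc]) for bc in shared)
    agree = sum(n for (a, b), n in pairs.items() if a == b)
    rate = agree / len(shared) if shared else float("nan")
    return rate, pairs


# Toy example with made-up barcodes and labels.
imputed = {"AAAC": "donor1", "AAAG": "donor2", "AAAT": "doublet"}
array_only = {"AAAC": "donor1", "AAAG": "donor1", "AAAT": "doublet", "AACC": "donor2"}
rate, pairs = compare_assignments(imputed, array_only)
```

The off-diagonal entries of `pairs` show which labels disagree between the two runs, which is usually more informative than the single agreement rate.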