ultimatesource / denovogear

A program to detect denovo-variants using next-generation sequencing data.
http://www.nature.com/nmeth/journal/v10/n10/full/nmeth.2611.html
GNU General Public License v3.0

How can a "de-novo" mutation be found from a binary-coded 0/1 VCF file? #280

Closed jielab closed 6 years ago

jielab commented 6 years ago

Hi,

I have posted a couple of times before while struggling to get DNG to work on the VCF file of my trio dataset, but I could not get any results. If someone has a working example in which DNG reports de-novo mutations from a VCF file (ideally a publicly available one, such as a 1000GP file, so that we are working on the same data), could you please share the DNG command line?

I have now taken a deep breath and thought this whole thing through a bit. It suddenly occurred to me: if a VCF file uses REF/ALT and 0/1 to represent genotype data, and all tri-allelic SNPs are reformatted into bi-allelic SNPs, then of course there is no de-novo mutation for any given SNP, because the genotype will always be 0/0, 0/1, or 1/1 for the father, mother, and proband. Since there is no genotype of 1/3 or 2/3, nothing will be “de-novo”!

If this is correct, what do I need to do differently with the VCF file that feeds into DNG?

Best regards, jie

jielab commented 6 years ago

For an A/C SNP, if both the father and the mother have the genotype A/A while the child has the genotype A/C, is this a de-novo mutation? I guess it is.

However, I thought that de-novo mutations also include the following situation: one of the two parents has A/C, while the child has A/G. This then becomes a tri-allelic SNP, which will not be stored correctly in the VCF file and therefore will not be treated as a de-novo mutation. This is my question.
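To make the two scenarios concrete, here is a minimal Python sketch of a plain Mendelian-consistency check on called genotypes. It is only an illustration of the question, not how DNG works; the function name `is_mendelian_consistent` is made up for this example, and it assumes a diploid autosomal site with genotypes written as VCF-style allele indices (0 = REF, 1 = first ALT, 2 = second ALT).

```python
from itertools import product


def is_mendelian_consistent(father, mother, child):
    """Return True if the child's unordered genotype can be formed by
    taking one allele from the father and one from the mother.

    Hypothetical helper for illustration; assumes a diploid autosomal site.
    """
    child_sorted = sorted(child)
    return any(sorted((f, m)) == child_sorted
               for f, m in product(father, mother))


# Scenario 1: both parents A/A (0/0), child A/C (0/1).
print(is_mendelian_consistent((0, 0), (0, 0), (0, 1)))  # False

# Scenario 2: one parent A/C (0/1), other parent A/A (0/0), child A/G (0/2).
print(is_mendelian_consistent((0, 1), (0, 0), (0, 2)))  # False

# Ordinary inheritance: one parent 0/1, other 0/0, child 0/1.
print(is_mendelian_consistent((0, 1), (0, 0), (0, 1)))  # True
```

Both scenarios come out as Mendelian violations, i.e. de-novo candidates. The first one needs only the 0 and 1 allele codes; the second one needs a third allele index at the site.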

Best regards, jie

reedacartwright commented 6 years ago

The answer to your question is "it depends". DNG does not rely on the genotypes stored in a VCF file when identifying potential de novos. It uses either the PL likelihoods (dng-dnm) or the AD depths (dng-call). If you want me to inspect specific sites that you believe DNG is missing, I need to see a complete line from the VCF along with the #CHROM header line.