vasiliosz / clingen-shared

Shared code snippets for Clinical Genetics

Handling of variant on X chromosome for male patients - GATK and GEMINI #2

Open genecracker opened 9 years ago

genecracker commented 9 years ago

I am opening an issue here because I still have problems handling variants on the X chromosome during trio analyses, and I would like to hear about your experience with it.

  1. GATK PhaseByTransmission is not ploidy-aware, and wild-type genotypes in fathers are often "forced" to heterozygous (even with --DeNovoPrior 5.0E-4): http://gatkforums.broadinstitute.org/discussion/3829/wrong-genotype-correction-for-hemizygote-after-phasebytransmission
  2. GEMINI does not have any built-in tool for X-linked variants (although it now has a built-in mendel_errors tool)

How do you handle X-linked inheritance? Any of you following this? http://gatkforums.broadinstitute.org/discussion/4639/x-chromosome-gentyping

vasiliosz commented 9 years ago

I haven't really needed this before (no a priori suspicion of X-linked inheritance), but I would incorporate the ploidy as in the pipeline described in the discussion you linked.

To automate it, you could use the sex column in the ped file (look up the sample ID and check its sex) and process each sample accordingly.

Does this not work? Is it the phasing specifically that creates problems? I imagine that the genotype qualities for male X chr calls would improve by using ploidy 1 - perhaps avoiding genotype changes in the phase by transmission tool later (which depend on the quality scores).
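
A minimal sketch of the ped lookup, assuming a standard whitespace-delimited ped file (sample ID in column 2, sex in column 5, with 1 = male and 2 = female). The function names and the toy pedigree are made up for illustration, and the ploidy rule only makes sense for the non-pseudoautosomal part of X.

```python
import io

def sex_by_sample(ped_handle):
    """Map sample ID -> 'male' / 'female' / 'unknown' from PED-formatted lines."""
    sexes = {}
    for line in ped_handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.split()
        if len(fields) < 5:
            continue  # not a valid PED record
        sample_id, sex_code = fields[1], fields[4]
        sexes[sample_id] = {"1": "male", "2": "female"}.get(sex_code, "unknown")
    return sexes

def x_ploidy(sex):
    """Ploidy to use when calling non-PAR chrX (assumption: unknown sex stays diploid)."""
    return 1 if sex == "male" else 2

# Toy pedigree, purely illustrative.
example_ped = io.StringIO(
    "#family_id sample_id paternal_id maternal_id sex phenotype\n"
    "FAM1 father 0 0 1 1\n"
    "FAM1 mother 0 0 2 1\n"
    "FAM1 proband father mother 1 2\n"
)

ploidy_by_sample = {
    sample: x_ploidy(sex) for sample, sex in sex_by_sample(example_ped).items()
}
print(ploidy_by_sample)
```

The resulting dictionary could then drive which ploidy setting each sample gets when X is genotyped separately from the autosomes.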

genecracker commented 9 years ago

I have not tried the suggestion from the GATK forum yet, but I will, since it sounds sensible. I will report here how it goes.

fulyataylan commented 9 years ago

:) I have been thinking about X-linked inheritance for a long time. I now have two cases in which I should consider X-linked inheritance.

My practical solution: since I run ANNOVAR, I simply focus on rare homozygous variants on the X chromosome and visualise them in IGV.
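
Roughly, that filter could be sketched as below. The keys `chrom`, `zygosity` and `pop_af` are hypothetical stand-ins for whatever the annotated table actually contains, the 0.05 cutoff is arbitrary, and the rows are toy data.

```python
def x_linked_candidates(rows, max_af=0.05):
    """Keep homozygous chrX variants that are rare or absent in the population."""
    keep = []
    for row in rows:
        if row["chrom"] not in ("X", "chrX"):
            continue  # only the X chromosome
        if row["zygosity"] != "hom":
            continue  # male X calls show up as homozygous under diploid calling
        af = row.get("pop_af")
        if af is not None and af > max_af:
            continue  # too common; a missing frequency is treated as rare
        keep.append(row)
    return keep

# Toy rows, purely illustrative.
toy_rows = [
    {"id": "v1", "chrom": "chrX", "zygosity": "hom", "pop_af": 0.001},
    {"id": "v2", "chrom": "chrX", "zygosity": "het", "pop_af": 0.001},
    {"id": "v3", "chrom": "chr2", "zygosity": "hom", "pop_af": None},
    {"id": "v4", "chrom": "X", "zygosity": "hom", "pop_af": None},
    {"id": "v5", "chrom": "chrX", "zygosity": "hom", "pop_af": 0.2},
]

candidates = x_linked_candidates(toy_rows)
print([row["id"] for row in candidates])
```

The surviving variants would still need a look in IGV, as described above.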

vasiliosz commented 9 years ago

Related to this (but not a full solution): GEMINI 0.16.0 was just released, and it promises improvements to the Mendelian inheritance tools.

https://groups.google.com/forum/#!topic/gemini-variation/rWPu2JJTPTA

genecracker commented 9 years ago

Yeah, since version 0.16.0 does phasing while running comp_hets, we could skip GATK PbT and build a relevant query for X-linked variants. For the other inheritance models we don't really need phasing, which can also complicate things when searching for de novo variants (remember USP9X).

I have installed version 0.16.0 and tested it. I'd be happy if you tried it too. autosomal_recessive and de_novo run fine, but I get errors with comp_hets and mendel_errors.

Below is the error I get when running the comp_hets tool. I will wait a few more days to see if anyone reports it on the Google group; otherwise I will report it myself. I wonder if it is because I have a mixed database of single cases and trios.
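
One way such an X-linked query might look, sketched as a small Python helper that only assembles a `gemini query` command line. It assumes a trio with an affected male proband, diploid genotype calls (so a hemizygous call appears as HOM_ALT), and a database where X is named `chrX`. The sample IDs, database name and selected columns are placeholders, and the exact flags should be checked against the GEMINI documentation before use.

```python
import shlex

def x_linked_recessive_cmd(db, proband, mother, father, chrom="chrX", max_aaf=0.05):
    """Build (but do not run) a gemini query for an X-linked recessive pattern."""
    sql = (
        "select chrom, start, end, ref, alt, gene, impact from variants "
        f"where chrom = '{chrom}' and impact_severity != 'LOW' "
        f"and (aaf_1kg_all <= {max_aaf} or aaf_1kg_all is NULL)"
    )
    # Affected son homozygous alt, carrier mother, unaffected father wild type.
    gt_filter = (
        f"gt_types.{proband} == HOM_ALT and "
        f"gt_types.{mother} == HET and "
        f"gt_types.{father} == HOM_REF"
    )
    return ["gemini", "query", "--header", "-q", sql, "--gt-filter", gt_filter, db]

# Placeholder names, not taken from any real database.
cmd = x_linked_recessive_cmd("trio.db", "proband", "mother", "father")
print(" ".join(shlex.quote(part) for part in cmd))
```

Because the father is required to be HOM_REF, this only helps if his X genotypes have not already been "corrected" to heterozygous upstream.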

[bianca@milou2 gemini_0.16.0]$ gemini comp_hets --families A1102trio --filter "impact_severity!='LOW' and (aaf_1kg_all <= 0.05 or aaf_1kg_all is NULL)" WESfull150516.HC.genotypeGVCFs.recal.PASS.combined.VEPannotated.db > A1102/A1102trio.HC.genotypeGVCFs.recal.PASS.combined.VEPannotated.comphet.notlow.5procent.150621.txt
Traceback (most recent call last):
  File "/home/bianca/glob/gemini/bin/gemini", line 6, in <module>
    gemini.gemini_main.main()
  File "/home/bianca/glob/gemini/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 1121, in main
    args.func(parser, args)
  File "/home/bianca/glob/gemini/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 672, in comp_hets_fn
    CompoundHet(args).run()
  File "/home/bianca/glob/gemini/anaconda/lib/python2.7/site-packages/gemini/gim.py", line 244, in run
    for i, s in enumerate(self.report_candidates()):
  File "/home/bianca/glob/gemini/anaconda/lib/python2.7/site-packages/gemini/gim.py", line 169, in report_candidates
    for gene, li in self.candidates():
  File "/home/bianca/glob/gemini/anaconda/lib/python2.7/site-packages/gemini/gim.py", line 464, in candidates
    samples_w_hetpair = self.find_valid_het_pairs(sample_hets)
  File "/home/bianca/glob/gemini/anaconda/lib/python2.7/site-packages/gemini/gim.py", line 362, in find_valid_het_pairs
    alt_hap_2 = alleles_site2.index(site2.row['alt'])
ValueError: u'A' is not in list

vasiliosz commented 9 years ago

Ok. I have some re-processing to do before I can test it - but I need to do it anyway (later in the week).

The phasing in GEMINI is likely limited to unambiguous sites and does not take genotyping quality into account, as the GATK tool does. It would be interesting to see how many sites are phased with the respective tools (if that's even possible).
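
For phased output that ends up in a VCF, a rough count per sample is possible, since phased genotypes use `|` as the allele separator and unphased ones use `/`. A minimal sketch on toy records follows; it says nothing about how GEMINI stores phase internally.

```python
def count_phased(vcf_lines):
    """Count, per sample, the records whose GT uses the phased separator '|'."""
    sample_names = []
    counts = {}
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            sample_names = fields[9:]
            counts = {name: 0 for name in sample_names}
            continue
        format_keys = fields[8].split(":")
        if "GT" not in format_keys:
            continue  # record carries no genotypes
        gt_index = format_keys.index("GT")
        for name, sample_field in zip(sample_names, fields[9:]):
            genotype = sample_field.split(":")[gt_index]
            if "|" in genotype:
                counts[name] += 1
    return counts

# Toy VCF records, purely illustrative.
toy_vcf = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tchild\tfather",
    "X\t100\t.\tG\tA\t50\tPASS\t.\tGT:GQ\t0|1:60\t0/0:45",
    "X\t200\t.\tC\tT\t50\tPASS\t.\tGT:GQ\t1|1:70\t0|0:50",
]

phased_counts = count_phased(toy_vcf)
print(phased_counts)
```

Running this on the output of each tool would give a first, crude comparison of how many sites each one phases.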

The error you posted looks like a bug to me. It's failing on a particular 'alt' allele that is not in the list of alleles it is searching. I would post it to their GitHub issues or Google group.
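
The failure mode in the last traceback frame is simply `list.index` being asked for a value the list does not contain. The sketch below reproduces that in isolation; it is not GEMINI's code, the helper name is made up, and the idea that the sample's alleles at that site differ from the row's 'alt' (for example at a multi-allelic site) is only a guess.

```python
def alt_haplotype_index(alleles, alt):
    """Return the position of alt among a sample's alleles, or None if absent."""
    try:
        return alleles.index(alt)
    except ValueError:
        # Same situation as in the traceback: alt is not one of the alleles.
        return None

# Sample carries the alt allele: index is found.
found = alt_haplotype_index(["G", "A"], "A")

# Sample carries a different alternate allele than the row's 'alt'.
missing = alt_haplotype_index(["G", "T"], "A")

print(found, missing)
```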

genecracker commented 9 years ago

I recommend waiting for the next release before updating GEMINI. There are a few problems with 0.16.0, including the one I reported.

genecracker commented 9 years ago

In GEMINI's history :smile:

[Screenshot, 2015-06-26 09:56:28]

vasiliosz commented 9 years ago

Nice job!