jaamarks / jaamarks_notebooks

Collection of various projects and procedures documented with Jupyter notebooks.

fwGWAS—Naive Approach #1

Closed jaamarks closed 5 years ago

jaamarks commented 6 years ago

The aim of this fwGWAS method is to increase GWAS power by weighting sequence variants according to their functional annotation, and thereby to discover novel associations that the traditional approach did not identify. The field-standard approach to GWAS applies an equal probability of association to all variants tested, which results in the familiar 5e-8 threshold that a variant must meet to be deemed significant. With this fwGWAS method, variants are instead weighted according to their functional annotation: each sequence variant is assigned to one of four functional annotation groups, and each group has its own association threshold. Some thresholds are more liberal than 5e-8 and others more stringent, on the grounds that some variants are more likely than others to affect function and therefore to be causal. The four groups and their association thresholds are: Loss of Function (5.5e-7), Moderate Impact (1.1e-7), Low Impact (1.0e-8), and Other (1.7e-9).

We applied this fwGWAS method to three sets of published GWAS results (SCZ1, MDD2013, and FTND2) and analyzed its potential utility. One novel sequence variant was identified in the FTND2 data set, but no new loci were identified: the variant lies in the well-known locus PSMA4 (near CHRNA5). This approach is based on the 2016 Nature Genetics paper by Sveinbjornsson et al. (deCODE).
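As a rough illustration of the per-group thresholds described above, the sketch below looks up the threshold for a variant's functional group and flags variants that pass it. The function names, the group keys, and the use of a strict `<` comparison are assumptions made for this example, not taken from the notebooks.

```python
# Association thresholds per functional annotation group, as listed above.
FW_THRESHOLDS = {
    "loss_of_function": 5.5e-7,
    "moderate_impact": 1.1e-7,
    "low_impact": 1.0e-8,
    "other": 1.7e-9,
}

# Conventional genome-wide significance threshold.
STANDARD_THRESHOLD = 5e-8


def is_fw_significant(p_value, group):
    """Return True if p_value passes the threshold of its functional group.

    Assumption: "passes" means strictly below the group threshold.
    """
    return p_value < FW_THRESHOLDS[group]


def is_novel(p_value, group):
    """Return True if the variant passes its fwGWAS threshold
    but would not pass the conventional 5e-8 threshold."""
    return is_fw_significant(p_value, group) and p_value >= STANDARD_THRESHOLD
```

For example, a Loss of Function variant with p = 2e-7 would be significant under this scheme but not under the standard threshold, whereas an Other variant with p = 6e-9 would pass the standard threshold but not its fwGWAS threshold.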

jaamarks commented 6 years ago

According to the SnpEff manual, a variant can have (and usually has) more than one annotation. The VCF annotation manual has more information about the annotation behavior.

There are several reasons for this:

Effect sort order. When multiple effects are reported, SnpEff sorts the effects the following way:

  1. Putative impact: Effects having higher putative impact are first.
  2. Effect type: Effects assumed to be more deleterious are first.
  3. Canonical transcript before non-canonical.
  4. Marker genomic coordinates (e.g. genes starting before first).
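The sort order above can be pictured as a tuple sort key. This is only a sketch of the ordering, not SnpEff's own code: the dictionary fields (`impact`, `effect_rank`, `canonical`, `start`) are hypothetical names, and `effect_rank` stands in for whatever deleteriousness ordering SnpEff uses for effect types.

```python
# Rank of SnpEff putative impact levels; a lower rank sorts first.
IMPACT_RANK = {"HIGH": 0, "MODERATE": 1, "LOW": 2, "MODIFIER": 3}


def annotation_sort_key(ann):
    """Build a sort key mirroring the four criteria listed above."""
    return (
        IMPACT_RANK[ann["impact"]],  # 1. higher putative impact first
        ann["effect_rank"],          # 2. hypothetical rank: more deleterious = lower
        not ann["canonical"],        # 3. canonical (False here) sorts before non-canonical
        ann["start"],                # 4. earlier genomic start first
    )


# Made-up annotations for a single variant.
annotations = [
    {"impact": "MODIFIER", "effect_rank": 5, "canonical": True, "start": 100},
    {"impact": "HIGH", "effect_rank": 1, "canonical": False, "start": 100},
    {"impact": "HIGH", "effect_rank": 1, "canonical": True, "start": 200},
]

ordered = sorted(annotations, key=annotation_sort_key)
```

Here the two HIGH annotations come first, with the canonical-transcript one ahead of the non-canonical one, and the MODIFIER annotation comes last.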


With this in mind, how should we handle these variants?


Answer

This is not an issue. I will categorize each annotation of a variant and compare the variant's p-value to the fwThreshold for that category. If a variant is listed multiple times because it has multiple annotations (for the reasons listed above), I will simply filter to unique variants at the end. What we are really looking for is whether a variant is significant under any of its annotations.
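A minimal sketch of that plan, assuming one input row per (variant, annotation) pair. The mapping from SnpEff putative impact (HIGH, MODERATE, LOW, MODIFIER) to the four fwGWAS groups is an assumption for illustration, as are the function name and the made-up variant IDs.

```python
# Association thresholds per functional group (values from the first comment).
FW_THRESHOLDS = {
    "loss_of_function": 5.5e-7,
    "moderate_impact": 1.1e-7,
    "low_impact": 1.0e-8,
    "other": 1.7e-9,
}

# Assumed mapping from SnpEff putative impact to fwGWAS group.
IMPACT_TO_GROUP = {
    "HIGH": "loss_of_function",
    "MODERATE": "moderate_impact",
    "LOW": "low_impact",
    "MODIFIER": "other",
}


def significant_variants(rows):
    """Return the unique variant IDs significant under any annotation.

    rows: iterable of (variant_id, p_value, impact), one per annotation.
    """
    hits = set()  # a set filters repeated variants to uniqueness
    for variant_id, p_value, impact in rows:
        group = IMPACT_TO_GROUP[impact]
        if p_value < FW_THRESHOLDS[group]:
            hits.add(variant_id)
    return sorted(hits)


# Made-up rows: the same variant can appear once per annotation.
rows = [
    ("rs_example_1", 2e-7, "HIGH"),
    ("rs_example_1", 2e-7, "MODIFIER"),
    ("rs_example_2", 2e-7, "MODERATE"),
    ("rs_example_3", 5e-10, "MODIFIER"),
    ("rs_example_3", 5e-10, "MODIFIER"),
]

result = significant_variants(rows)
print(result)  # ['rs_example_1', 'rs_example_3']
```

`rs_example_1` is kept because its HIGH annotation passes, even though its MODIFIER annotation does not; `rs_example_3` appears only once in the output despite being listed twice.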

jaamarks commented 6 years ago

Results Presentation

fwGWAS- Naive Method

SCZ2

fwGWAS – Naïve Method.pptx

jaamarks commented 5 years ago

Results Presentation

20190110-heroin-fwGWAS-naive-method-scz1+mdd1+ftnd2.pptx

jaamarks commented 5 years ago

I applied fwGWAS to FTND3 and appended the results to the end of the PowerPoint.

20190114-heroin-fwGWAS-naive-method-scz1+mdd1+ftnd2+ftnd3.pptx

jaamarks commented 5 years ago

The latest slide deck is located on Eric's share at: \\RTPNFIL02\eojohnson\Heroin\Heroin fwGWAS\Summary Results Slides