rgcgithub / regenie

regenie is a C++ program for whole genome regression modelling of large genome-wide association studies.
https://rgcgithub.github.io/regenie

Interaction p. value explanation #349

Open ejgardner-insmed opened 1 year ago

ejgardner-insmed commented 1 year ago

Hello,

Thank you for implementing GxG testing in REGENIE; we are finding it very helpful. I had a (hopefully) quick question about how to interpret the p-values given by these tests. Let's say we have the following result from a GWAS for phenotype Y:

- variants in rs001 increase Y by 1.2 units, p = 1x10^-11
- variants in rs002 have no effect on Y, p = 0.5
- variants in rs003 have no effect on Y, p = 0.2

When running interaction testing for rs001 we obtain the following results (REGENIE 'TEST' column labels):

rs001xrs002:

- ADD-INT_SNP: p = 0.5
- ADD-INT_rs001: p = 0.3
- ADD-INT_SNPxrs001: p = 1x10^-9

rs001xrs003:

- ADD-INT_SNP: p = 0.2
- ADD-INT_rs001: p = 3x10^-11
- ADD-INT_SNPxrs001: p = 0.3

Why do we see a significant result for ADD-INT_rs001 in the rs003 interaction test, but NOT in the rs002 test? Based on my reading of the documentation, the test run here should approximately replicate the GWAS result for rs001 (H0: alpha = 0). However, this only holds true for ~25% of the interaction tests we run against rs001. Additionally, when I run the model manually in R with glm(), I see that the effect of rs001 on phenotype Y is maintained even when including the interaction term and the additional variant (e.g. rs002/rs003). Is there something I am missing here? As far as I am aware, rs002 is NOT in LD with rs001.
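For reference, my manual check in R looks roughly like the following (simulated data with placeholder effect sizes, just to show the model I am fitting; the term-to-TEST-label mapping in the comments is my reading of the output, not authoritative):

```r
# Simplified version of my manual glm() check, on simulated data
set.seed(1)
n     <- 5000
rs001 <- rbinom(n, 2, 0.3)        # primary SNP (placeholder genotypes)
rs002 <- rbinom(n, 2, 0.2)        # second SNP tested for interaction
y     <- 1.2 * rs001 + rnorm(n)   # rs001 main effect only, no interaction

fit <- glm(y ~ rs001 * rs002, family = gaussian())
summary(fit)$coefficients
# My reading of regenie's TEST labels for this toy example:
#   rs002       ~ ADD-INT_SNP        (main effect of the tested variant)
#   rs001       ~ ADD-INT_rs001      (main effect of the interacting variant)
#   rs001:rs002 ~ ADD-INT_SNPxrs001  (interaction term)
```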

Thanks for any help you may be able to offer!

joellembatchou commented 1 year ago

Hi,

Is this a quantitative phenotype or a binary trait? And could you include the full sumstats information (as well as the log file with the command that was run) for these two variant sites tested for interaction with rs001?

ejgardner-insmed commented 1 year ago

Hello Joelle,

Thanks for your reply!

This is a quantitative trait.

The "summary stats" I shared are a toy example, not our real data. I was wondering both 1) if you had observed a similar situation when you run tests, and 2) for slightly more detailed explanations of what is actually being tested for each of those null hypotheses and why we might get such a result. I am happy to share some actual summary stats if that helps, but I need to be careful as it is protected data.

What would you suggest is best?

Best,

Eugene

joellembatchou commented 1 year ago

Hi Eugene,

Please check the documentation for information about the interaction test (note that when extra covariates are included, they are projected out of the phenotype, genotype and environmental variable prior to running the test). Is the MAC of the two variants above 1,000 (to check whether the test based on the HLM model is being used)?
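Conceptually, the projection step is along these lines (a very simplified R sketch, not the actual implementation, and it ignores the HLM adjustment and the step 1 predictions):

```r
# Simplified sketch: residualise phenotype, tested variant and interacting
# variant on the covariates, then fit the interaction model on the residuals.
set.seed(2)
n    <- 5000
cov1 <- rnorm(n); cov2 <- rnorm(n)          # extra covariates
g    <- rbinom(n, 2, 0.2)                   # tested variant
e    <- rbinom(n, 2, 0.3)                   # interacting variant (e.g. rs001)
y    <- 0.5 * cov1 + 1.2 * e + rnorm(n)     # toy phenotype

proj <- function(v) residuals(lm(v ~ cov1 + cov2))

fit <- lm(proj(y) ~ proj(g) * proj(e))
summary(fit)$coefficients
```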

ejgardner-insmed commented 1 year ago

Hello Joelle,

I have read the documentation extensively and was still confused about what is being tested, which is why I asked here. I was just wondering how a situation such as the one in my original query could arise, i.e. where the primary SNP (rs001) is significant in the GWAS but not significant in a subset of epistasis tests, and what that means for our results. Does this mean we need to be skeptical of the results for that particular test?

Regarding HLM: rs001 has MAC > 1000, but the others (rs002/rs003) have MAC < 1000.

Thank you!

joellembatchou commented 1 year ago

The HLM includes both a linear and a quadratic term for the interacting variant (rs001); have you checked the significance of that quadratic term in the sumstats file?
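Outside regenie, checking such a term looks roughly like this in R (the HLM also models heteroskedasticity, so this plain glm only mimics the linear-plus-quadratic parameterisation):

```r
# Rough check of a quadratic term for the interacting variant rs001
set.seed(3)
n     <- 5000
rs001 <- rbinom(n, 2, 0.3)
snp   <- rbinom(n, 2, 0.2)
y     <- 1.2 * rs001 + rnorm(n)

fit <- glm(y ~ snp * rs001 + I(rs001^2), family = gaussian())
summary(fit)$coefficients["I(rs001^2)", , drop = FALSE]
```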

ejgardner-insmed commented 1 year ago

Hello Joelle,

Thank you for the response. Sorry for being a bother!

I have dug deeper into this and have some additional information. I realise now that my initial 'example' was misleading. I am performing two GxG tests as part of my analysis:

  1. Interaction SNP x all imputed SNPs (i.e. my example above)
  2. Interaction SNP x all gene collapsing masks (i.e. identical to those provided for gene-based testing)

Looking at my results/data further, category (1) works fine; it is category (2) where the problem lies. This is not surprising, as category (1) has been filtered to variants with MAF > 1%. For category (2), the failed tests are restricted to gene-collapsing masks with MAC < 1000 after collapsing (i.e. below the default cutoff for HLM).
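To be explicit about what category (2) amounts to conceptually, here is a toy R sketch (this is not how regenie builds the masks internally via --anno-file/--set-list/--mask-def, just the collapsing idea with a simple 'max' rule):

```r
# Toy sketch: collapse rare variants in one gene into a burden mask,
# then test the mask for interaction with the SNP
set.seed(4)
n      <- 5000
G_rare <- matrix(rbinom(n * 10, 2, 0.002), nrow = n)  # 10 rare variants in a gene
mask   <- apply(G_rare, 1, max)                       # carrier of any qualifying variant
snp    <- rbinom(n, 2, 0.3)
y      <- 1.2 * snp + rnorm(n)

sum(mask)  # mask allele count can easily stay below 1000, i.e. in HLM territory
fit <- glm(y ~ snp * mask, family = gaussian())
summary(fit)$coefficients
```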

I have now tested the algorithm by setting --rare-mac 1 and obtained "reasonable" results (i.e. the p-values appear to match expectations), but I am a bit worried about pitfalls in the model that I am unaware of. Is there anything I should be worried about?

Anecdotally, I have run two phenotypes (both quantitative). For one phenotype, only the tests of the interaction variant alone failed (ADD-INT_VAR), but for the other, the tests of both the interaction effect AND the effect of the interaction variant alone failed (ADD-INT_SNPxVAR & ADD-INT_VAR). It seems odd that I have obtained two different patterns of failure.

Also, I have looked at the summary stats and, at least for burden tests, I cannot find a squared term for either imputed SNPs or gene collapsing masks.

joellembatchou commented 1 year ago

Hi,

Robust SEs can become inflated at rarer variants, which is why we use the HLM for those, with a default threshold of MAC < 1000. So are you building burden masks, storing them to a file and then testing them for interaction with a SNP?
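For intuition on the robust SE point, the behaviour can be mimicked outside regenie with a generic sandwich estimator in R (same spirit as, though not necessarily identical to, the robust variance used above the MAC threshold; requires the sandwich and lmtest packages):

```r
# Compare robust (sandwich) and model-based SEs for a SNP x rare-mask interaction
library(sandwich)
library(lmtest)

set.seed(5)
n    <- 5000
snp  <- rbinom(n, 2, 0.3)
mask <- rbinom(n, 1, 0.005)     # rare burden mask (few carriers)
y    <- 1.2 * snp + rnorm(n)

fit <- lm(y ~ snp * mask)
coeftest(fit, vcov = vcovHC(fit, type = "HC3"))  # robust SEs
coeftest(fit)                                    # model-based SEs, for comparison
```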

Also, what do you mean by the tests failing? Is it NA in LOG10P column?

It would be easier if you could send example data to better assess your issue.

ejgardner-insmed commented 1 year ago

Hi Joelle,

Thanks a bunch for dealing with my questions, it is much appreciated.

For burden masks, yes. I am building masks and then testing genes for interaction with a SNP. In the future I plan to try mask x mask as well, but I have not implemented that test yet.

'Failing' here means one of the following:

  1. Identical (or nearly identical) low log10 p-values (e.g. < 1) for all gene/mask pairs with AC < 1000.
  2. Identical (or nearly identical) HIGH log10 p-values (e.g. > 300) for all gene/mask pairs with AC < 1000.

In other words, all gene/mask pairs with AC < 1000 end up with a log10 p-value of ~0.16 OR ~300. The actual value appears to be random (i.e. for different SNPxBurden tests I have obtained different log10 p-values).

What exact example data would you need to evaluate this issue? I am happy to try to provide some, but due to the sensitive nature of my work it is somewhat difficult to send through something suitable.

joellembatchou commented 1 year ago

Hi,

Is the SNP excluded from the masks? Having the input files to replicate the run would be useful (phenotype/covariate/genotype files for a subset of the masks with AAC < 1000).

ejgardner-insmed commented 1 year ago

Hi Joelle,

The SNP is excluded from the masks.

I'll see what I can do regarding a test dataset and will get back to you sometime soon (likely in 2023).