stephenslab / susieR

R package for "sum of single effects" regression.

https://stephenslab.github.io/susieR

Running SusieR with dominance model #187

Open saulpierotti opened 1 year ago

saulpierotti commented 1 year ago

Question A

Hi, thank you for developing this very nice method. I am running a linear model that takes into account a genetic dominance term like:

y ~ x + d + C, where d = (x == 1), each x_i is a genotype in {0, 1, 2}, and C is a matrix of covariates.

I am calculating p-values with a likelihood ratio test of this model against the null model

y ~ C

Do you have any suggestions on the best way to incorporate the dominance term in susieR? I thought about regressing d out of x and y, but that way I would lose the non-linear component of the signal in the fine-mapping.
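
For readers following along, the encoding described in Question A can be sketched as below. This is a minimal illustration in plain Python, not susieR code. The function names `dominance_indicator` and `additive_dominance_columns` are hypothetical, and appending dominance columns to the genotype matrix is only one of the options raised in this thread, not an approach confirmed by the maintainers.

```python
def dominance_indicator(genotypes):
    """Return d = (x == 1) as 0/1 values for genotypes coded 0, 1, 2."""
    for g in genotypes:
        if g not in (0, 1, 2):
            raise ValueError(f"unexpected genotype code: {g!r}")
    return [1 if g == 1 else 0 for g in genotypes]


def additive_dominance_columns(genotype_matrix):
    """Append one dominance column per SNP to a samples-by-SNPs matrix.

    Assumption: each row is a sample and each column is a SNP. The result
    keeps the additive columns first, followed by the dominance columns.
    """
    return [list(row) + dominance_indicator(row) for row in genotype_matrix]


# Two samples, three SNPs.
X = [[0, 1, 2],
     [1, 1, 0]]
XD = additive_dominance_columns(X)
```

Here `XD` has six columns per sample: the three original genotypes and three heterozygote indicators.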

Question B

Actually, to be more specific, I am using a mixed model, and the y, x, d, and C terms are first rotated using the Cholesky decomposition of the variance-covariance matrix to remove the correlation structure. Do you think it is OK to run susieR on the rotated X and y with the rotated C already regressed out?
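
The rotation described in Question B can be written out as follows: if the variance-covariance matrix factors as Sigma = L L^T with L lower triangular, each term is replaced by the solution of L v* = v, so that the rotated residuals are uncorrelated. The sketch below is a minimal pure-Python illustration under the assumption of a small, dense, positive-definite matrix. The helper names `cholesky` and `forward_solve` are hypothetical, and a real analysis would use a linear algebra library. It illustrates only the rotation, not whether running susieR on rotated data is appropriate.

```python
import math


def cholesky(sigma):
    """Return lower-triangular L with sigma = L L^T (sigma positive definite)."""
    n = len(sigma)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                L[i][j] = (sigma[i][j] - s) / L[j][j]
    return L


def forward_solve(L, v):
    """Solve L out = v by forward substitution, i.e. out = L^{-1} v."""
    out = [0.0] * len(v)
    for i in range(len(v)):
        s = sum(L[i][k] * out[k] for k in range(i))
        out[i] = (v[i] - s) / L[i][i]
    return out


# Toy covariance for two samples and a toy phenotype vector.
sigma = [[4.0, 2.0],
         [2.0, 5.0]]
L = cholesky(sigma)
y_rotated = forward_solve(L, [2.0, 5.0])
```

The same `forward_solve` call would be applied to each column of x, d, and C.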

pcarbo commented 1 year ago

@saulpierotti-ebi You should be able to run susie_suff_stat or susie_rss in both cases so long as you are able to obtain estimates of the effects and standard errors (or other summary statistics such as z-scores) from your SNP-by-SNP tests.
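
As a small illustration of the summary statistics mentioned here, z-scores can be formed from per-SNP effect estimates and standard errors as z = beta / se. The sketch below is plain Python rather than susieR code, and the function name `z_scores` is hypothetical.

```python
def z_scores(effects, standard_errors):
    """Return z = beta / se for each SNP, checking the inputs line up."""
    if len(effects) != len(standard_errors):
        raise ValueError("effects and standard_errors differ in length")
    z = []
    for beta, se in zip(effects, standard_errors):
        if se <= 0:
            raise ValueError("standard errors must be positive")
        z.append(beta / se)
    return z


# Toy effect estimates and standard errors for two SNPs.
z = z_scores([0.5, -1.0], [0.25, 0.5])
```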

saulpierotti commented 1 year ago

Thank you for the answer! In this case, though, I would have to supply the betas for either the linear term or the dominance term, along with their respective standard errors. Do you think it is OK to give both the SNP variables and the dominance variables to susieR as input at the same time? I don't need to work from summary statistics; I can just load the genotype matrix and phenotypes.

pcarbo commented 1 year ago

I think you could do any of these—the best approach would depend on the aims of your analysis.

stephens999 commented 1 year ago

I think Question A is interesting, and although @pcarbo's suggestion might work, I have to admit I am not 100% confident.

The dominance model is an example of a more general non-linear model, and I have implemented non-linear versions of susie in a branch (susie_stumps) that was never merged with the main branch. Maybe this question could motivate us to look at that again. [It also relates to @karltayeb's work on generalizing IBSS to any setting with an SER function.]

I think Question B is related to issue #168

VitorAguiar commented 1 year ago

I'm also interested in this discussion. I have summary stats with ORs and p-values for both the additive and dominance models. For some SNPs, the dominance model gives the smaller p-value. At each SNP, should I use the best model, in which case I'd have z-scores coming from the additive model for some SNPs and from the dominance model for others?
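
For reference, a signed z-score can be recovered from an odds ratio and a p-value as sketched below. This assumes the p-value is two-sided and comes from a test that is approximately normal, which is an assumption not stated in the thread. The function name `signed_z_from_or` is hypothetical. The sketch shows only the conversion and does not address whether z-scores from different models should be mixed.

```python
import math
from statistics import NormalDist


def signed_z_from_or(odds_ratio, p_value):
    """Convert an odds ratio and a two-sided p-value to a signed z-score.

    The magnitude comes from the normal quantile of p/2 and the sign from
    log(OR): positive when OR > 1, negative when OR < 1.
    """
    if odds_ratio <= 0:
        raise ValueError("odds ratio must be positive")
    if not 0 < p_value <= 1:
        raise ValueError("p-value must be in (0, 1]")
    # inv_cdf(p/2) is non-positive for p <= 1; negate to get the magnitude.
    magnitude = -NormalDist().inv_cdf(p_value / 2.0)
    log_or = math.log(odds_ratio)
    if log_or > 0:
        return magnitude
    if log_or < 0:
        return -magnitude
    return 0.0


z_risk = signed_z_from_or(1.5, 0.05)        # OR > 1, so z is positive
z_protective = signed_z_from_or(0.8, 0.05)  # OR < 1, so z is negative
```

Working with the quantile of p/2 rather than 1 - p/2 avoids losing precision when p is very small.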

pcarbo commented 1 year ago

I looked back at the Servin and Stephens paper which discusses dominance effects. It should be quite straightforward to implement a variant of susie allowing for dominance effects, perhaps using the priors in Servin and Stephens, but this has not been done yet.

stephens999 commented 1 year ago

I would stick to the additive model until we have a version that can deal properly with dominance.