stephenslab / susieR

R package for "sum of single effects" regression.
https://stephenslab.github.io/susieR

Inclusion of "standard" covariates? #82

Closed auton1 closed 3 years ago

auton1 commented 5 years ago

Hi SuSiE,

This looks to be a very exciting package, and I'm keen to try it out in the context of GWAS fine-mapping. Thank you for making your work available.

I'd like to know whether there is a way to include a set of "standard" covariates in the GWAS context. Currently, if I understand correctly, SuSiE allows one to fit a model like $y=Xb+e$, and will identify non-zero effects and estimate credible set(s) across all variables in X. However, in the GWAS context, one may also want to include a set of covariates (such as age, sex, PCs, etc.). Is there any way to do this in SuSiE? I could, for example, just include these covariates in the X matrix, but then I would be "wasting" my budget of non-zero effects when looking for credible sets.

Thanks again, and I hope my question makes sense.

Adam

gaow commented 5 years ago

@auton1 for a quantitative trait, you can "remove" the covariates beforehand: regress the trait on the covariates only, take the residuals, and use those residuals as the input to susieR. For example,

y = residuals(lm(y~Z, na.action=na.exclude))

where Z is the covariate matrix.
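As a minimal sketch of this suggestion, the snippet below simulates a small data set, residualizes the trait on the covariates, and passes the residuals to `susie()`. The simulated data, the covariate names `age` and `sex`, and the choice `L = 10` are all assumptions made for illustration; they are not taken from this thread.

```r
set.seed(1)
n <- 200; p <- 50

# Hypothetical covariates and simulated genotypes (illustration only).
Z <- cbind(age = rnorm(n, 50, 10), sex = rbinom(n, 1, 0.5))
X <- matrix(rbinom(n * p, 2, 0.3), n, p)
y <- 0.5 * X[, 5] + 0.03 * Z[, "age"] + rnorm(n)

# Residualize y on the covariates; lm() adds the intercept itself.
y_resid <- residuals(lm(y ~ Z, na.action = na.exclude))

# Fit SuSiE on the residuals, if the package is installed.
if (requireNamespace("susieR", quietly = TRUE)) {
  fit <- susieR::susie(X, y_resid, L = 10)
  print(fit$sets$cs)
}
```

Because the residuals are orthogonal to every column of Z, the covariates no longer compete with the variables in X for the single effects.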

auton1 commented 5 years ago

Thanks. That makes sense. I also saw a comment that you're considering expanding SuSiE to handle logistic regression, where this approach wouldn't be an option. However, I guess I'll punt the question until logistic regression becomes an option :-)

Again, thanks.

stephens999 commented 5 years ago

Yes, just regress your fixed covariates out of both Y and genotypes and run Susie on the residuals.

This is a common question so maybe we should automate this ...


stephens999 commented 5 years ago

Ahh, I did not see the previous response.

My solution is slightly different because it involves regressing out of the genotypes too, not just y. I think this is more justified, although it may not make much difference in practice...
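One way to see the difference is with ordinary least squares rather than SuSiE itself. By the Frisch-Waugh-Lovell theorem, residualizing both y and X on Z reproduces the coefficients of X from the joint regression of y on X and Z, whereas residualizing only y generally does not when X is correlated with Z. The sketch below uses simulated data, and all variable names are hypothetical.

```r
set.seed(2)
n <- 300

# Hypothetical covariates; the columns of X are made correlated with Z[, 1].
Z <- cbind(rnorm(n), rbinom(n, 1, 0.5))
X <- matrix(rnorm(n * 3), n, 3) + Z[, 1]
y <- drop(X %*% c(0.4, 0, 0)) + 0.8 * Z[, 1] + rnorm(n)

# Residualize both sides. lm() accepts a matrix response and returns
# one column of residuals per column of X.
y_res <- residuals(lm(y ~ Z))
X_res <- residuals(lm(X ~ Z))

b_joint  <- coef(lm(y ~ X + Z))[2:4]    # effects of X adjusted for Z
b_both   <- coef(lm(y_res ~ X_res))[2:4] # y and X both residualized
b_y_only <- coef(lm(y_res ~ X))[2:4]     # only y residualized

print(rbind(b_joint, b_both, b_y_only))
```

In this setup `b_both` matches `b_joint` up to numerical error, while `b_y_only` does not. How much this matters for SuSiE output in practice depends on how strongly the genotypes correlate with the covariates.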


pcarbo commented 5 years ago

@auton1 @gaow varbvs, which can be considered a predecessor to susieR, allows for additional covariates (in the "Z" input argument). The code for handling these covariates is quite straightforward, although it does introduce some subtleties in terms of interpreting the outputs:

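library(Matrix)  # provides forceSymmetric()

# Regress the covariates Z (n x m, including the intercept column) out of
# both the matrix X (n x p) and the outcome y (length n).
remove.covariate.effects <- function (X, Z, y) {
  # Z'Z, stored as a symmetric Matrix object.
  A   <- forceSymmetric(crossprod(Z))
  # Least-squares coefficients of y on Z, and of each column of X on Z.
  SZy <- as.vector(solve(A,c(y %*% Z)))
  SZX <- as.matrix(solve(A,t(Z) %*% X))

  # This should give the same result as centering the columns of X
  # and subtracting the mean from y when we have only one
  # covariate, the intercept.
  y <- y - c(Z %*% SZy)
  X <- X - Z %*% SZX

  return(list(X = X,y = y,SZy = SZy,SZX = SZX))
}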

Note that in my case I included the intercept as one of the columns of "Z". (And when there are no covariates, Z is just a column vector of ones.) Hope that is helpful.
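The comment in the function about the intercept-only case can be checked with a short base-R sketch. This is not the varbvs code: it performs the same projection with `qr.resid()` instead of solving the normal equations, and it uses simulated data.

```r
set.seed(3)
n <- 100; p <- 4
X <- matrix(rnorm(n * p), n, p)
y <- rnorm(n)

# No covariates: Z is just a column of ones (the intercept).
Z <- matrix(1, n, 1)

# Project Z out of X and y using the QR decomposition of Z.
qrZ   <- qr(Z)
X_res <- qr.resid(qrZ, X)
y_res <- qr.resid(qrZ, y)

# Compare with plain centering.
X_centered <- scale(X, center = TRUE, scale = FALSE)
print(max(abs(X_res - X_centered)))
print(max(abs(y_res - (y - mean(y)))))
```

Both printed differences should be at the level of floating-point error. With additional columns in Z, the same two `qr.resid()` calls remove the covariate effects from X and y.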

gaow commented 5 years ago

Thanks @pcarbo and @stephens999 for pointing out the subtle yet relevant difference. I'll adapt the code to a vignette (https://stephenslab.github.io/susieR/articles/finemapping.html) for a demonstration and point out the caveats.

maguileraf commented 8 months ago

> @auton1 for quantitative trait, you can "remove" covariates beforehand by obtaining the residual in the regression analysis involving covariates only, then use that residual as input to susieR. For example,
>
> y = residuals(lm(y~Z, na.action=na.exclude))
>
> where Z is covariate matrix.

@gaow for a binary trait, how is it recommended to "remove" covariates?

gaow commented 8 months ago

@maguileraf for binary traits you can either treat them as quantitative traits (particularly when the 0 and 1 responses are balanced) and apply the suggestions above, or fit a logistic regression that accounts for the covariates to compute summary statistics, then run susie_rss() on those summary statistics.
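A minimal sketch of the second option follows. It assumes per-variant z-scores from `glm()` with the covariates in the model, and an LD matrix computed as the sample correlation of the same genotypes. The data are simulated, the variable names are hypothetical, and the `n` argument of `susie_rss()` may not be available in older susieR versions.

```r
set.seed(4)
n <- 500; p <- 20

# Hypothetical covariates, simulated genotypes, and a binary trait.
Z <- cbind(rnorm(n), rbinom(n, 1, 0.5))
X <- matrix(rbinom(n * p, 2, 0.3), n, p)
eta <- -1.5 + 0.6 * X[, 3] + 0.5 * Z[, 1]
y <- rbinom(n, 1, plogis(eta))

# One logistic regression per variant, adjusting for the covariates.
# Row 2 of the coefficient table is the variant; row 1 is the intercept.
z <- vapply(seq_len(p), function(j) {
  fit <- glm(y ~ X[, j] + Z, family = binomial())
  summary(fit)$coefficients[2, "z value"]
}, numeric(1))

# LD matrix from the same genotypes (an assumption of this sketch).
R <- cor(X)

if (requireNamespace("susieR", quietly = TRUE)) {
  fit <- susieR::susie_rss(z = z, R = R, n = n, L = 10)
  print(fit$sets$cs)
}
```

In a real analysis the z-scores would usually come from a GWAS tool rather than a loop over `glm()` fits, and the LD matrix should be computed from the same samples that produced the summary statistics.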

maguileraf commented 8 months ago

@gaow thank you for your prompt response. I tried the second option with REGENIE, and for some reason my LD matrix is not concordant with the REGENIE results, even though I computed it from the same data I used for REGENIE. So I am now trying susie() instead to see whether it works. My 0 vs 1 responses are not balanced, though.