stephenslab / susieR

R package for "sum of single effects" regression.
https://stephenslab.github.io/susieR

Logistic regression #74

Open drorberel opened 5 years ago

drorberel commented 5 years ago

Is logistic regression supported? If so, is it enough to have the Y argument in the susie() function to be a vector of {0,1}?

stephens999 commented 5 years ago

No, not currently supported.

If you believe effects are small (e.g. GWAS), then linear regression of (binary) Y on X will likely work OK (this could be checked by simulations).

But if you may be in a high-signal case, you will have to wait until we have logistic implemented. (We plan to do this, but it is not trivial.)

Matthew

On Wed, Nov 14, 2018 at 5:37 PM Dror Berel notifications@github.com wrote:

Is logistic regression supported? If so, is it enough to have the Y argument in the susie() function to be a vector of {0,1}?

Ojami commented 2 years ago

Hi, I was wondering if there are any updates on this? This could be important for variable selection, since SuSiE looks quite promising for feature selection compared to LASSO or ridge regression.

Best/Oveis

stephens999 commented 2 years ago

@karltayeb @andrewg3311

stephens999 commented 2 years ago

We have a couple of implementations of a logistic version now, but they are not incorporated into the package. I will let @karltayeb and @andrewg3311 point you to what is available.

andrewg3311 commented 2 years ago

Hi @Ojami, I have a package in development that is technically a superset of SuSiE, which includes a logistic version. However, since one of the goals of the package is generality beyond just SuSiE, it doesn't have the same level of SuSiE-specific interface functions built in (e.g. there is no susie_get_cs function, no plot method, etc.).

Adding such SuSiE-specific interface functions is definitely on my to-do list, but I've put adding new features on hold until I write my dissertation.

I believe, partly for this reason, @karltayeb decided to write his own implementation which is SuSiE-specific, and does have a more SuSiE-specific interface. So in the meantime, it might be easier to use his package.

But if you're interested in mine, you can find it here: https://github.com/stephenslab/VEB.Boost/ Be sure to keep an eye out for when I do finally add those SuSiE-specific helper functions!

-Andrew

karltayeb commented 2 years ago

If you look here: https://github.com/karltayeb/gseasusie there is a wrapper for VEB.Boost called fit_logistic_susie_veb_boost that uses @andrewg3311's implementation under the hood, but outputs results in a format that is more consistent with the susieR interface. There is also fit_logistic_susie, which is a bit slower and less tested, so I'd go with fit_logistic_susie_veb_boost for now. It should play nicely with the susieR plot functions and some other helper functions in susieR.

Ojami commented 2 years ago

Much appreciated!

Best Oveis

gaow commented 1 year ago

I have been asked from time to time about logistic SuSiE for GWAS. Before its official release, we (@zouyuxin with others on my team) have been doing the following:

  1. Define var_y = residual_variance = 1/(phi*(1-phi)), where phi = n_cases/total_n. Pass this as var_y in susie_rss(), setting n = total_n.
  2. For those who have a fixed prior distribution for effect size, b ~ N(0, sigma2_b), estimated using other approaches, use these parameters in susie_rss(): scaled_prior_variance = sigma2_b/var_y, standardize = FALSE, estimate_prior_variance = FALSE.
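
The arithmetic in the two steps above can be sketched as follows. This is only an illustration, written in Python rather than R: the helper name `susie_rss_binary_inputs` and the example numbers are hypothetical and not part of susieR. The dictionary keys simply mirror the susie_rss() argument names mentioned above.

```python
def susie_rss_binary_inputs(n_cases, total_n, sigma2_b=None):
    """Hypothetical helper: compute the values described in steps 1 and 2.

    It only does the arithmetic; it does not call susieR.
    """
    if not 0 < n_cases < total_n:
        raise ValueError("n_cases must be strictly between 0 and total_n")

    # Step 1: phi is the case fraction; var_y = 1 / (phi * (1 - phi)).
    phi = n_cases / total_n
    var_y = 1.0 / (phi * (1.0 - phi))
    inputs = {"n": total_n, "var_y": var_y}

    # Step 2 (optional): with a fixed prior b ~ N(0, sigma2_b),
    # rescale the prior variance by var_y and switch off the estimation options.
    if sigma2_b is not None:
        inputs["scaled_prior_variance"] = sigma2_b / var_y
        inputs["standardize"] = False
        inputs["estimate_prior_variance"] = False
    return inputs


# Made-up example: 2,000 cases out of 10,000 samples, prior variance 0.04.
inputs = susie_rss_binary_inputs(n_cases=2000, total_n=10000, sigma2_b=0.04)
```

With these made-up numbers phi is 0.2, so var_y is about 6.25 and scaled_prior_variance about 0.0064. Note that var_y grows quickly as the case fraction moves away from 0.5.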

@zouyuxin anything to add? It's been a while and I don't do a lot of GWAS these days, so please comment.

stephens999 commented 1 year ago

Is there anything specific about "logistic susie for GWAS" vs logistic susie in general? We (@karltayeb) are continuing to work on logistic susie; it has raised more methodological issues than we expected, but I think we have multiple approaches that can work.

gaow commented 1 year ago

@stephens999 sorry, I made up the phrase "logistic SuSiE for GWAS". The questions I get are mainly whether SuSiE can analyze case/control data. For that, I mention the ongoing work on Logistic SuSiE, which is a general approach that can be applied to case/control data (@karltayeb has looped me in on the challenges). Or, for the time being, I suggest leveraging the SuSiE RSS model rather than susieR::susie(x, y) with y as case/control status, because at least in the applications I was consulted on, the case/control data are unbalanced and the total sample size is not very large.

In large-scale applications such as UKB, perhaps SuSiE RSS should be the only approach for computational reasons, even once we have figured out Logistic SuSiE?