Open drorberel opened 5 years ago
no, not currently supported.
If you believe effects are small (e.g. GWAS), then linear regression of (binary) Y on X will likely work OK (this could be checked by simulations).
But if you may be in a high-signal case, you will have to wait until we have logistic implemented. (We plan to do this, but it is not trivial.)
Matthew
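A simulation check of the kind suggested above could look roughly like the following. This is only a sketch in base R: the sample size, allele frequency, intercept, and effect size are made-up values for illustration, and the final `susieR::susie()` call runs only if susieR is installed.

```r
set.seed(1)

# Illustrative settings (assumptions, not taken from the thread).
n   <- 5000   # samples
p   <- 20     # variables
maf <- 0.3    # allele frequency for genotype-like predictors

X <- matrix(rbinom(n * p, size = 2, prob = maf), nrow = n, ncol = p)
beta <- rep(0, p)
beta[3] <- 0.1                       # one small effect on the log-odds scale
y <- rbinom(n, size = 1, prob = plogis(-1 + X %*% beta))

# Marginal test statistics from linear vs logistic regression, one variable at a time.
z_lin <- apply(X, 2, function(x)
  summary(lm(y ~ x))$coefficients["x", "t value"])
z_log <- apply(X, 2, function(x)
  summary(glm(y ~ x, family = binomial))$coefficients["x", "z value"])

# If the linear approximation is adequate, these should agree closely.
print(cor(z_lin, z_log))
print(max(abs(z_lin - z_log)))

# Optional: fit SuSiE directly on the binary outcome, as discussed above.
if (requireNamespace("susieR", quietly = TRUE)) {
  fit <- susieR::susie(X, y, L = 5)
  print(fit$sets$cs)
}
```

Repeating this with larger effect sizes is one way to see where the two sets of statistics start to diverge, which is the high-signal regime mentioned above.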
On Wed, Nov 14, 2018 at 5:37 PM Dror Berel notifications@github.com wrote:
Is logistic regression supported? If so, is it enough to have the Y argument in the susie() function to be a vector of {0,1}?
Hi, I was wondering if there are any updates on this? This could be important for variable selection, since SuSiE looks quite promising for feature selection when compared to LASSO or Ridge.
Best/Oveis
@karltayeb @andrewg3311
We have a couple of implementations of a logistic version now, but they are not incorporated into the package. I will let @karltayeb and @andrewg3311 point you to what is available.
Hi @Ojami ,
I have a package in development that is technically a superset of SuSiE, which includes a logistic version. However, since one of the goals of the package is generality beyond just SuSiE, it doesn't have the same level of SuSiE-specific interface functions built in (e.g. there is no susie_get_cs function, no plot method, etc.).
Adding such SuSiE-specific interface functions is definitely on my to-do list, but I've put adding new features on hold until I write my dissertation.
I believe, partly for this reason, @karltayeb decided to write his own implementation which is SuSiE-specific, and does have a more SuSiE-specific interface. So in the meantime, it might be easier to use his package.
But if you're interested in mine, you can find it here: https://github.com/stephenslab/VEB.Boost/ Be sure to keep an eye out for when I do finally add those SuSiE-specific helper functions!
-Andrew
If you look here: https://github.com/karltayeb/gseasusie
there is a wrapper for VEB boost called fit_logistic_susie_veb_boost that uses @andrewg3311's implementation under the hood, but outputs results in a format that is more consistent with the susieR interface. There is also fit_logistic_susie, which is a bit slower and less tested, so I'd go with fit_logistic_susie_veb_boost for now.
It should play nicely with the susieR plot functions and some other helper functions in susieR.
Much appreciated!
Best Oveis
I have been asked from time to time about logistic SuSiE for GWAS. Before its prime-time release, we (@zouyuxin with others on my team) have been doing the following:
1. Compute var_y = residual_variance = 1/(phi*(1-phi)), where phi = n_cases/total_n. Use this for var_y in susie_rss(), setting n=total_n.
2. If a prior on the effect sizes, b ~ N(0, sigma2_b), has been estimated using other approaches, then in susie_rss() use these parameters: scaled_prior_variance = sigma2_b/var_y, standardize = FALSE, estimate_prior_variance = FALSE
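The two steps above amount to a small amount of arithmetic, sketched here in base R. The helper `logistic_rss_settings()` is hypothetical (it is not part of susieR), and the case counts and `sigma2_b` are made-up numbers for illustration.

```r
# Hypothetical helper (not part of susieR): turn case/control counts and an
# externally estimated prior variance into the susie_rss() settings listed above.
logistic_rss_settings <- function(n_cases, total_n, sigma2_b) {
  phi   <- n_cases / total_n        # fraction of cases
  var_y <- 1 / (phi * (1 - phi))    # residual variance, used as var_y
  list(n = total_n,
       var_y = var_y,
       scaled_prior_variance = sigma2_b / var_y,
       standardize = FALSE,
       estimate_prior_variance = FALSE)
}

# Made-up numbers for illustration only.
settings <- logistic_rss_settings(n_cases = 2000, total_n = 10000,
                                  sigma2_b = 0.04)
str(settings)

# With per-variant estimates bhat and shat and an LD matrix R in hand, the
# call would look roughly like this (requires susieR; argument names follow
# the comment above, so check ?susie_rss for your installed version):
# fit <- do.call(susieR::susie_rss,
#                c(list(bhat = bhat, shat = shat, R = R), settings))
```

Keeping the settings in one list makes it easy to reuse the same values across loci and to see exactly which defaults are being overridden.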
@zouyuxin anything to add? It's been a while and I don't do a lot of GWAS these days, so please comment.
is there anything specific about "logistic susie for GWAS" vs logistic susie in general? We (@karltayeb) are continuing to work on logistic susie; it has raised more methodological issues than we expected, but I think we have multiple approaches that can work.
@stephens999 sorry, I made up the phrase "logistic SuSiE for GWAS". The questions I get are mainly "can SuSiE analyze case/control data?" -- for that I mention the ongoing work on Logistic SuSiE, which is a general approach that can be applied to case/control data (@karltayeb has looped me in on the challenges); or, for the time being, I suggest leveraging the SuSiE RSS model rather than SuSiE as implemented in susieR::susie(x,y), where y is case/control status, because at least in the applications I was consulted on, the case/control data are unbalanced although the total sample sizes are not exactly very large.
In large-scale applications such as UKB, perhaps SuSiE RSS should be the only approach for computational reasons, even once we have figured out Logistic SuSiE?