stephenslab / susieR

R package for "sum of single effects" regression.
https://stephenslab.github.io/susieR

Approximations in susie_z #80

Closed zouyuxin closed 5 years ago

zouyuxin commented 5 years ago

In simulations, I noticed that some susie_z fits produce the warning "IBSS algorithm did not converge in 100 iterations!". This happens when the total PVE is large (in these simulations, large means >= 0.5).

When I initialize at the truth, the algorithm converges to a lower objective value in some cases. In some other cases, it still fails to converge. I think this is not a convergence problem. This is caused by the approximation for equation using equation. When I fit the model using equation, the algorithm converges.

The data to reproduce the problem: X is N3finemapping from susieR, and R is the sample correlation matrix of X. They are available here:

X: https://github.com/zouyuxin/dsc-susie-z/blob/master/data/susie_X.rds

R: https://github.com/zouyuxin/dsc-susie-z/blob/master/data/susie_R.rds

The simulated data:

  1. One true effect, PVE 0.5, residual variance 0.47^2: https://github.com/zouyuxin/dsc-susie-z/blob/master/data/sim_gaussian_75.rds

  2. Five true effects, total PVE 0.8, residual variance 0.74^2: https://github.com/zouyuxin/dsc-susie-z/blob/master/data/sim_gaussian_475.rds

To check the details of the problem using the simulated data, check https://zouyuxin.github.io/dsc-susie-z/SusieZProblem.html#examples
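For readers without access to the files above, here is a minimal base-R sketch of how inputs of this kind can be built: z-scores from univariate regressions and R as the sample correlation matrix of X, with one true effect at PVE 0.5. The dimensions, seed, effect position, and variable names are assumptions chosen for illustration; this is not the simulation code used for the linked .rds files.

```r
# Illustrative simulation (all sizes and the seed are assumptions).
set.seed(1)
n <- 500   # sample size
p <- 20    # number of variables
X <- scale(matrix(rnorm(n * p), n, p))  # centered, unit-variance columns

# One true effect, with the residual variance chosen to give PVE = 0.5.
b <- rep(0, p)
b[3] <- 1
pve <- 0.5
g <- drop(X %*% b)                    # genetic (signal) component
sigma2 <- var(g) * (1 - pve) / pve    # residual variance implied by the PVE
y <- g + rnorm(n, sd = sqrt(sigma2))

# z-score for each variable from a univariate regression of y on that column.
z <- sapply(seq_len(p), function(j) {
  coefs <- summary(lm(y ~ X[, j]))$coefficients
  coefs[2, "t value"]
})

# Sample correlation matrix of X, as described above.
R <- cor(X)

cat("realized PVE:", var(g) / var(y), "\n")
```

The vector z and the matrix R are then the kind of summary-level inputs discussed in this thread.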

zouyuxin commented 5 years ago

One quick solution is to require the sample size information, n. Using susie_bhat(bhat = z, shat = 1, R = R, n = n), the model converges.

But if the summary statistics come from a meta-analysis or a mixed-effects model, it is unclear how to specify n.

stephens999 commented 5 years ago

Why does it converge when n is given? What is done differently if n is unknown? I think this would be really helpful to understand in detail.

zouyuxin commented 5 years ago

I will use equation to denote equation, z to denote equation.

I think the key statistic is equation. If we know equation, we can replace z in the BF (eqn 7.46) with equation, which gives the BF based on sufficient statistics (eqn 7.23 in the SuSiE Derivation write-up). The algorithm is then the same as when we have sufficient statistics.

Having equation is unrealistic in real life, but we can derive it using sample size n and equation, (eqn 7.12).

The equation is unknown; we can either treat it as fixed or estimate the residual variance.

stephens999 commented 5 years ago

Looking at the code I see 'susie_z' seems to call the summary function with n=2. Why is that? That doesn't seem to appear in the derivations anywhere.

zouyuxin commented 5 years ago

In the main function susie_ss, n is only used to compute the objective (the ELBO). In susie_z, we don't know n, so I set n = 2 (there are places in the function that use n - 1).

stephens999 commented 5 years ago

When I try to run SusieZProblem.Rmd, I get an error at line 121:

> fit_z = susie_z(z, R=R, max_iter = 20, L=5, track_fit = T)
Error in qt(pnorm(-abs(z)), df = n - 2) : 
  argument "n" is missing, with no default
Called from: qt(pnorm(-abs(z)), df = n - 2)

I am using susieR_0.6.4.0427

Any ideas?

zouyuxin commented 5 years ago

From the error, it looks like you have an earlier version of susieR installed. I don't know why the version number matches the current one on GitHub.

Could you re-install it? You need to install from my fork (https://github.com/zouyuxin/susieR) to fully reproduce the result (the logBF plot). I added the logBF vector for each l as output.

zx8754 commented 4 years ago

Is this issue resolved? I am still getting this warning:

Warning message: In susieR::susie_rss(z = z_scores, R = R, L = L.susie, min_abs_corr = min_abs_corr, : IBSS algorithm did not converge in 100 iterations!

R version 3.6.2 (2019-12-12)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17763)
susieR_0.9.0

zouyuxin commented 4 years ago

The issue discussed here concerned a much older version of the function, which is very different from the current susie_rss. You can try increasing the number of iterations to reach convergence. The other thing to try is increasing z_ld_weight from your original 1/20000 to 1/10000 or 1/1000.
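As a rough sketch of the first suggestion, raising the iteration limit might look like the following. The simulated data, seed, and the value 500 are assumptions for illustration only, and the call assumes the installed susieR accepts max_iter in susie_rss (the default limit of 100 matches the warning text). The susieR call is guarded so the snippet still runs when the package is not installed.

```r
# Toy inputs (assumed for illustration): z-scores and an LD matrix.
set.seed(2)
n <- 300
p <- 30
X <- scale(matrix(rnorm(n * p), n, p))
y <- drop(X[, 5] + rnorm(n))   # one true effect at column 5

# The t-statistic of a simple regression can be written in terms of the
# correlation r between the predictor and the response.
r <- drop(cor(X, y))
z_scores <- r * sqrt((n - 2) / (1 - r^2))
R <- cor(X)

# Fit with a larger iteration budget than the default of 100.
if (requireNamespace("susieR", quietly = TRUE)) {
  fit <- susieR::susie_rss(z = z_scores, R = R, L = 10, max_iter = 500)
  print(fit$converged)  # TRUE if the IBSS algorithm converged
  print(fit$niter)      # number of iterations actually used
}
```

If the fit still does not converge with a much larger limit, that suggests the problem is not simply the iteration budget.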

zx8754 commented 4 years ago

This is what I have:

length(z_scores)
# [1] 1863
dim(myPanel)
# [1] 20000  1863
dim(R)
# [1] 1863 1863
L.susie
# [1] 10
min_abs_corr
# [1] 0.5

susie_out = susieR::susie_rss(z = z_scores, R = R, L = L.susie,
                              min_abs_corr = min_abs_corr, check_z = FALSE)