chr1swallace / coloc

Repo for the R package coloc

problem with credible sets in susie branch #46

Closed matei-ionita closed 3 years ago

matei-ionita commented 3 years ago

Hi Chris,

I'm running the "susie" branch of coloc, using the latest version of susieR@master, and I noticed strange behavior in the Susie vignette. Your example dataset D3 is supposed to have two credible sets, corresponding to variants s25 and s25.1. However, the output that I get is:

S3 <- runsusie(D3,nref=503)
running iterations: 100
converged: TRUE
summary(S3)

Variables in credible sets:

 variable variable_prob cs
      101             1  2

Credible sets summary:

 cs cs_log10bf cs_avg_r2 cs_min_r2 variable
  2 0.01342722         1         1      101

I obtain the correct output if I try to run Susie without the null_weight parameter, like so:

z <- D3$beta/sqrt(D3$varbeta)
R <- D3$LD
S3noPrior <- susieR::susie_rss(z=z, R=R, z_ld_weight=1/503)
summary(S3noPrior)

Variables in credible sets:

 variable variable_prob cs
       25             1  2
       75             1  3

Credible sets summary:

 cs cs_log10bf cs_avg_r2 cs_min_r2 variable
  2   22.78721         1         1       25
  3   22.78702         1         1        7

However, getting rid of null_weight leads to downstream errors in the coloc pipeline. Can you help me figure out what's happening?

Best, Matei

chr1swallace commented 3 years ago

Dear Matei,

Thank you for spotting this. I'm not sure whether something has changed in susie, because I do think it used to work correctly on D3. Anyway, I have now made the signals stronger in the simulated dataset, so it will find them.

The alternative is to pass p=NULL or null_weight=NULL to runsusie() to avoid null_weight getting set. (You need to grab the latest coloc to enable this.)

Chris

S3=runsusie(D3,nref=503)
running iterations: 100
converged: TRUE
summary(S3)

Variables in credible sets:

 variable variable_prob cs
       25             1  1
       75             1  2

Credible sets summary:

 cs cs_log10bf cs_avg_r2 cs_min_r2 variable
  1   51.57052         1         1       25
  2   51.57357         1         1       75

S3=runsusie(D3,nref=503,null_weight=NULL)
running iterations: 100
converged: TRUE
summary(S3)

Variables in credible sets:

 variable variable_prob cs
       25             1  1
       75             1  2

Credible sets summary:

 cs cs_log10bf cs_avg_r2 cs_min_r2 variable
  1   53.57112         1         1       25
  2   53.57401         1         1       75


matei-ionita commented 3 years ago

Thank you so much for addressing this right away. I installed the latest version of the susie branch, and indeed the credible sets for D3 look good now. Although it seems that you increased the z-score to 12, something that I can't afford to do with real data :) The low power could be because of susie, so I'll try to understand that method better before I bother you with other questions about this.

Unfortunately, the alternative of passing p=NULL or null_weight=NULL still doesn't work for me. The coloc.susie function expects there to be a column called "null":

S3 <- runsusie(D3,nref=503, p=NULL)
running iterations: 100
converged: TRUE
S4 <- runsusie(D4,nref=503, p=NULL)
running iterations: 100
converged: TRUE
susie.res <- coloc.susie(S3,S4)
Using 100/ 99 and 99 available
Error in bf_unscaled[, "null"] : subscript out of bounds
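
For readers who hit the same message: "subscript out of bounds" is what base R reports when a matrix is indexed by a column name it does not have. The sketch below reproduces that with a toy matrix. The matrix `bf`, its values, and its column names are made up for illustration and are not coloc's internal objects.

```r
# Toy matrix standing in for a Bayes factor matrix (hypothetical values).
# Columns are named by variant only, so there is no "null" column.
bf <- matrix(c(1.2, 0.4, 3.1, 0.9), nrow = 2,
             dimnames = list(NULL, c("s25", "s25.1")))

# Indexing by a missing column name raises an error; capture its message.
res <- tryCatch(bf[, "null"], error = function(e) conditionMessage(e))
print(res)

# A defensive check that can be run before indexing.
has_null <- "null" %in% colnames(bf)
print(has_null)
```

Checking `colnames()` on the object in question is a quick way to confirm whether a "null" column was produced by the upstream call.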

chr1swallace commented 3 years ago

I didn't test coloc.susie, sorry, but it's an easy fix for me to make on Monday.

http://chr1swallace.github.io


matei-ionita commented 3 years ago

Sounds good, thank you!

chr1swallace commented 3 years ago

Hi Matei,

Apologies, it took me a little longer to get to this. I realised we can't set null_weight=NULL, because I need SuSiE to give me a posterior probability for the null hypothesis so I can back-calculate the Bayes factors needed for coloc. Setting null_weight to different values does seem to affect the SuSiE inference (not unreasonably, of course!). I played with a few datasets to find a default that worked for me, but this is now a parameter exposed to the user, so you can try other values if the default isn't any good.

Please see https://chr1swallace.github.io/coloc/articles/a06_SuSiE.html#important-notes-on-running-susie-1 and let me know how you get on.

best, C
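
As a rough illustration of why a null posterior is needed for the back-calculation: by Bayes' rule, the Bayes factor for a variant against the null is the posterior odds divided by the prior odds. The sketch below uses made-up priors and posteriors in base R. It is not coloc's implementation, and the variant names, the `null_weight` value of 0.5, and the probabilities are all assumptions chosen for the example.

```r
# Hypothetical prior: three variants share what is left after null_weight.
null_weight <- 0.5                        # assumed value, illustration only
prior <- c(rep((1 - null_weight) / 3, 3), null_weight)
names(prior) <- c("v1", "v2", "v3", "null")

# Hypothetical posterior probabilities over the same entries (sum to 1).
post <- c(v1 = 0.90, v2 = 0.05, v3 = 0.01, null = 0.04)

# Bayes factor versus the null = posterior odds / prior odds.
bf <- (post / post["null"]) / (prior / prior["null"])
log_bf <- log(bf)
print(round(log_bf, 3))
```

Without a "null" entry there is no `post["null"]` to divide by, so the ratio cannot be formed. That matches the point above about needing a posterior probability for the null hypothesis.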
