stephenslab / susieR

R package for "sum of single effects" regression.
https://stephenslab.github.io/susieR
Other
176 stars 45 forks source link

Effect of prior_weights on PIP #83

Closed jeales closed 5 years ago

jeales commented 5 years ago

Hello, I've been running susie_bhat on some of our eQTL results I've been supplying posterior probabilities, based on various annotations for each SNP to the prior_weights argument. i.e. if the SNP has annotations correlated with greater influence on gene expression then it receives a high posterior prob and this goes straight into prior_weights However I've noticed the use of prior_weights makes the interpretation of susie PIP values more complex. Without using prior_weights I observed that the sum of PIP is ~= to the value of L. This is not true when prior_weights are supplied and susie is returning multiple SNPs with PIP > 0.5 when L=10.

gaow commented 5 years ago

@jeales thanks for the report. Would you mind sharing with us here a reproducible example (saved to an R object) and the code to reproduce the behavior for us to take a look at?

xqwen commented 5 years ago

Hi,

We also came across this issue when comparing susie and dap-g. Currently, regardless of the simulation scenario (i.e. even under the null), we found the posterior expected number of signals, i.e., sum(fitted$pip) is always ~L . We tried to adjust prior_weights, but it did not have a noticeable effect.

Is L dictating prior (and posterior) probabilities by design? (We are happy to share examples, but I think you should be able to reproduce this from the distributed sample code easily.)

gaow commented 5 years ago

@jeales to follow up on the issue, we suspect that your prior_weights does not sum to one, as specified in SuSiE model (the prior_weights applies to each single effect, a one-hot vector in SuSiE model). Failing to do so will lead to different interpretation of results, which I previously thought might be desirable at users' discretion. But clearly it is causing confusions. We have now changed the code to ensure it sums to one internally. Please update the package and let us know how it works.

@xqwen Hi there! The issue you mentioned is a discussion we had earlier:

https://github.com/stephenslab/susieR/issues/76

And in fact, even long before that post @stephens999 and I have discussed off-line on it and decide to do what we are doing now at least in the SuSiE manuscript. In brief, this is by design, as it is consistent with the SuSiE model specification. However if you estimate prior variance (different from the current "default" SuSiE model) the story will be different. See examples below:

> susieR::susie(matrix(rnorm(500 * 5), nrow=500), rnorm(500), L=5)$pip
[1] 0.6372167 0.6216630 0.6275527 0.7638080 0.6919719
> susieR::susie(matrix(rnorm(500 * 5), nrow=500), rnorm(500), L=5, estimate_prior_variance=T)$pip
[1] 0 0 0 0 0
> packageVersion('susieR')
[1] ‘0.6.4.455’

(which is also used in the aforementioned ticket if you are interested in reading it through).

The behavior of estimate_prior_variance is consistent with how we defined PIP in SuSiE manuscript. For the software, I had implemented a function susie_get_pip with parameter prune_by_cs = FALSE by default. When it is set to TRUE, you get:

> res = susieR::susie(matrix(rnorm(500 * 5), nrow=500), rnorm(500), L=5)
> susieR::susie_get_pip(res,prune_by_cs=T)
[1] 0 0 0 0 0

As you can see in the other post we've decided to still keep not filtering it the default behavior ...

However, we have recently been discussing of making estimate_prior_effect = TRUE the default SuSiE model. We will follow up on this thread and update the software if it happens.

stephens999 commented 5 years ago

Yes, by default Susie fixes number of effects at exactly L. Most inferences are robust to overstating L. In particular the number of mappable effects (as determined by number of reported CS after purity filter) Is robust to this.

If you really want to estimate number of effects then you need to set estimate_prior_variance = TRUE.

On Tue, Mar 5, 2019, 09:24 gaow notifications@github.com wrote:

@jeales https://github.com/jeales to follow up on the issue, we suspect that your prior_weights does not sum to one, as specified in SuSiE model (the prior_weights applies to each single effect, a one-hot vector in SuSiE model). Failing to do so will lead to different interpretation of results, which I previously thought might be desirable at users' discretion. But clearly it is causing confusions. We have now changed the code to ensure it sums to one internally. Please update the package and let us know how it works.

@xqwen https://github.com/xqwen Hi there! The issue you mentioned is a discussion we had earlier:

76 https://github.com/stephenslab/susieR/issues/76

And in fact, even long before that thread when @stephens999 https://github.com/stephens999 and I have discussed off-line on it and decide to do what we are doing now. In brief, this is by design, as it is consistent with the SuSiE model specification. However if you estimate prior variance (different from the current "default" SuSiE model) the story will be different. See examples below:

susieR::susie(matrix(rnorm(500 5), nrow=500), rnorm(500), L=5)$pip [1] 0.6372167 0.6216630 0.6275527 0.7638080 0.6919719 susieR::susie(matrix(rnorm(500 5), nrow=500), rnorm(500), L=5, estimate_prior_variance=T)$pip [1] 0 0 0 0 0

(which is also used in the aforementioned ticket if you are interested in reading it through).

We have recently been discussing of making estimate_prior_effect = TRUE the default SuSiE model. We will follow up on this thread and update the software if it happens.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/stephenslab/susieR/issues/83#issuecomment-469721735, or mute the thread https://github.com/notifications/unsubscribe-auth/ABt4xRXAFB78qbF5EOWYuAIyduQF79pPks5vTowagaJpZM4bFWKX .

gaow commented 5 years ago

@jeales @xqwen we have just updated susieR to version 0.7.0:

https://github.com/stephenslab/susieR/releases/tag/0.7.0

With this version by default settings, you should see sum of PIP < L when you over specify L.

I'll consider this issue resolved for now -- please reopen and comment if you have follow up discussions that requires our action.