stephenslab / susieR

R package for "sum of single effects" regression.
https://stephenslab.github.io/susieR
Other
176 stars 46 forks source link

discrepancy in susie and susie_rss pip #147

Open gk1610 opened 2 years ago

gk1610 commented 2 years ago

For same set of zscores, I see discrepancy in fitted$pip (using susie) and fitted_rss$pip (using susie_rss). There are two types of cases that I observe

1) when fitted_rss returns credible variant but fitted does not along with zero pip (top plot)

2) when fitted_rss returns credible variant but fitted does not along with non-zero pip (bottom plot)

https://www.dropbox.com/s/13uhohmmedwdi7f/Screen%20Shot%202021-11-15%20at%204.39.11%20PM.png?dl=0

Is there any explanation behind this?

I get zscores using two commands below z_scores=qnorm(pvalues from univariate regression) z_scores=sign(beta)*abs(z_scores)

Thanks so much!

gaow commented 2 years ago

@gk1610 are you using LD data computed from the X matrix you input to susie()? Also "z_scores=qnorm(pvalues from univariate regression)" should be "z_scores=qnorm(pvalues / 2)"?

gk1610 commented 2 years ago

1) z_scores=qnorm(pvalues from univariate regression); thats true. I have these pvalues from limma pipeline. For example see toptable results here https://ucdavis-bioinformatics-training.github.io/2018-June-RNA-Seq-Workshop/thursday/DE.html

2) I use X= correlation of genes which I know are highly correlated. My question is which gene is a causal gene in X matrix. I input zscores using pvalues from limma pipeline. I test effect of age ~ gene expression.

It should work, right? same concept but on genes and differential analysis results which are obtained through univariate regression.

stephens999 commented 2 years ago

you can't (or at least should not) use susie on z scores. susie is designed to be given X and y.

and for susie_rss gaow is right you need to use qnorm(p/2), or some variant on that, so p=1 produces a z score of 0.

If you give runnable code to reproduce your problem it will greatly reduce the chances of miscommunication.

On Mon, Nov 15, 2021 at 5:09 PM gk1610 @.***> wrote:

1.

z_scores=qnorm(pvalues from univariate regression); thats true. I have these pvalues from limma pipeline. For example see toptable results here https://ucdavis-bioinformatics-training.github.io/2018-June-RNA-Seq-Workshop/thursday/DE.html 2.

I use correlation of genes which I know are highly correlated because I want to know which gene is a causal gene in that particular region of highly correlated genes.

It should right? same concept but on genes and differential analysis results which are obtained through univariate regression.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/stephenslab/susieR/issues/147#issuecomment-969415626, or unsubscribe https://github.com/notifications/unsubscribe-auth/AANXRRLPLX6OCF6IHNJ5ZJ3UMGHJBANCNFSM5ICXEEYA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

stephens999 commented 2 years ago

i got your email with example data. to run the code i had to add library(data.table). But I still get an error:

snp_list=names(zscores_value1) Error: object 'zscores_value1' not found