Open tianwen0003 opened 2 years ago
we recommend to use estimate_residual_variance = FALSE
with susie_rss
If the PIP=0 with this setting then this suggests there are no strong signals in the region. Do you have strong signals in the region?
sorry for my late response.
estimate_residual_variance = FALSE
for all runs or the runs reported error?@tianwen0003 Which link cannot be opened exactly?
Sorry, the link is in this https://stephenslab.github.io/susie-paper/finemapping_benchmark.html
Thanks. It should be this: https://stephenslab.github.io/susie-paper/manuscript_results/GTEx_Association_Preprocessing.html
Actually, the signals in the regions are not weak, instead, the signals are very strong, the nominal p-values are range from 8.134e-05 to 3.09152e-50.
then PIP should not be 0 with estimate_residual_var = FALSE
. Something is wrong. Can you try with L=1
?
Do you mean to use estimate_residual_variance = FALSE for all runs or the runs reported error?
When using summary data (susie_rss
) the recommendation is always FALSE.
When using individual level data (susie
) the recommendation is TRUE
(And these are the software defaults).
then PIP should not be 0 with estimate_residual_var = FALSE. Something is wrong. Can you try with L=1?
Sorry, I checked the results, L=10
and L=1
did get non-zero PIP and 95% credible set.
When using summary data (susie_rss) the recommendation is always FALSE.
I found in the guide from https://stephenslab.github.io/susieR/articles/finemapping_summary_statistics.html, the suggestion is set estimate_residual_var = TRUE
when use in-sample LD matrix, which is the data type I have. so, in my pipeline, I should set it to TRUE, right?
Hi
I just got the same error(2 out of 3000 runs) as #90 when did fine-mapping by susie_rss function.
*Error in susie_suff_stat(XtX = XtX, Xty = Xty, n = n, yty = (n - 1) : Estimating residual variance failed: the estimated value is negative Calls: susie_rss -> susie_suff_stat Execution halted**
when set estimate_residual_variance = FALSE, the function run well but all pip are zero
Here, I provide the all files for you to test ENSG00000149084.7.vcf.gz
ENSG00000149084.7_fastqtl.zip In this file, the second column is the "slope" from fastqtl, the thrid column is the "slope_se" from fastqtl
ENSG00000149084.7_ldMatrix.zip This is the ld matrix calculated from plink --r
the sample number n = 204
var_y = 0.960491
can you provide the code you run to apply susie to these input files so we can reproduce the problem?
can you provide the code you run to apply susie to these input files so we can reproduce the problem?
Sure, the code and input files are follows: run_susieR.zip ENSG00000149084.7.data.zip ENSG00000149084.7.snpList.zip ENSG00000149084.7.ld.zip
I'm in the middle of something else and have to run now ... but here is what I have so far:
library(stringr)
library(susieR)
set.seed(1)
gene <- c("ENSG00000149084.7")
print(gene)
qtlResult <- read.table(file = str_c(gene,".data"),sep = "\t",header = F)
ldMatrix <- as.matrix(read.table(file = str_c(gene,".ld"),sep = " ",header = F))
ldMatrix <- ldMatrix[1:dim(ldMatrix)[1],1:dim(ldMatrix)[1]]
expression_var <- 0.960491
sampleSize <- 204
fitted_rss1 <- susie_rss(bhat = qtlResult$V2,shat = qtlResult$V3,
n = sampleSize,R = ldMatrix,
var_y = expression_var,
L = 10,
estimate_residual_variance = TRUE)
I get
WARNING: XtX is not symmetric; forcing XtX to be symmetric by replacing XtX with (XtX + t(XtX))/2
and negative estimate for residual variance. Then I tried:
isSymmetric(ldMatrix) # FALSE
krig_res = kriging_rss(qtlResult$V2/qtlResult$V3, ldMatrix, sampleSize)
print(krig_res$plot)
I get
WARNING: The matrix R is not positive semidefinite. Negative eigenvalues are set to zero.
Certainly something is wrong with the LD matrix that is supposed to match the genotype; however there is no large difference between observed and expected z-scores. I dont think we have your genotype data to look further into this discrepency.
@gaow @zouyuxin Would you recommend that they try using the methods suggested in the susie-rss bioRxiv paper to adjust their LD matrix?
The report was that there were problems (PIP=0) even with L=1, for which LD should be irrelevant. So can we check L=1?
There is one CS using L=1, the estimated residual variance is 0.291. The CS contains 24 variants (perfectly correlated), each with PIP 0.04. With L=3, it finds 2 CSs, and the estimated residual variance is 0.242. The correlation between the top variant in each CS is 0.72. The variants in CS1 have z scores -21.55285, and the variants in CS2 have z scores around -8. I get the negative estimated residual variance error as I increase L = 4.
With estimated residual variance = FALSE, we identify 1 CS with L = 1 - 10.
@pcarbo the estimated regularization parameter lambda is 0.0027, the adjusted one also gives an error at L = 4 (estimate_residual_variance = TRUE). The smallest eigenvalue for the LD matrix is -1.6. The LD matrix is from plink, and it's a known issue that there are rounding and numerical errors.
@pcarbo @gaow @tianwen0003 Do you know whether the covariate effects are removed from the genotypes in fastqtl?
@zouyuxin good point about the covariates. The genotype data does not remove the covariates from it -- that'd be one reason the kriging plot is not perfect because the z score and LD can be a bit off because of the covarites is not removed from genotypes. In any case, the data used in susie_rss here is not exactly the "in sample LD" by our definition.
So with estimate residual variance =False is everything ok?
On Fri, Jun 10, 2022, 14:11 gaow @.***> wrote:
@zouyuxin https://github.com/zouyuxin good point about the covariates. The genotype data does not remove the covariates from it -- that'd be one reason the kriging plot is not perfect because the z score and LD can be a bit off because of the covarites is not removed from genotypes. In any case, the data used in susie_rss here is not exactly the "in sample LD" by our definition.
— Reply to this email directly, view it on GitHub https://github.com/stephenslab/susieR/issues/162#issuecomment-1152656312, or unsubscribe https://github.com/notifications/unsubscribe-auth/AANXRRP5ZBK3NUTD7Q2HGGLVOOHNTANCNFSM5W4CSZDA . You are receiving this because you commented.Message ID: @.***>
The original report was that the Pips were all 0 in that case. Can we reproduce that?
On Fri, Jun 10, 2022, 14:24 Matthew Stephens @.***> wrote:
So with estimate residual variance =False is everything ok?
On Fri, Jun 10, 2022, 14:11 gaow @.***> wrote:
@zouyuxin https://github.com/zouyuxin good point about the covariates. The genotype data does not remove the covariates from it -- that'd be one reason the kriging plot is not perfect because the z score and LD can be a bit off because of the covarites is not removed from genotypes. In any case, the data used in susie_rss here is not exactly the "in sample LD" by our definition.
— Reply to this email directly, view it on GitHub https://github.com/stephenslab/susieR/issues/162#issuecomment-1152656312, or unsubscribe https://github.com/notifications/unsubscribe-auth/AANXRRP5ZBK3NUTD7Q2HGGLVOOHNTANCNFSM5W4CSZDA . You are receiving this because you commented.Message ID: @.***>
The original report was that the Pips were all 0 in that case.
it seems a false alarm according to https://github.com/stephenslab/susieR/issues/162#issuecomment-1144322104
So with estimate residual variance =False is everything ok?
yes, there is no error or pip all =0 when estimate residual variance = FALSE.
Ok, good then it seems all is well. Maybe we should warn against using estimate residual variance =True with summary data, even with in sample ld? Or just not suggest it.... It seems difficult to be precise about the circumstances when it is ok to use?
@pcarbo @gaow @tianwen0003 Do you know whether the covariate effects are removed from the genotypes in fastqtl?
We used the PC1 and PC2 of genotype as covariants for fastqtl
Hi
I just got the same error(2 out of 3000 runs) as #90 when did fine-mapping by susie_rss function.
*Error in susie_suff_stat(XtX = XtX, Xty = Xty, n = n, yty = (n - 1) : Estimating residual variance failed: the estimated value is negative Calls: susie_rss -> susie_suff_stat Execution halted**
when set estimate_residual_variance = FALSE, the function run well but all pip are zero
Here, I provide the all files for you to test ENSG00000149084.7.vcf.gz
ENSG00000149084.7_fastqtl.zip In this file, the second column is the "slope" from fastqtl, the thrid column is the "slope_se" from fastqtl
ENSG00000149084.7_ldMatrix.zip This is the ld matrix calculated from plink --r
the sample number n = 204
var_y = 0.960491