stephenslab / susieR

R package for "sum of single effects" regression.
https://stephenslab.github.io/susieR
Other
174 stars 44 forks source link

Residual Variance Estimate Failed - Negative Value #90

Closed hsowards closed 4 years ago

hsowards commented 4 years ago

I am using susieR for fine-mapping SNP data and I have encountered a problem with the susie_bhat function. We’re using summary statistics (z score and standard error, and ), an LD matrix (r, not r^2), and omitting variance of y since this is a case-control design. I've been following the code described in the vignette here but when I run the code:

fitted_bhat_standardize <- susie_bhat(bhat = sumstats_clean$Z_fixed, shat = sumstats_clean$SE_fixed, R = Rmatrix, n = nrow(sumstats), L = 6, scaled_prior_variance = 0.1, estimate_residual_variance = TRUE, estimate_prior_variance = FALSE)

I'm getting this error:

Error in susie_ss(XtX = XtX, Xty = Xty, yty = var_y * (n - 1), n = n, : Estimating residual variance failed: the estimated value is negative

I don't have any intensive stats training so I'm not sure where I've gone wrong, if this error is from our data or from the code. Any help would be greatly appreciated!

gaow commented 4 years ago

@hsowards could you run again with estimate_residual_variance = FALSE and see if it gives reasonable result?

We currently recommend using susie_rss() when you work with summary statistics -- it models distribution of z-scores directly. We need to update that vignette, but the documentation for susie_rss should be available if you type ?susieR::susie_rss.

hsowards commented 4 years ago

@gaow I forgot to mention I did successfully run it with estimate_residual_variance = FALSE but there were no PIP values, which is something we want.

I will try using susie_rss(), thank you!

stephens999 commented 4 years ago

can you clarify what you mean by "no PIP values"?

On Wed, Nov 20, 2019 at 9:24 AM hsowards notifications@github.com wrote:

@gaow https://github.com/gaow I forgot to mention I did successfully run it with estimate_residual_variance = FALSE but there were no PIP values, which is something we want.

I will try using susie_rss(), thank you!

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/stephenslab/susieR/issues/90?email_source=notifications&email_token=AANXRRMIKLLOIGWI5USJDDLQUVJEHA5CNFSM4JPJOG4KYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEESLDKY#issuecomment-556052907, or unsubscribe https://github.com/notifications/unsubscribe-auth/AANXRRIFJGIVJ77DQSGSDM3QUVJEHANCNFSM4JPJOG4A .

hsowards commented 4 years ago

Apologies, all of the occurrences within the pip vector produced by the susie_bhat() function are zero

zouyuxin commented 4 years ago

@hsowards When you run susie_bhat with estimate_residual_variance = FALSE, how did you set estimate_prior_variance? With estimate_prior_variance = TRUE, it's possible to get pip = 0. But with estimate_prior_variance = FALSE and estimate_residual_variance = FALSE, pip would not be 0.

hsowards commented 4 years ago

@zouyuxin Both estimate_prior_variance and estimate_residual_variance were set to FALSE and pip was 0.

zouyuxin commented 4 years ago

@hsowards Could you provide an example that pip = 0 when both estimate_prior_variance and estimate_residual_variance are FALSE?

hsowards commented 4 years ago

@stephens999 @zouyuxin per @gaow 's recommendation we used susie_rss instead of susie_bhat. This function worked fine without errors on two different genomic regions, even with estimate_residual_variance = T. We noticed the susie_rss function had been updated since we ran it and decided to rerun it with the newer version. Running the newer version of susie_rss now produces the original error (Estimating residual variance failed: the estimated value is negative) with only one of the two regions. The function runs after changing estimate_residual_variance = F. Our concern is that for both of our regions the credible sets produced by the original function and the new function are completely different. Is this expected based on the changes that have been made to susie_rss?

zouyuxin commented 4 years ago

@hsowards Could you check whether the LD matrix is positive semi-definite (all eigenvalues are non-negative)? I update susie_rss, it will check LD matrix if you install the newer version again.

It will be very helpful if you can provide some sample data that we can reproduce the problem and easier to find a fix.

hsowards commented 4 years ago

@zouyuxin The full matrix as well as the cleaned LD matrices for our analyses are not positive semi-definite. We're waiting on approval to send sample data and will send that as soon as we hear back. In the meantime is there anyway we can troubleshoot the matrices?

zouyuxin commented 4 years ago

@hsowards How did you compute the LD matrix? Are there any missing genotypes in your reference panel? If there is no missingness in the reference genotype matrix, the LD matrix should be positive semi-definite.