stephenslab / susieR

R package for "sum of single effects" regression.
https://stephenslab.github.io/susieR
Other
176 stars 46 forks source link

Estimated residual variance is negative #188

Open karenfeng opened 1 year ago

karenfeng commented 1 year ago

When running Susie RSS with the following command:

fitted_rss <- susie_rss(
   bhat = eqtl$beta,
   shat = eqtl$se,
   n = num_samples,
   R = cor_matrix,
   estimate_residual_variance = TRUE)

I encounter the following issue:

Error in susie_suff_stat(XtX = XtX, Xty = Xty, n = n, yty = (n - 1) *  :
  Estimating residual variance failed: the estimated value is negative

I am using an in-sample LD matrix generated with cor(), so it is symmetric positive semi-definite (as discussed in https://github.com/stephenslab/susieR/issues/91).

The issue goes away if I change estimate_prior_method from the default (optim) to EM or simple.

Could you clarify how to resolve this problem? I also saw the suggestion that estimate_residual_variance should always be FALSE as in https://github.com/stephenslab/susieR/issues/162#issuecomment-1152681354, but I'm not sure if this is a universal suggestion.

karenfeng commented 1 year ago

Here is a full repro, with simplified input files:

library(data.table)
library(susieR)

set.seed(1)
raw_file <- "raw.txt"
raw_matrix <- fread(raw_file, sep=" ", header=FALSE)
num_samples <- nrow(raw_matrix)
cor_matrix <- cor(raw_matrix)

eqtl_file <- "eqtl.txt"
eqtl <- fread(eqtl_file, sep=" ", header=TRUE)

fitted_rss <- susie_rss(
   bhat = eqtl$beta,
   shat = eqtl$se,
   n = num_samples,
   R = cor_matrix,
   estimate_residual_variance = TRUE,
   verbose = TRUE)

eqtl.txt raw.txt

pcarbo commented 1 year ago

@karenfeng Since you have the full (individual-level) data, you could also try running susie with estimate_residual_variance = TRUE? (It looks like you didn't not provide it here so I cannot run it.) Then you could compare against susie_rss with estimate_residual_variance = FALSE. You might find that the results are similar. In this example, with estimate_residual_variance = FALSE, running susie_rss identified 1 CS. Hope that helps.

pcarbo commented 1 year ago

Note I made a couple small changes to your example above to make it reproducible.

pcarbo commented 1 year ago

@karenfeng I can also confirm that setting estimate_prior_method = "EM" gets rid of the error, although I don't have a good explanation for this, because the error occurs with estimating the residual variance, not the prior variance. If you find that this settings works better for you, please go ahead and use it!