stephenslab / susieR

R package for "sum of single effects" regression.
https://stephenslab.github.io/susieR

Heteroskedasticity? #198

Open igrabski opened 10 months ago

igrabski commented 10 months ago

Is it possible to fit heteroskedastic models with SuSiE, i.e., to allow different (assumed fixed and known) residual variances for different observations? Thanks!

pcarbo commented 10 months ago

@igrabski I'm not quite sure if this is what you are looking for, but in the SuSiE RSS paper we describe approaches for fitting models from summary data, including the situation in which you have a vector b and corresponding standard errors s. These summary-data methods are implemented in the functions susie_suff_stat and susie_rss. This might work for your situation. If you have questions about these methods, feel free to ask.

igrabski commented 10 months ago

Thanks for the fast response! If I understand correctly, in that setting, b represents previous estimates of the coefficients (with corresponding standard errors s), and they are used to approximate the sufficient statistics within a homoskedastic model where all the observations have the same, possibly unknown, residual variance. In my case, I have the original individual observations y and (let's assume) I know the residual variance for each observation. In the homoskedastic case, I know that we can just specify the residual variance in the susie call instead of estimating it as part of the fitting procedure, but is it possible to input a vector of residual variances? (Or is it possible to repurpose the summary-data methods to do this?) Thanks again!

pcarbo commented 10 months ago

@igrabski Here's one suggestion: If you could somehow generate summary statistics from your data, then you could potentially apply susie_rss (for example) to the summary statistics. For example, there may be a simple method for association testing that works with your data, and then the outputs from that method can be fed into susie_rss.

stephens999 commented 10 months ago

Assuming the residual variance is known seems potentially dangerous (at least I think you would want to avoid understating it...). However, if you assume the residual variance for individual i is sigma^2/w_i^2 (for known weights w_i, with sigma^2 to be estimated), then simple algebra suggests running susie on transformed data, Y <- WY and X <- WX. Here W is the diagonal matrix whose (i, i)th element is w_i.
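
To spell out the algebra behind this suggestion, here is a short sketch. The symbols $b$ (coefficient vector), $x_i$ (the i-th row of X) and $e_i$ (residual) are notation introduced here for illustration, and the intercept is ignored for simplicity.

$$
y_i = x_i^\top b + e_i, \qquad e_i \sim N\!\left(0,\ \sigma^2 / w_i^2\right).
$$

Multiplying both sides by the known weight $w_i$ gives

$$
w_i y_i = (w_i x_i)^\top b + w_i e_i, \qquad w_i e_i \sim N\!\left(0,\ \sigma^2\right),
$$

or, in matrix form with $W = \mathrm{diag}(w_1, \dots, w_n)$,

$$
WY = (WX)\, b + We, \qquad We \sim N\!\left(0,\ \sigma^2 I_n\right).
$$

So the transformed data follow a model with a single residual variance $\sigma^2$ and the same coefficients $b$, which is the setting susie assumes.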

[This is analogous to weighted least squares: https://en.wikipedia.org/wiki/Weighted_least_squares]
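
A minimal R sketch of this transformation, under several assumptions that are not from the thread: the data are simulated, all variable names are hypothetical, and the intercept is handled by a weighted centering step followed by `intercept = FALSE`. The reason for that last choice is that after multiplying by W the intercept column becomes w rather than a column of ones, so susie's default (unweighted) centering would not remove it exactly. Column standardization is left at susie's default here; whether that is appropriate for weighted data is a judgment call.

```r
# Sketch only: simulated data and hypothetical variable names.
set.seed(1)
n <- 200
p <- 50
X <- matrix(rnorm(n * p), n, p)
b <- rep(0, p)
b[c(3, 17)] <- c(1, -1)

# Known weights w_i, so that Var(e_i) = sigma^2 / w_i^2 (here sigma = 1).
w <- runif(n, 0.5, 2)
y <- drop(X %*% b) + rnorm(n, sd = 1 / w)

# Weighted centering (weights w_i^2) removes the intercept, which after
# row scaling would be the non-constant column w, not a column of ones.
w2 <- w^2
y_c <- y - sum(w2 * y) / sum(w2)
X_c <- sweep(X, 2, colSums(w2 * X) / sum(w2))

# Row scaling: same result as W %*% y and W %*% X with W = diag(w).
y_t <- w * y_c
X_t <- w * X_c

# Fit SuSiE on the transformed data if the package is available.
if (requireNamespace("susieR", quietly = TRUE)) {
  fit <- susieR::susie(X_t, y_t, L = 10, intercept = FALSE)
  print(fit$sets$cs)  # credible sets
  print(head(sort(fit$pip, decreasing = TRUE)))  # largest PIPs
}
```

With the residual variance estimated by susie (the default), the estimate corresponds to sigma^2 in the weighted model, so the per-observation variances are sigma^2 / w_i^2 rather than fixed in advance.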