GabrielHoffman / variancePartition

Quantify and interpret drivers of variation in multilevel gene expression experiments
http://gabrielhoffman.github.io/variancePartition/

Adjusting R2 #86

Closed MEladawi closed 9 months ago

MEladawi commented 1 year ago

Hello--

Thanks for the great package.

I just had a question: why doesn't the calculation of the variance partition using R2 include adjusting the R2 values?

Thanks! Mahmoud

GabrielHoffman commented 1 year ago

Hi Mahmoud, This is a good question that I hadn't thought much about before.

When sample size is fixed, adjusted R^2 summarizes the total variance explained by all predictors in a way that accounts for the degrees of freedom of the model fit. It is equivalent to estimating the residual variance as RSS / (n-p), a bias-adjusted estimate, instead of the standard MLE RSS / n. Importantly, when sample size increases with p fixed, these estimates converge.
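
For reference, the two quantities can be written side by side. This is a sketch under the usual textbook convention, which is an assumption here: p counts all fitted coefficients including the intercept, TSS is the total sum of squares about the mean, and the total variance in the adjusted version uses the divisor n-1.

```latex
R^2 = 1 - \frac{\mathrm{RSS}/n}{\mathrm{TSS}/n} = 1 - \frac{\mathrm{RSS}}{\mathrm{TSS}},
\qquad
R^2_{\mathrm{adj}} = 1 - \frac{\mathrm{RSS}/(n-p)}{\mathrm{TSS}/(n-1)}
                   = 1 - \left(1 - R^2\right)\frac{n-1}{n-p}
```

With p fixed, the ratio (n-1)/(n-p) tends to 1 as n grows, so the two values agree in the limit.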

Moving beyond the adjusted R^2 of the full model, you are asking about partitioning the adjusted variance instead of the MLE estimate of the variance. Let's note that with increasing sample size, the results will be equivalent. But for small sample size it could make a difference.
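
To get a feel for how much this matters at small n, here is a minimal Python sketch (not code from variancePartition). It assumes the textbook definition of adjusted R^2, with p counting all coefficients including the intercept; the function names and the example values (R^2 = 0.5, p = 5) are made up for illustration.

```python
def r_squared(rss, tss):
    """Plain R^2: residual and total variance both use divisor n, which cancels."""
    return 1.0 - rss / tss


def adjusted_r_squared(rss, tss, n, p):
    """Adjusted R^2 using RSS / (n - p) and TSS / (n - 1).

    Assumption: p counts all fitted coefficients, including the intercept.
    """
    if n <= p:
        raise ValueError("need n > p for the bias-adjusted residual variance")
    return 1.0 - (rss / (n - p)) / (tss / (n - 1))


def gap(r2, n, p):
    """Difference R^2 - adjusted R^2, which simplifies to (1 - R^2)(p - 1)/(n - p)."""
    return (1.0 - r2) * (p - 1) / (n - p)


if __name__ == "__main__":
    # Hypothetical example: RSS / TSS = 0.5 (so R^2 = 0.5) with p = 5 coefficients.
    rss, tss, p = 0.5, 1.0, 5
    for n in (10, 20, 100, 1000, 10000):
        r2 = r_squared(rss, tss)
        r2_adj = adjusted_r_squared(rss, tss, n, p)
        print(f"n={n:>6}  R2={r2:.4f}  adjusted R2={r2_adj:.4f}  gap={r2 - r2_adj:.4f}")
```

The gap shrinks like 1/n for fixed p, so any difference between partitioning the MLE variance and the bias-adjusted variance should only show up in small studies or models with many coefficients.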

This is an interesting idea, and seems doable. I think it is equivalent to replacing the MLE estimate of residual variance with the bias-adjusted estimate. I'll play around with this and see if it is useful.

Best, Gabriel

GabrielHoffman commented 1 year ago

See links for thoughts on adjusted R^2 when returning to this in the future