Closed aidarripoll closed 1 year ago
Since we're not interested in the contribution of the Batch variable, which is why we control for it in the DEA model, I guess we could first regress out the Batch effect from the gene expression and then perform the variance partition analysis on the residuals using the model ~Sex+Age. However, some of the residuals could be negative, and I'm not sure whether dreamlet::fitVarPart() can handle that.
Thanks again, Aida
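To make the residual question concrete, here is a minimal base R sketch on simulated data (all variable names and effect sizes are made up for illustration, not taken from the actual dataset). It regresses Batch out of a single gene with lm() and inspects the residuals. Because the model includes an intercept, the residuals are centered at zero, so negative values are expected by construction.

```r
# Simulated example: one gene, 40 samples, 4 batches (hypothetical values)
set.seed(1)
n <- 40
batch <- factor(rep(paste0("B", 1:4), each = 10))
sex <- factor(rep(c("F", "M"), times = 20))
age <- runif(n, 20, 80)

# Expression with a batch shift plus small Sex and Age effects
batch_shift <- c(0, 1.5, -1, 0.5)[as.integer(batch)]
expr <- 8 + batch_shift + 0.3 * (sex == "M") + 0.01 * age + rnorm(n, sd = 0.5)

# Regress out Batch only, keep the residuals
fit <- lm(expr ~ batch)
res <- residuals(fit)

range(res)  # spans negative and positive values
mean(res)   # numerically zero, since the model has an intercept
```

The residuals are no longer on the original expression scale, which is worth keeping in mind before passing them to any function that expects normalized expression values.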
1) Use ~(1|Sex)+Age+(1|Batch). variancePartition works best when categorical variables are modeled as random effects. It's not an issue that this formula isn't identical to the differential expression formula.
2) You could do that, but you'd have to use variancePartition::fitExtractVarPartModel() directly.
But using (1) is easier, and that is the workflow I designed dreamlet for.
Gabriel
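As a rough illustration of option (1), the sketch below calls variancePartition::fitExtractVarPartModel() on simulated data with all categorical variables modeled as random effects. The expression matrix, the metadata table, and every effect size are hypothetical; this assumes the variancePartition package is installed and is not the dreamlet workflow itself.

```r
# Illustrative sketch on simulated data; object names are hypothetical
library(variancePartition)

set.seed(1)
n_samples <- 60
info <- data.frame(
  Sex   = factor(sample(c("F", "M"), n_samples, replace = TRUE)),
  Age   = runif(n_samples, 20, 80),
  Batch = factor(sample(paste0("B", 1:6), n_samples, replace = TRUE)),
  row.names = paste0("s", seq_len(n_samples))
)

# 20 simulated genes (rows) x samples (columns), with a shared batch shift
batch_shift <- rnorm(nlevels(info$Batch))[as.integer(info$Batch)]
expr <- t(sapply(1:20, function(i) {
  5 + batch_shift + 0.01 * info$Age + rnorm(n_samples)
}))
dimnames(expr) <- list(paste0("gene", 1:20), rownames(info))

# Categorical variables as random effects, Age as a fixed effect
form <- ~ (1 | Sex) + Age + (1 | Batch)
vp <- fitExtractVarPartModel(expr, form, info)
head(vp)
```

Within dreamlet, the same formula would instead be passed to dreamlet::fitVarPart() together with the processed pseudobulk object.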
Dear all,
I understand that dreamlet::fitVarPart() internally uses the variancePartition package, which uses linear and linear mixed models to quantify the contribution of multiple sources of expression variation at the gene level. When calling dreamlet::fitVarPart() with a mixed-model formula like ~Sex + Age + (1|Batch), where Sex and Batch are categorical and Age is continuous, I encountered the following error:
_"Error in run_model_check_mixed(fit, showWarnings, dream, colinearityCutoff, : Categorical variables modeled as fixed effect: Gender Must model either all or no categorical variables as random effects here"_
From this, I understand that the function can only handle formulas in which all categorical variables are treated either as fixed or as random effects. Indeed, the following formulas no longer give an error: ~Sex+Age+Batch or ~(1|Sex)+Age+(1|Batch).
Here you can find the cross-table showing the number of cells per Batch (columns, which are the sequencing dates) and Sex (rows):
In the differential expression analysis, we're treating Batch as a random effect and Sex as a fixed effect: we're only interested in the Sex and Age coefficients/effects, but we still want to control for Batch. Hence, I'm wondering which option would be better (or correct) for the variance partition analysis, so that it is as similar as possible to the differential expression analysis set-up:
~Sex+Age+Batch --> all categorical variables as fixed factors
~(1|Sex)+Age+(1|Batch) --> all categorical variables as random factors
~Sex+Age --> in this case, the Batch contribution would be part of the residuals
Thanks a lot! Aida