Closed jmichuda closed 5 months ago
Was this resolved? I’m traveling until Monday, so I likely won’t be able to answer until then
Hi Gabriel, Thanks for your response. I haven't quite figured it out yet. Essentially what I'm trying to do is something analogous to the following workflow in base limma/voom:
# create design matrix with no linear combinations
form <- ~ dose_factor + day_factor + day_factor:dose_factor
design <- model.matrix(form, diffex_meta)
linCombos <- caret::findLinearCombos(design)
design <- design[, -unique(unlist(linCombos$linearCombos))]
fit <- voomLmFit(gExpr, design)
fit <- eBayes(fit)
I'm able to remove columns with linear combinations from the model matrix prior to fitting the model. Unfortunately it doesn't seem like variancePartition::dream()
supports a similar syntax of inputting a design matrix.
Let's consider the design matrix from your data:
dose = c("Control", "Control", "D1", "D1", "D2", "D2")
timePoint = c("Control", "Control", "Day1", "Day2", "Day1", "Day2")
df = data.frame(dose, timePoint)
model.matrix(~ dose + timePoint, df)
#> (Intercept) doseD1 doseD2 timePointDay1 timePointDay2
#> 1 1 0 0 0 0
#> 2 1 0 0 0 0
#> 3 1 1 0 1 0
#> 4 1 1 0 0 1
#> 5 1 0 1 1 0
#> 6 1 0 1 0 1
This is not full rank, so these parameters can't be estimated together in the model.
In my experience, manually dropping columns from the design matrix is problematic because the formula is no longer interpretable after you drop columns. Instead of addressing this at an engineering level, we can take this result to mean that the regression model is overly complex and estimates too many parameters for the data we have. It means that the effect of dose and time can't be decoupled in this study design.
Instead, you should recode your meta-data matrix and create a design matrix that is full rank.
Basically I'd like to build a formula of
~ dose:timepoint + (1 | subject) + (1 | plate)
The control samples have no dose or timepoint information, which triggers the following error when running
dream()
:I'm not interested in modeling the fixed effect for the control samples, I just want them to be used to help estimate the random effect of plate. If variancePartition::dream() took a design matrix as input, we could avoid this by removing repetitive columns. But it seems like it only supports formula inputs. Any advice on how to get around this issue? Ideally would like to avoid 2-part models.
I've also tried modifying the lmer control to just drop columns and only pass in the contrasts that I care about (contrasts made using
makeContrastsDream
) but it appears this functionality doesn't exist:And also tried using an indicator value in the formula:
But then lme4 fails to identify the fixed effect in the model.
The core issue is that variancePartition's dream module creates the full contrast matrix here, which ultimately relies on
lme4::no_bars()
to parse the formula and the metadata for the fixed effects model matrix. I could modify the .getFixefNames function to exclude fixed effects from the model matrix that are colinear or duplicates of other variables.