Open NicoleEO opened 2 years ago
Issue #174 is related to this. User receives the "object 'Group' not found" where Group is fixed effect column
message. Could we consider adding in a "no random effects present" error message?
If there is still interest in adding support for non-mixed model analyses, I have created a modified version of mixedModelDE to fit simple linear models for a single-slide analysis I was working on: https://github.com/lardenoije/GeomxTools/blob/master/R/modelDE.R Perhaps it can be of use.
Hi @lardenoije, this could be very useful for me since I'm following this vignette: https://bioconductor.org/packages/devel/workflows/vignettes/GeoMxWorkflows/inst/doc/GeomxTools_RNA-NGS_Analysis.html#7_Differential_Expression
However I only have a single GeoMx slide to analyze so the mixed-effect model part is throwing an error. Would you be able to show how your modelDE.R function would substitute into this section of the vignette?
Within Slide Analysis: Glomeruli vs Tubules
One informative exploration is to study differences between morphological structures. In this example, we can study differential expression between glomeruli and tubules. We will focus on the diseased kidney tissues. Because we are comparing structures that co-exist within the a given tissue we will use the LMM model with a random slope. Morphological structure (Region) is our test variable. We control for tissue subsampling with slide name using a random slope and intercept; the intercept adjusts for the multiple regions placed per unique tissue, since we have one tissue per slide. If multiple tissues are placed per slide, we would change the intercept variable to the unique tissue name (ex: tissue name, Block ID, etc).
In this analysis we save log2 fold change estimates and P-values across all levels in the factor of interest. We also apply a Benjamini-Hochberg multiple test correction.
convert test variables to factors pData(target_demoData)$testRegion <- factor(pData(target_demoData)$region, c("glomerulus", "tubule")) pData(target_demoData)[["slide"]] <- factor(pData(target_demoData)[["slide name"]]) assayDataElement(object = target_demoData, elt = "log_q") <- assayDataApply(target_demoData, 2, FUN = log, base = 2, elt = "q_norm")
run LMM: formula follows conventions defined by the lme4 package results <- c() for(status in c("DKD", "normal")) { ind <- pData(target_demoData)$class == status mixedOutmc <- mixedModelDE(target_demoData[, ind], elt = "log_q", modelFormula = ~ testRegion + (1 + testRegion | slide), groupVar = "testRegion", nCores = parallel::detectCores(), multiCore = FALSE)
# format results as data.frame
r_test <- do.call(rbind, mixedOutmc["lsmeans", ])
tests <- rownames(r_test)
r_test <- as.data.frame(r_test)
r_test$Contrast <- tests
# use lapply in case you have multiple levels of your test factor to
# correctly associate gene name with it's row in the results table
r_test$Gene <-
unlist(lapply(colnames(mixedOutmc),
rep, nrow(mixedOutmc["lsmeans", ][[1]])))
r_test$Subset <- status
r_test$FDR <- p.adjust(r_test$`Pr(>|t|)`, method = "fdr")
r_test <- r_test[, c("Gene", "Subset", "Contrast", "Estimate",
"Pr(>|t|)", "FDR")]
results <- rbind(results, r_test)
} Note that the example uses nCores = parallel:detectCores() and multiCore = FALSE to implement the parallel package clustering of all available cores. If working in a Windows environment use multicore = FALSE. If in a UNIX-based environment setting multicore = TRUE will parallelize using the mcapply package. If you do not want to use all available cores, change the nCores variable to the desired number to use.