If you fit a model with only fixed effects, then dream() should give you identical results to running lmFit() followed by eBayes(). 1.24.1 is pretty old, so I might have fixed the issue since then. If you send me a minimal reproducible example, I can see if there is a bug.
Best, Gabriel
Hi Gabriel,
I've shared a small reproducible example below.
As you can see, the P.Value produced by topTable() and the F.p.value in the output of dream() are slightly different. The differences are sometimes much larger on other data frames. Why is this? Have you rectified it in the latest releases?
I also wanted to confirm that fit$p.value is not corrected for multiple testing, right?
Regards, Jay
library(variancePartition)
#> Warning: package 'variancePartition' was built under R version 4.1.3
#> Loading required package: ggplot2
#> Warning: package 'ggplot2' was built under R version 4.1.3
#> Loading required package: limma
#> Warning: package 'limma' was built under R version 4.1.3
#> Loading required package: BiocParallel
data(varPartData)
# Mixed model: fixed effect Batch, random intercepts for Individual and Tissue
form <- ~ Batch + (1|Individual) + (1|Tissue)
fit <- variancePartition::dream(geneExpr[1:100,], formula = form, data = info)
#> Dividing work into 5 chunks...
#>
#> Total:7 s
fit <- variancePartition::eBayes(fit)
tt <- variancePartition::topTable(fit)
#> Removing intercept from test coefficients
# Pull the raw per-coefficient p-values stored inside the fit object
p.val <- as.data.frame(fit$p.value)
p.val$F.p.value <- fit$F.p.value
tt$P.Value
#> [1] 0.006912587 0.010287949 0.026494071 0.028667857 0.041205520 0.047782615
#> [7] 0.068617212 0.084784797 0.086718583 0.099380955
p.val[rownames(tt),'F.p.value']
#> [1] 0.006907668 0.010281684 0.026483626 0.028657013 0.041192802 0.047769136
#> [7] 0.068601995 0.084768727 0.086702432 0.099364372
When dream() is fit with only fixed effects, it produces exactly the same result as limma::lmFit():
library(variancePartition)
data(varPartData)
# Fixed-effects-only model: dream() reduces to the standard limma pipeline
form <- ~ Batch
fit <- dream(geneExpr[1:100,], formula = form, data = info)
fit <- eBayes(fit)
topTable(fit, coef="Batch2", number=3)
#> logFC AveExpr t P.Value adj.P.Val B
#> gene10 1.897490 -2.3673775 2.028912 0.04444815 0.973476 -4.580493
#> gene75 -1.692250 -0.4512581 -1.740355 0.08409242 0.973476 -4.585531
#> gene21 1.546069 -6.3790973 1.677117 0.09585092 0.973476 -4.586541
# The equivalent limma fit on the same design matrix
design <- model.matrix(form, info)
fit2 <- lmFit(geneExpr[1:100,], design)
fit2 <- eBayes(fit2)
topTable(fit2, coef="Batch2", number=3)
#> logFC AveExpr t P.Value adj.P.Val B
#> gene10 1.897490 -2.3673775 2.028912 0.04444815 0.973476 -4.580493
#> gene75 -1.692250 -0.4512581 -1.740355 0.08409242 0.973476 -4.585531
#> gene21 1.546069 -6.3790973 1.677117 0.09585092 0.973476 -4.586541
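A quick sanity check of the equivalence (a minimal sketch, assuming the fit and fit2 objects above; tt_dream and tt_limma are just names for illustration):
# Both tables should contain identical statistics, and the default sorting is
# the same, so the p-value vectors should align row for row
tt_dream <- topTable(fit, coef = "Batch2", number = Inf)
tt_limma <- topTable(fit2, coef = "Batch2", number = Inf)
all.equal(tt_dream$P.Value, tt_limma$P.Value)  # should be TRUE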
When a random effect is included (i.e. (1|Individual)), dream() uses a linear mixed model, which is different. This is the target use case for dream(); the fixed-effects-only case was included for convenience.
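For contrast, a minimal sketch of the mixed-model case on the same data (form_mixed and fit_mixed are hypothetical names; the results are not expected to match any fixed-effects-only limma fit):
# A random intercept per Individual triggers the linear mixed model path
form_mixed <- ~ Batch + (1|Individual)
fit_mixed <- dream(geneExpr[1:100,], formula = form_mixed, data = info)
fit_mixed <- eBayes(fit_mixed)
topTable(fit_mixed, coef = "Batch2", number = 3)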
In your code above, you are reading internal data from the fit object when you use fit$F.p.value and fit$p.value. Avoid this. The only valid way to extract results from these tests is with topTable().
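For example, a minimal sketch reusing the fit object from your reproducible example (the coefficient name comes from the Batch factor in varPartData):
# Extract full results for one coefficient; P.Value is raw, and adj.P.Val
# should match Benjamini-Hochberg adjustment, the topTable() default
res <- variancePartition::topTable(fit, coef = "Batch2", number = Inf)
all.equal(p.adjust(res$P.Value, method = "BH"), res$adj.P.Val)  # should be TRUE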
Gabriel
Alright, thanks!
Hi, thank you for the nice package.
I am running a mixed model similar to the one shown in the variancePartition tutorial. Since the model has multiple variables (Disease, Age and Sex), I am interested in the individual p-values of each variable. For this, I am referring to the p.value matrix in the output of the dream function. My question is: are these adjusted p-values or not?
For all genes, I notice a difference between the P.Value in topTable and the F.p.value from the output of the dream function. What is the reason behind it?
variancePartition version = 1.24.1