Open fweghorst opened 2 years ago
Hi!
A few points - though I'm not 100% sure any will totally solve the problem :/
What I would do next to diagnose is plot weight vs p-value and weight vs q-value. I would also plot q-value vs p-value, but colored by weight. These diagnostics should help you discover which values of the weights have very low prior probabilities of the null, leading to this behavior.
Thanks for producing this much-needed package. I have one concern: The lm_qvalue function often lowers the p-values of heavy-weighted tests, sometimes to the point of approaching or achieving significance, even when the original p-value was well above 0.50.
I've attached a sample of my data, as well as the R code use to obtain them, to demonstrate the issue. The data were obtained with the DESeq2 package, followed by lm_qvalue using per-gene read counts as weights. Some heavy-weighted genes behave as expected (e.g. ENSGALG00010000588 has a low p-value that is only adjusted slightly upward), but some heavy-weighted genes demonstrate the above-mentioned issue (e.g. ENSGALG00010020480 has a p-value of 0.737 that is lowered to 0.0705).
I understand that methods of FDR control that estimate a prior number of refuted null hypotheses (such as swfdr and Benjamini-Yekutieli) sometimes result in q-values that are lower than original p-values, but I did not expect the reduction to be as large as what I am seeing with swfdr. Any explanation would be much appreciated!
swfdr.csv
coldata <- read.csv("/dfs6/pub/fweghors/CREVASSE_ref/swishcoldata.csv")
library("tximeta")
library("DESeq2")
library(SummarizedExperiment)
y <- tximeta(coldata)
y <- summarizeToGene(y)
y$condition <- factor(y$condition, levels=c("VEH","DEVD"))
y$batch <- factor(y$batch, levels=c(1, 2))
dds <- DESeqDataSet(y, design = ~ condition + batch)
dds <- DESeq(dds)
res <- results(dds, name="condition_DEVD_vs_VEH")
genes <- dim(dds)[1]
counts <- assay(dds)[1:genes,1:10]
weights <- rowSums(counts)
res <- cbind(res, weights)
pvalues <- res$pvalue
library(swfdr)
qv <- lm_qvalue(pvalues, X=weights)
padjust <- qv$qvalues
res_adj <- cbind(res,padjust)