leekgroup / swfdr

Science-wise false discovery rate package

q-values for heavily-weighted tests are often much lower than the original p-values #5

Open fweghorst opened 2 years ago

fweghorst commented 2 years ago

Thanks for producing this much-needed package. I have one concern: the lm_qvalue function often returns q-values for heavily weighted tests that are much lower than the original p-values, sometimes to the point of approaching or achieving significance even when the original p-value was well above 0.50.

I've attached a sample of my data, as well as the R code used to obtain them, to demonstrate the issue. The data were obtained with the DESeq2 package, followed by lm_qvalue using per-gene read counts as weights. Some heavily weighted genes behave as expected (e.g. ENSGALG00010000588 has a low p-value that is only adjusted slightly upward), but others demonstrate the above-mentioned issue (e.g. ENSGALG00010020480 has a p-value of 0.737 that is lowered to 0.0705).

I understand that FDR-control methods that estimate the prior proportion of true null hypotheses (such as swfdr and Storey's q-value) sometimes yield q-values that are lower than the original p-values, but I did not expect the reduction to be as large as what I am seeing with swfdr. Any explanation would be much appreciated!

swfdr.csv

```r
library("tximeta")
library("DESeq2")
library(SummarizedExperiment)

# Import quantifications and run the standard DESeq2 workflow
coldata <- read.csv("/dfs6/pub/fweghors/CREVASSE_ref/swishcoldata.csv")
y <- tximeta(coldata)
y <- summarizeToGene(y)
y$condition <- factor(y$condition, levels = c("VEH", "DEVD"))
y$batch <- factor(y$batch, levels = c(1, 2))
dds <- DESeqDataSet(y, design = ~ condition + batch)
dds <- DESeq(dds)
res <- results(dds, name = "condition_DEVD_vs_VEH")

# Total per-gene read counts across the 10 samples, used as weights
genes <- dim(dds)[1]
counts <- assay(dds)[1:genes, 1:10]
weights <- rowSums(counts)
res <- cbind(res, weights)

# Covariate-adjusted q-values from swfdr
library(swfdr)
pvalues <- res$pvalue
qv <- lm_qvalue(pvalues, X = weights)
padjust <- qv$qvalues
res_adj <- cbind(res, padjust)
```

jtleek commented 2 years ago

Hi!

A few points - though I'm not 100% sure any will totally solve the problem :/

  1. In general, the most common way for q-values to be lower (including much lower) than their respective p-values is when the estimated prior probability that a hypothesis is null is very low. For example, if almost all the p-values from a study are very small, then you will get a \pi_0 near 0 with methods like Storey's, ours, etc., and any large p-values will still get small corresponding q-values.
  2. In the code above, it looks like the weights you are using are based on the total per-gene count (rowSums). Those weights then correspond to different prior probabilities \pi_0 | weight. These prior probabilities are likely very low for some values of the weights, and so you get much smaller q-values than p-values in those cases.
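The \pi_0 effect in point 1 can be reproduced with a minimal base-R sketch. This is not the swfdr implementation (swfdr smooths \pi_0 over the covariate; here \pi_0 is estimated globally at a single lambda = 0.5, in the style of Storey), but it shows how a \pi_0 near 0 drags even a p-value near 1 down to a small q-value:

```r
# Simulated study where ~95% of tests are true signals with tiny p-values
set.seed(1)
p <- c(rbeta(950, 1, 50), runif(50))  # 950 "signal" p-values + 50 uniform nulls
m <- length(p)

# Storey-style pi0 estimate at lambda = 0.5: fraction of p-values above
# lambda, rescaled by the width of the interval (1 - lambda)
lambda <- 0.5
pi0 <- mean(p > lambda) / (1 - lambda)   # close to 0.05 here

# q-values: pi0 times the step-up ratio m * p_(i) / i, made monotone in p
o <- order(p)
qs <- pi0 * p[o] * m / seq_len(m)
qs <- rev(cummin(rev(qs)))   # enforce monotonicity from the largest p down
q <- numeric(m)
q[o] <- qs

# The largest p-value (near 1) gets q approximately pi0 * p, far below p
c(pi0 = pi0, p_max = max(p), q_at_p_max = q[which.max(p)])
```

With \pi_0 around 0.05, a p-value of 0.99 maps to a q-value of roughly 0.05, the same order of reduction reported in the issue.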

What I would do next to diagnose this is plot weight vs p-value and weight vs q-value. I would also plot q-value vs p-value, colored by weight. These diagnostics should help you discover which values of the weights have very low prior probabilities of the null, leading to this behavior.
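A sketch of those diagnostics in base R, using simulated stand-ins for the real columns (the names `pvalue`, `weights`, `padjust` follow the script above; `p.adjust` with BH is only a placeholder for the lm_qvalue output so the block is self-contained):

```r
set.seed(2)
n <- 2000
weights <- rexp(n, rate = 1 / 500)                 # stand-in for per-gene counts
pvalue  <- ifelse(weights > 500,                   # heavier genes enriched for
                  rbeta(n, 1, 20), runif(n))       # small p-values
padjust <- p.adjust(pvalue, method = "BH")         # placeholder for lm_qvalue

# The three suggested diagnostics
par(mfrow = c(1, 3))
plot(weights, pvalue,  log = "x", pch = 20, main = "weight vs p-value")
plot(weights, padjust, log = "x", pch = 20, main = "weight vs q-value")
bin <- cut(weights, quantile(weights, 0:3 / 3), include.lowest = TRUE)
plot(pvalue, padjust, col = as.integer(bin), pch = 20,
     main = "p vs q, colored by weight tercile")
abline(0, 1, lty = 2)   # points far below the diagonal share high weights

# Numeric companion: Storey-style pi0 per weight tercile; the weight range
# driving q << p should show a pi0 near zero
pi0_by_bin <- tapply(pvalue, bin, function(pp) mean(pp > 0.5) / 0.5)
pi0_by_bin
```

In the real data, the last line (computed on the DESeq2 p-values) makes the diagnosis quantitative: if \pi_0 for the highest-weight bin is tiny, that bin is where large p-values get pulled down.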