Error calculating Q - Githubissues

swvanderlaan commented 6 years ago

Hi,

I'm using your package, but I get the error below. The smallest p-value is e-80. How can I correct this error?

Thanks!

RESULTS$Q = qvalue(RESULTS$Nominal_P)$qvalues
Error in pi0est(p, ...) : 
  ERROR: maximum p-value is smaller than lambda range. Change the range of lambda or use qvalue_truncp() for truncated p-values.

ajbass commented 6 years ago

What’s the maximum p value? This usually happens when you have p values not between 0 and 1 (for example, your p values could be between 0 and .5)

swvanderlaan commented 6 years ago

Oh. That could easily be the case. I'm doing an expression quantitative trait loci (eQTL) analysis of many genes - and for some genes the results could be highly significant, such that most p-values are approaching 0, and few if any are larger than 0.5.

Is this error similar in origin as the following?

Error in smooth.spline(lambda, pi0, df = smooth.df) :   missing or infinite values in inputs are not allowed

What is you advice? Perhaps in such cases, I should not apply this particular form of FDR correction as results are highly significant?

Thanks!

ajbass commented 6 years ago

It's hard to give useful advice without analyzing the data but the q-value procedure only works for well behaved p-values (see http://varianceexplained.org/statistics/interpreting-pvalue-histogram/) .... If a majority of your p-values are highly significant, maybe the best strategy is to fix pi0=1 (this will apply the BH procedure and is conservative):

q.out <- qvalue(pvalues, pi0 = 1)

Here's a demo:

pvalues <- runif(1000, min = 0.0001, max = 0.01)
q.out <- qvalue(pvalues, pi0 = 1)

Cheers

swvanderlaan commented 6 years ago

I understand. Thanks for the advice!

Arthfael commented 6 years ago

This issue seems to occur even if all P values are valid numbers between 0 and 1. I read on another error that it seems to occur for "truncated" P-value distributions. Is qvalue making some assumptions about P-value distribution? Can it be applied to any P values vector regardless of which test generated them (e.g. Bayesian t-test, permutations test, hypergeometric test rather than the classic Student's t-test)?

ajbass commented 6 years ago

Can it be applied to any P values vector regardless of which test generated them: Yes. The link above explains in more detail about how your p-value histogram should look for qvalue to work.

Can you show a reproducible example?

Arthfael commented 6 years ago

Hi, so I was scrutinizing the faulty P values vector, and I may have an explanation. Most of my P values are 0. The deeper reason is I thought I was mapping proteins to KEGG pathway terms, but I was actually mapping to KEGG terms which cover just closely related proteins - like all proteoforms of a particular enzyme. Since this is a proteomics experiment, and I am dealing with protein groups (which cover essentially the same thing - related proteins) and basing my counts on all proteins in each group, not just the leading protein, for most terms the count in my observed proteome was the same or very close to that for my parent proteome: as soon as I identify peptides for a protein linked to a term, all related protein IDs are listed in the protein group, hence I capture all instances of the term in the parent database. Need to fix this code...

HenrikEckermann commented 5 years ago

Hi @ajbass,

I tried what you recommended above to get the Hochberg procedure. Just to doublecheck I also used p.adjust. I assumed that

qvalue(pvalues, pi0 = 1)$qvalue
p.adjust(pvalues, method = "hochberg")

yield the same results but they differ. Do you know why and whether one method is more reliable than the other?

ajbass commented 5 years ago

To use BH procedure in p.adjust, you need to set method = "BH":

library(qvalue)
pvals <- runif(1000)
qvals <- qvalue(pvals, pi0 = 1)
qvals2 <- p.adjust(pvals, method = "BH")
all.equal(qvals$qvalues, qvals2)

StoreyLab / qvalue

Error calculating Q #11