StoreyLab / qvalue

R package to estimate q-values and false discovery rate quantities.

Error in smooth.spline(lambda, pi0, df = smooth.df) : missing or infinite values in inputs are not allowed #19

Closed · montenegrina closed this issue 1 year ago

montenegrina commented 4 years ago

Hello,

I have p values like this:

 library(qvalue)
 pvals<-c(6.919239e-02,1.073784e-01,1.218613e-01,1.586202e-01,
 1.370340e-01,3.452574e-02,2.545619e-01,1.676715e-02,8.571197e-01,
 8.649025e-01,1.777414e-02,6.801867e-01,6.873085e-01,
 1.276566e-01,5.907002e-02,2.343207e-02,2.078125e-02,6.404511e-02,
 3.306593e-02,9.411259e-04,1.038989e-03,4.734674e-05,
 3.422489e-05,5.606264e-05,6.322817e-02,1.291268e-02,3.452596e-03,
 1.770753e-01,3.285821e-01,2.292055e-02,5.503168e-02,
 6.940048e-01,3.638889e-03,6.799640e-01,1.301045e-02,6.794010e-01,
 2.339327e-02,1.038529e-01,3.137687e-04,2.148050e-02,
 1.783068e-01,1.764518e-01,6.386686e-03,1.670062e-02,3.220291e-01,
 5.568613e-01,8.886102e-01,5.031040e-01,3.760079e-01,
 6.638034e-02,9.419648e-03,5.885266e-03,1.539809e-02,5.296551e-03,
 2.425230e-02,5.023091e-01,4.547284e-03,1.850796e-01,
 9.389289e-02,6.544873e-03,3.031956e-03,7.772671e-03,9.073974e-03,
 9.118352e-02,4.075408e-04,6.902206e-01,6.929767e-02,
 1.897121e-02,6.693074e-02,1.111308e-02,1.286147e-02,4.515834e-02,
 8.886941e-01,8.891051e-01,3.792846e-01,5.368898e-01,
 2.323894e-01,3.220141e-01,7.320883e-02,9.642521e-03,6.024415e-01,
 2.459322e-02,2.873351e-01,8.477168e-01,1.351068e-02,
 1.053550e-01,4.812686e-01,1.404957e-01,9.835912e-02,4.373995e-01,
 8.803856e-02)

when I do:

qval_obj=qvalue(pvals)
Error in smooth.spline(lambda, pi0, df = smooth.df) : 
  missing or infinite values in inputs are not allowed

I installed qvalue via:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("qvalue")

I am using R 3.6.1.

Please assist, Ana

ajbass commented 4 years ago

There are not enough p-values to estimate the proportion of truly null tests. You should fix pi0=1 in the qvalue function:

qout <- qvalue(pvals, pi0 = 1)
montenegrina commented 4 years ago

There are 91 p-values.

How many p-values are enough?

Also, with

qval_obj=qvalue(pvals, pi0=1)
pi1=1-qval_obj$pi0
pi1
[1] 0

I am getting a TPR of 0, which cannot be right.


montenegrina commented 4 years ago

Hi,

I was wondering if there is something I am doing wrong when trying to calculate the true positive rate (TPR) with your function. This is what I am doing:

head(qq)
    chr   pos         gene_id pval_nominal pval_ret      META
1: chr1 54490 ENSG00000227232     0.608495 0.783778 0.7733204
2: chr1 58814 ENSG00000227232     0.295211 0.897582 0.6970567
3: chr1 60351 ENSG00000227232     0.439788 0.867959 0.7525581
4: chr1 61920 ENSG00000227232     0.319528 0.601809 0.4407018
5: chr1 63671 ENSG00000227232     0.237739 0.988039 0.8626555
6: chr1 64931 ENSG00000227232     0.276679 0.907037 0.6971364

library(qvalue)
pvals=qq$META
qval_obj=qvalue(pvals) # false discovery rate
pi1=1-qval_obj$pi0 # TPR

pi1
[1] 0.08827036

It is very unlikely that the TPR would be this small. Can you please advise?

I do have 59981 META p values in this data frame.

Thanks Ana


montenegrina commented 4 years ago

Hi Andrew,

I have 1493313 p values in a data frame:

> head(dt)
     pvals
1 0.0084956486
2 0.0012681537
3 0.0021218873
4 0.0001551133
5 0.0001894240

I am trying to use the qvalue package to calculate the true positive rate (TPR).

I am doing this and getting an error:

library(qvalue)
pvals=dt$pvals
qval_obj=qvalue(pvals) #is false discovery rate

Error in smooth.spline(lambda, pi0, df = smooth.df) :
  missing or infinite values in inputs are not allowed

If this worked, I would calculate the TPR via:

pi1=1-qval_obj$pi0 #TPR

I don't have any NAs or infinite values there, and

> min(dt$pvals)
[1] 3.988883e-156
> max(dt$pvals)
[1] 0.8746981
> sapply(dt,class)
pvals
"numeric"

Can you please help me resolve this issue?

Thanks Ana


montenegrina commented 4 years ago

I did a histogram of my data and it appears I do have too many small p-values.

Is there anything I can do here?
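
A minimal sketch of how to quantify this, reusing dt$pvals from the snippet above (the 0.95 cutoff is the largest value in qvalue's default lambda grid, seq(0.05, 0.95, 0.05)); this is only a diagnostic, not a fix:

library(qvalue)

# largest value of the default lambda grid used when estimating pi0
lambda_max <- 0.95

max(dt$pvals)                 # largest observed p-value
sum(dt$pvals > lambda_max)    # how many p-values exceed the largest lambda
hist(dt$pvals, breaks = 50)   # p-value histogram: spike near 0, nothing near 1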


shackett commented 4 years ago

@montenegrina @ajbass

I've run into this problem a couple of times as well. The cause is that there are no p-values greater than one of the lambda values used to fit the smoothing spline that estimates pi0.

Here is a quick work-around I found:

p_values <- term_data$p.value
q_values <- try(qvalue::qvalue(p_values)$qvalues, silent = TRUE)

if ("try-error" %in% class(q_values)) {
    # if qvalue fails, this is probably because there are no p-values greater
    # than 0.95 (the highest lambda value)
    # if so, add a single p-value of 1 to try to combat the problem
    q_values <- qvalue::qvalue(c(p_values, 1))$qvalues
    # drop the q-value that corresponds to the added p-value of 1
    q_values <- q_values[-length(q_values)]
}
This generally occurs when there are ludicrously small p-values (like p-values from an intercept). Having some default behavior which doesn't require interpreting a cryptic error message and manually setting pi0 would be helpful.
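
A related sketch, assuming qvalue() forwards extra arguments such as lambda to pi0est() as documented: restrict the lambda grid so its largest value stays below the largest observed p-value, which gives the pi0 estimator valid inputs at every grid point.

# restrict the lambda grid to values the observed p-values actually cover
# (a sketch; assumes max(p_values) is comfortably above 0.05 and that at
# least 4 lambda values remain, which pi0est requires for the smoother)
lam_max <- max(p_values) - 0.05
q_alt <- qvalue::qvalue(p_values, lambda = seq(0.05, lam_max, 0.05))$qvalues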

mdmanurung commented 3 years ago

You are a life-saver @shackett. Thank you for your recommendation.

ursadhip commented 2 years ago

Thanks a lot, @shackett for the quick work-around.

I am analyzing results from a program, DeepVirFinder, which recommends converting its output p-values to q-values to control the false discovery rate.

As I am a newbie in R, would you please elaborate on the commented lines in this workaround?

Regards
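
For context, here is a more fully commented version of the workaround above. It is only a sketch: your_results$p.value is a placeholder for however the DeepVirFinder p-values are loaded into R, and the rest follows the try()-based pattern shown earlier in this thread.

library(qvalue)

# placeholder: replace with the p-value column from your DeepVirFinder output
p_values <- your_results$p.value

# first attempt: run qvalue() normally; try() catches the smooth.spline error
# instead of stopping the script
q_values <- try(qvalue(p_values)$qvalues, silent = TRUE)

if ("try-error" %in% class(q_values)) {
  # qvalue() failed, most likely because no p-value is larger than 0.95,
  # the largest lambda value used when estimating pi0 with the smoother;
  # appending a single artificial p-value of 1 gives the estimator an
  # observation above every lambda value, so the fit can proceed
  q_values <- qvalue(c(p_values, 1))$qvalues
  # the last q-value belongs to the artificial p-value of 1, so drop it
  q_values <- q_values[-length(q_values)]
}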