jessieren / DeepVirFinder

Identifying viruses from metagenomic data by deep learning

q-value calculation issues and anti-conservative p-value distribution #5

Open nikhilg123 opened 5 years ago

nikhilg123 commented 5 years ago

Hi,

When I imported the DeepVirFinder predictions and p-values into R to estimate the FDR with the qvalue package, the package failed with the following error:

"Error in pi0est(p, ...) : ERROR: The estimated pi0 <= 0. Check that you have valid p-values or use a different range of lambda."

I looked through similar issues posted here and on the qvalue package's GitHub page. My concern is with DeepVirFinder's p-values for all my metagenomes: none of them span the full range from 0 to 1. Instead they run from 0 to about 0.98, and the p-value distributions look quite anti-conservative (my samples have not been enriched for viruses). I suspect this is why the qvalue package is failing, since some of its assumptions may be violated. The truncated p-value range did not occur with the VirFinder p-values, but I have had other challenges with VirFinder, so I have moved to DeepVirFinder. I have attached p-value histograms for the same metagenome, from the DeepVirFinder and VirFinder predictions, respectively.

pvalue_hist_DeepVirFinder.pdf pvalue_hist_Virfinder_pdf.pdf
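
For anyone who wants to check this on their own data, here is a minimal sketch of the raw pi0(lambda) estimates that, as far as I understand it, underlie the qvalue package's pi0 estimation: the fraction of p-values at or above lambda, divided by (1 - lambda). It is written in plain Python rather than R so it runs standalone, and it is not the qvalue implementation. The function name, the lambda grid of 0.05 to 0.95, and the toy p-values are all assumptions for illustration.

```python
def pi0_by_lambda(pvalues, lambdas=None):
    """Raw Storey-type estimates: pi0(lambda) = #{p >= lambda} / (m * (1 - lambda)).

    Hypothetical helper for illustration; not part of qvalue or DeepVirFinder.
    """
    if lambdas is None:
        # Grid 0.05, 0.10, ..., 0.95 (assumed here, not read from the package).
        lambdas = [round(0.05 * k, 2) for k in range(1, 20)]
    m = len(pvalues)
    if m == 0:
        raise ValueError("need at least one p-value")
    estimates = {}
    for lam in lambdas:
        # Count the p-values in the right tail, at or above lambda.
        tail_count = sum(1 for p in pvalues if p >= lam)
        estimates[lam] = tail_count / (m * (1.0 - lam))
    return estimates


# Toy, made-up p-values piled up near 0 with a thin right tail.
toy_pvalues = [0.001] * 80 + [0.2, 0.4, 0.6, 0.8] * 5
for lam, est in pi0_by_lambda(toy_pvalues).items():
    print(f"lambda={lam:.2f}  pi0_hat={est:.3f}")
```

If the estimates fall toward zero as lambda approaches 1, very few p-values sit in the right tail, which would be consistent with the pi0 <= 0 error, though I have not confirmed that this is the cause here.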

Please let me know if you have any suggestions - your time and input are greatly appreciated.

Best, Nikhil

nikhilg123 commented 5 years ago

In the interim, I am proceeding on the assumption that the p-values are fine, since the distributions don't look oddly truncated. I am also using the pi0.meth="bootstrap" argument with the qvalue package, which is supposedly more conservative, and it is working fine.

Best, Nikhil