im3sanger / dndscv

dN/dS methods to quantify selection in cancer and somatic evolution
GNU General Public License v3.0

Zero values in some columns #90

Closed: FL512 closed this issue 1 year ago

FL512 commented 1 year ago

Hi, thank you for sharing such a useful tool for dNdS analysis. This is really helpful.

I am analyzing my data using dNdScv and encountered zero values in pallsubs_cv, qallsubs_cv, pglobal_cv, and qglobal_cv for the TP53 gene. Intriguingly, the vignette on your website also shows zero values for the TP53 gene: http://htmlpreview.github.io/?http://github.com/im3sanger/dndscv/blob/master/vignettes/dNdScv.html

Could you please let me know why this happens? Is it because the p-values and q-values are too small, so that the computer cannot display them properly on screen or in the summary table?

im3sanger commented 1 year ago

Hello,

Most p-values and q-values in dNdScv are calculated using the chi-square distribution (specifically using the pchisq function in R). This is the case for both the likelihood ratio tests (for pmis, ptrunc and pall) and the Fisher's combined p-values (pglobal). p-values <1e-16 are rounded to 0 when using the pchisq function. Some other p-values in the dNdScv package rely on the negative binomial distribution, using the pnbinom function, which can output values as low as around 1e-300. In general, you can choose to report p-values and q-values of 0 in the dNdScv package as being <1e-16.
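
As a rough illustration of the rounding described above, here is a minimal Python sketch using only the standard library. It assumes the upper-tail probability is obtained as one minus the chi-square CDF, which is an assumption about how the value is computed and not a statement about the dNdScv source. The function names and the test statistic of 100 are hypothetical. For 1 degree of freedom, the chi-square CDF equals erf(sqrt(x/2)), so the upper tail can be written in two ways:

```python
import math
import sys


def upper_tail_naive(x):
    """P(X > x) for a chi-square variable with 1 df, computed as 1 - CDF."""
    cdf = math.erf(math.sqrt(x / 2.0))
    return 1.0 - cdf


def upper_tail_direct(x):
    """The same probability, computed directly with the complementary error function."""
    return math.erfc(math.sqrt(x / 2.0))


# Machine epsilon for double precision: the spacing of floats just above 1.0
# (about 2.2e-16). A CDF closer to 1 than this cannot be told apart from 1.
eps = sys.float_info.epsilon

lr_stat = 100.0  # hypothetical likelihood ratio statistic for a strongly selected gene

naive = upper_tail_naive(lr_stat)    # the CDF rounds to exactly 1.0, so this is 0.0
direct = upper_tail_direct(lr_stat)  # tiny but nonzero

print("machine epsilon:", eps)
print("1 - CDF:        ", naive)
print("direct tail:    ", direct)
```

The true tail probability here is far below machine epsilon, so the subtraction from 1 returns exactly 0 even though the directly computed tail is still representable. This is consistent with reporting such values as <1e-16. In R, `pchisq` accepts `lower.tail = FALSE` to compute the upper tail directly.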

You may also be interested in these two links related to this question: https://stats.stackexchange.com/a/11814 https://stackoverflow.com/q/40144267

Best, Inigo