Closed MattCloward closed 3 years ago
Hi @MattCloward: Thank you for your question.
This rounding has to do with the representation of numbers in R. See ?.Machine
, particularly .Machine$double.xmin
.
Now, gwasrapidd does provide you access to these p-values if you inspect the mantissa and the exponent columns, see example below. In these cases with such low p-values, you should perhaps ask yourself if you shouldn't floor the number to zero anyways... p-values of that order are, generally, hardly meaningful: for a comparison the estimated number of atoms in the universe is of the order of 10^78 to 10^82 (https://www.universetoday.com/36302/atoms-in-the-universe/amp/), so the probability of picking one specific atom at random is 10^{-82}...
If you really want to deal with all these p-values, even those that are astronomically low, in R, then you can perhaps work with the log of the p-value instead, which lies in a range amenable to R's representation of numbers.
Otherwise, the only alternative, is to use a dedicated package that provides arbitrary-precision numbers, e.g., Rmpfr.
library(gwasrapidd)
library(dplyr, include.only = 'mutate')
foo <- get_associations(study_id = "GCST001884")
foo@associations[c('pvalue', 'pvalue_mantissa', 'pvalue_exponent')] %>%
mutate(log_pvalue = log(pvalue_mantissa) + pvalue_exponent)
#> # A tibble: 19 x 4
#> pvalue pvalue_mantissa pvalue_exponent log_pvalue
#> <dbl> <int> <int> <dbl>
#> 1 7.00e-16 7 -16 -14.1
#> 2 9.00e-16 9 -16 -13.8
#> 3 3.00e-11 3 -11 -9.90
#> 4 7.00e-11 7 -11 -9.05
#> 5 2.00e- 8 2 -8 -7.31
#> 6 2.00e-11 2 -11 -10.3
#> 7 2.00e-11 2 -11 -10.3
#> 8 9.00e-11 9 -11 -8.80
#> 9 5.00e- 9 5 -9 -7.39
#> 10 2.00e- 8 2 -8 -7.31
#> 11 0. 4 -540 -539.
#> 12 0. 1 -434 -434
#> 13 4.00e-89 4 -89 -87.6
#> 14 1.00e-41 1 -41 -41
#> 15 2.00e-26 2 -26 -25.3
#> 16 2.00e-20 2 -20 -19.3
#> 17 3.00e-15 3 -15 -13.9
#> 18 4.00e-13 4 -13 -11.6
#> 19 3.00e-11 3 -11 -9.90
Thank you very much for this detailed explanation, including work arounds and especially reasons not to do any work arounds! The p values I have been looking at are waaay more than astronomically low (4x10^22 stars in the universe) https://robertkaplinsky.com/work/stars-in-the-universe/
Thanks for your help!
A few of the studies in the GWAS catalog have p values that are smaller than 1e-300 and can even be as small as 1e-500 or 1e-600. When I use the "get_associations" function, these p values are rounded to 0.
Ex: for the study "GCST001884", rs10737680 and rs10490924 have p values of 1x10-434 and 4x10-89 respectively. (Source: https://www.ebi.ac.uk/gwas/studies/GCST001884)
However, when I run the following command:
get_associations(study_id = "GCST001884")@associations[["pvalue"]]
These two SNPs have p values of 0e+00.Is there anything that can be done to fix this?