Open alapo opened 2 years ago
Hi. I had a similar problem using the function rstatix::cor_test
to correlate gene expression of one gene of interest with gene expression of all other genes of an expression matrix. However some p-values are round to 0 .
Is there an option to retain the original very small p-value?
Thanks for your reply!
Best regards
# example:
library(ALL)
data(ALL)
exp.all <- exprs(ALL) %>% t() %>% as.data.frame()
# which genes correlate significantly with one gene of interest?
library(rstatix)
cortestx <- exp.all %>%
cor_test(
vars = colnames(exp.all)[1],
vars2 = colnames(exp.all)[2:ncol(exp.all)],
method = "spearman") %>%
adjust_pvalue(., method = "BH") %>%
add_significance()
# one bug of cor_test: some p-values are exactly 0
cortestx <- cortestx[order(cortestx$p.adj, decreasing = F),]
cortestx[1:5,]
I also have this problem. The rstatix::add_xy_position()
function, which can be used to automatically position p values on a ggplot when used with ggpubr::stat_pvalue_manual()
, depends on rstatix::get_summary_stats()
. When the y values are small, incorrect rounding results in positioning the p value at y = 0. This problem was posted on stackoverflow, where the advice from the community there was to work around the issue by multiplying the y values by a large value before using rstatix
, and then dividing them to convert them back to their original scale.
Here is the MRE posted at stackoverflow to illustrate the problem here.
# import libraries
library(tidyverse)
library(rstatix)
library(ggpubr)
# minimal reproducible data example
data <- tibble::tribble(
~Group, ~Response,
"Treatment", 3.210486e-06,
"Control", 4.006825e-06,
"Treatment", 4.350836e-06,
"Control", 4.216934e-06,
"Treatment", 4.415194e-06,
"Control", 1.0260606e-05,
"Treatment", 1.111064e-06,
"Control", 1.0779088e-05,
"Treatment", 3.57185e-07,
"Control", 1.139097e-06,
"Treatment", 0,
"Control", 2.31074e-07,
"Treatment", 5.78956e-07,
"Control", 4.371157e-06,
"Treatment", 6.5825e-08,
"Control", 9.587202e-06,
"Treatment", 2.65383e-07,
"Control", 3.57337e-06,
"Treatment", 7.14146e-07,
"Control", 3.868605e-06,
"Treatment", 1.2213951e-05,
"Control", 6.936899e-06,
"Treatment", 4.71707e-07,
"Control", 5.5957173e-05,
"Treatment", 1.265942e-06,
"Control", 8.5563441e-05,
"Treatment", 0,
"Control", 0,
"Treatment", 0,
"Control", 2.1306289e-05,
"Treatment", 5.2055e-07,
"Control", 1.8420094e-05
)
# performing Wilcoxon test and adding 'y.position' for the p value
wilcoxon_result <- data %>%
wilcox_test(Response ~ Group) %>%
add_xy_position("Group")
# the 'y.position' is incorrectly defined as zero as a result of rounding error.
wilcoxon_result
#> # A tibble: 1 × 11
#> .y. group1 group2 n1 n2 statistic p y.position groups xmin
#> <chr> <chr> <chr> <int> <int> <dbl> <dbl> <dbl> <name> <dbl>
#> 1 Response Control Treatm… 16 16 206. 0.00326 0 <chr> 1
#> # … with 1 more variable: xmax <dbl>
# the p value and bracket geom are plotted at zero due to the rounding issue.
ggplot(data, aes(Group, Response)) +
stat_summary(geom = "errorbar", fun.data = mean_cl_boot, width = 0.25) +
stat_summary(geom = "crossbar", fun = mean, fun.min = mean, fun.max = mean, fatten = FALSE, width = 0.5) +
stat_pvalue_manual(wilcoxon_result, "P = {p}", size = 18 / ggplot2::.pt, tip.length = 0, bracket.size = 2 / ggplot2::.stroke) +
theme_pubr(base_size = 18, x.text.angle = 45)
I have been successfully using rstatix to create demographics tables for some time now. Today I ran into an issue when using small numbers as inputs. The function
get_summary_stats
will round the output and give all zeros. Below is a reprex, is there a way to force the function not to round the output?