Closed barrel0luck closed 4 years ago
This is not a real problem. The obtained output is expected by design. As explained, in the details section of the function, for a grouped data, the p-value adjustment is performed for each group level independently.
In your example, there is only one comparaison by label. So, the p-value remains unchanged no matter whether you specify or not the option p.adjust.method
.
In the following example, we have 3 comparisons per supp
levels, so the multiple testing correction is applied.
library(rstatix)
# Data
df <- ToothGrowth
df$dose <- factor(Tdf$dose)
#> Error in factor(Tdf$dose): objet 'Tdf' introuvable
# Test 1
df %>%
group_by(supp) %>%
wilcox_test(len ~ dose)
#> # A tibble: 6 x 10
#> supp .y. group1 group2 n1 n2 statistic p p.adj
#> * <fct> <chr> <chr> <chr> <int> <int> <dbl> <dbl> <dbl>
#> 1 OJ len 0.5 1 10 10 7.5 1.00e-3 3.00e-3
#> 2 OJ len 0.5 2 10 10 0 1.81e-4 5.43e-4
#> 3 OJ len 1 2 10 10 26.5 8.20e-2 8.20e-2
#> 4 VC len 0.5 1 10 10 0 1.80e-4 5.40e-4
#> 5 VC len 0.5 2 10 10 0 1.82e-4 5.40e-4
#> 6 VC len 1 2 10 10 3 4.35e-4 5.40e-4
#> # … with 1 more variable: p.adj.signif <chr>
# Test 2
df %>%
group_by(supp) %>%
wilcox_test(len ~ dose, p.adjust.method = "none")
#> # A tibble: 6 x 10
#> supp .y. group1 group2 n1 n2 statistic p p.adj
#> * <fct> <chr> <chr> <chr> <int> <int> <dbl> <dbl> <dbl>
#> 1 OJ len 0.5 1 10 10 7.5 1.00e-3 1.00e-3
#> 2 OJ len 0.5 2 10 10 0 1.81e-4 1.81e-4
#> 3 OJ len 1 2 10 10 26.5 8.20e-2 8.20e-2
#> 4 VC len 0.5 1 10 10 0 1.80e-4 1.80e-4
#> 5 VC len 0.5 2 10 10 0 1.82e-4 1.82e-4
#> 6 VC len 1 2 10 10 3 4.35e-4 4.35e-4
#> # … with 1 more variable: p.adj.signif <chr>
Created on 2020-02-17 by the reprex package (v0.3.0)
This was also by design to have a fixed output format for pairwise comparisons, where p-values are supposed to be always adjusted.
From my point of view, it makes no sens to combine the p-values of two completely different tests and to finally adjust them together.
So, I would go as follow:
# Required package --------------------
library(ggpubr)
#> Le chargement a nécessité le package : ggplot2
#> Le chargement a nécessité le package : magrittr
library(rstatix) # rstatix version: 0.4.0
#>
#> Attachement du package : 'rstatix'
#> The following object is masked from 'package:stats':
#>
#> filter
# I'm on Fedora 31 with R updated and all packages updated, if that detail is relevant
# Generate dataset ------------------------
set.seed(24)
labels <- c("A", "B", "C", "D", "E", "F", "G")
test_dataset <- data.frame(label = NULL, value = NULL)
means_vector <- c(75, 125, 75, 55, 45, 35, 25)
for (i in 1:length(means_vector)) {
ran_nums <- rnorm(n = 20, mean = means_vector[i], sd = 50)
temp_data <- data.frame(label = labels[i], value = ran_nums)
test_dataset <- rbind(test_dataset, temp_data)
}
# Tests -------------------
test_dataset_pvalues_1 <- test_dataset %>%
group_by(label) %>%
wilcox_test(value ~ 1, mu = 100) %>%
adjust_pvalue(method = "BH")
# Then I need to make some pairwise comparisons
comparisons_list <- list(c("A", "B"), c("A", "D"), c("A", "F"))
test_dataset_pvalues_2 <- test_dataset %>%
wilcox_test(value ~ label, comparisons = comparisons_list, p.adjust.method = "BH")
# Plot the data -------------------------------
myplot <- ggplot(test_dataset, aes(x = label, y = value)) +
geom_boxplot() +
geom_hline(yintercept = 100, linetype = 2, colour = "red")
# 1) Plot + One-sample test result
test_dataset_pvalues_1 <- test_dataset_pvalues_1 %>%
add_xy_position(x = "label") %>%
p_round(p.adj, digits = 2)
myplot +
stat_pvalue_manual(test_dataset_pvalues_1, label = "p.adj",
remove.bracket = TRUE, hjust = 1)
# 2) Plot + Two-sample test result
test_dataset_pvalues_2 <- test_dataset_pvalues_2 %>%
add_xy_position(x = "label")
myplot +
stat_pvalue_manual(test_dataset_pvalues_2, label = "p.adj", tip.length = 0)
Created on 2020-02-18 by the reprex package (v0.3.0)
Ok, this is the murkier part of multiple testing correction for me. I always multiple testing correct within an experiment and use only one kind of test within an experiment, i.e. a t_test (or wilcox_text). As I understand multiple testing correction, if you have a result that is barely significant at alpha = 0.05 then you expect 5% of all your tests performed to be falsely significant. So to overcome this issue, by performing multiple testing correction you're reducing the alpha threshold in effect (in some manner that depends on the numbers of tests performed) by adjusting the p value to make it higher. So, in this box plot if I'm plotting values that are generated to a reference label (not in the data as it's always at 100) then I need to:
The x positions calculated seem to be incorrect when making custom comparisons and not starting with the first label.
# Required package --------------------
library(ggpubr)
library(rstatix)
# Generate dataset ------------------------
set.seed(24)
labels <- c("A", "B", "C", "D", "E", "F", "G")
test_dataset <- data.frame(label = NULL, value = NULL)
means_vector <- c(75, 125, 75, 55, 45, 35, 25)
for (i in 1:length(means_vector)) {
ran_nums <- rnorm(n = 20, mean = means_vector[i], sd = 50)
temp_data <- data.frame(label = labels[i], value = ran_nums)
test_dataset <- rbind(test_dataset, temp_data)
}
comparisons_list <- list(c("D", "B"), c("C", "D"), c("B", "C"))
test_dataset_pvalues <- test_dataset %>%
wilcox_test(value ~ label, comparisons = comparisons_list) %>%
add_xy_position(x = "label")
ggplot(test_dataset, aes(x = label, y = value)) +
geom_boxplot() +
geom_hline(yintercept = 100, linetype = 2, colour = "red") +
stat_pvalue_manual(test_dataset_pvalues, label = "p.adj")
So here, even if the comparisons are D vs B, C vs D and B vs C the brackets are drawn from A onwards. Of note, stat_compare_means perfectly manages to add the brackets to the correct position and I could possibly use that, but I don't think it corrects for multiple testing. Is that correct?
Thank you for reporting this issue, I will fix it as soon as possible
fixed now, thanks!
Thanks for the fix! Also, can stat_compare_means perform multiple testing?
stat_compare_means if from the ggpubr package. It doesn't handle multiple testing correction. I'll work on this; Related issue: https://github.com/kassambara/ggpubr/issues/119
Hi again, I'm trying to use your package for stuff I normally do in R. And so as I face difficulties, I'm posting here. If I'm doing something obviously wrong apologies again. I've three issues this time:
Required package --------------------
library(ggpubr) library(rstatix) # rstatix version: 0.4.0
I'm on Fedora 31 with R updated and all packages updated, if that detail is relevant
Generate dataset ------------------------
set.seed(24) labels <- c("A", "B", "C", "D", "E", "F", "G")
test_dataset <- data.frame(label = NULL, value = NULL) means_vector <- c(75, 125, 75, 55, 45, 35, 25) for (i in 1:length(means_vector)) { ran_nums <- rnorm(n = 20, mean = means_vector[i], sd = 50) temp_data <- data.frame(label = labels[i], value = ran_nums) test_dataset <- rbind(test_dataset, temp_data) }
Perform the test ------------------------
PROBLEM 1
These two give the same results, eventhough by default p.adjust.method = "holm"
(When I press F1 on wilcox_test this is what is shown regarding p.adjust.method:
wilcox_test(data, formula, comparisons = NULL, ref.group = NULL,
p.adjust.method = "holm", paired = FALSE, exact = NULL,
alternative = "two.sided", mu = 0, conf.level = 0.95,
detailed = FALSE)
)
test_dataset_pvalues_1 <- test_dataset %>% group_by(label) %>% wilcox_test(value ~ 1, mu = 100)
test_dataset_pvalues_2 <- test_dataset %>% group_by(label) %>% wilcox_test(value ~ 1, mu = 100, p.adjust.method = "none")
test_dataset_pvalues_1 == test_dataset_pvalues_2 # shows both are identical
These two give same results
test_dataset_pvalues_1 <- test_dataset %>% group_by(label) %>% t_test(value ~ 1, mu = 100)
test_dataset_pvalues_2 <- test_dataset %>% group_by(label) %>% t_test(value ~ 1, mu = 100, p.adjust.method = "none")
test_dataset_pvalues_1 == test_dataset_pvalues_2 # shows both are identical
These two give same results
test_dataset_pvalues_1 <- test_dataset %>% group_by(label) %>% t_test(value ~ 1, mu = 100, p.adjust.method = "holm")
test_dataset_pvalues_2 <- test_dataset %>% group_by(label) %>% t_test(value ~ 1, mu = 100, p.adjust.method = "none")
test_dataset_pvalues_1 == test_dataset_pvalues_2 # shows both are identical
These two give same results
test_dataset_pvalues_1 <- test_dataset %>% group_by(label) %>% t_test(value ~ 1, mu = 100, p.adjust.method = "holm")
test_dataset_pvalues_2 <- test_dataset %>% group_by(label) %>% t_test(value ~ 1, mu = 100, p.adjust.method = "BH")
test_dataset_pvalues_1 == test_dataset_pvalues_2 # shows both are identical
So basically the p.adjust.method doesn't seem to be functioning
PROBLEM 2
In pairwise_t_test when asked for not adjusting p values, this setting is respected, but p.adj columns are returned anyway (with non adjusted pvalues)
If this is intentional behaviour - that's okay (although it could be misleading) - but this behaviour seems to be different from what you have going for the t_test function where these extra p.adj columns are not given out
test_dataset_pvalues_1 <- test_dataset %>%
group_by(label) %>% # Why does uncommenting this line give an error?
pairwise_t_test(value~label)
test_dataset_pvalues_2 <- test_dataset %>%
group_by(label) %>% # Why does uncommenting this line give an error?
pairwise_t_test(value~label, p.adjust.method = "holm")
test_dataset_pvalues_1 == test_dataset_pvalues_2 # shows both are identical AS IT SHOULD IN THIS CASE
However, when setting p.adjust.method = "none" although it works, why are the p.adj and p.adj.signif columns returned?
These columns have the same p values as the non adjusted one - so the correction has not taken place as asked, but the columns are returned anyway when they should not be right?
test_dataset_pvalues_3 <- test_dataset %>%
group_by(label) %>% # Why does uncommenting this line give an error?
pairwise_t_test(value~label, p.adjust.method = "none")
PROBLEM 3
How do I combine results of tests for performing multiple testing correction and plotting in same plot?
For example, in my random dataset generated here, I need to perform first comparisons with a mu = 100
test_dataset_pvalues_1 <- test_dataset %>% group_by(label) %>% wilcox_test(value ~ 1, mu = 100)
Then I need to make some pairwise comparisons
comparisons_list <- list(c("A", "B"), c("A", "D"), c("A", "F")) test_dataset_pvalues_2 <- test_dataset %>%
group_by(label) %>%
wilcox_test(value ~ label, comparisons = comparisons_list, p.adjust.method = "none")
Or I need to make all pairwise comparisons
test_dataset_pvalues_2 <- test_dataset %>%
group_by(label) %>%
pairwise_wilcox_test(value ~ label, p.adjust.method = "none")
So now I need to report both test_dataset_pvalues_1 and 2 in my plot
But first I need to combine them and perform multiple testing correction across all tests
Currently I can only perform multiple testing correction within the set of tests performed, but often you need to perform multiple testing correction across all tests performed right?
There's no easy way to combine test_dataset_pvalues_1 and 2 because of two reasons:
The extra p.adj columns even with multiple testing is not performed
And the n1 n2 columns issue which would be absent in the tests performed with mu = 100 (table 1)
So how would you recommend this complex testing paradigm be achieved in a pipe?
Having one common table will also aid their addition to the plot
Plot the data -------------------------------
myplot <- ggplot(test_dataset, aes(x = label, y = value)) + geom_boxplot() + geom_hline(yintercept = 100, linetype = 2, colour = "red")
Add p-values on the plot -----------------------
Right now I've just added to two tables without pvalue adjustment for this example.
I'd like to combine the two tables, adjust pvalues and then plot them together.
test_dataset_pvalues_1 <- test_dataset_pvalues_1 %>% add_xy_position(x = "label")
myplot + stat_pvalue_manual(test_dataset_pvalues_1, label = "p", remove.bracket = TRUE, hjust = 1) + stat_pvalue_manual(test_dataset_pvalues_2, label = "p", remove.bracket = FALSE, y.position = 200) # How to auto get y.position here?