Closed spiralparagon closed 7 months ago
Thanks for the post. Can you please provide a minimal reproducible example calculating the pvalue?
Can you also link to a manuscript using the test in the way you're suggesting?
Hi Daniel! Thanks for replying!
I have the r-bloggers page as an example: https://www.r-bloggers.com/2014/06/example-2014-6-comparing-medians-and-the-wilcoxon-rank-sum-test/
But this question is based on a false premise: that the the Wilcoxon rank-sum test is used to compare medians. The premise is based on a misunderstanding of the null hypothesis of the test. The actual null hypothesis is that there is a 50% probability that a random value from one population exceeds an random value from the other population. The practical value of this is hard to see, and thus in many places, including textbooks, the null hypothesis is presented as “the two populations have equal medians”. The actual null hypothesis can be expressed as the latter median hypothesis, but only under the additional assumption that the shapes of the distributions are identical in each group.
In other words, our interpretation of the test as comparing the medians of the distributions requires the location-shift-only alternative to be the case. Since this is rarely true, and never assessed, we should probably use extreme caution in using, and especially in interpreting, the Wilcoxon rank-sum test.
I've paraphrased the code on to the following:
#Import libraries
library(tidyverse)
library(gtsummary)
library(quantreg)
#Create a reproducible dataset
set.seed(123)
y1 = as_tibble(rexp(10000))
y2 = as_tibble(rnorm(10000) + log(2))
y1$group = "Y1"
y2$group = "Y2"
df = rbind(y1,y2)
#Run wilcox.test manually
wilcox.test(value ~ group, data=df)
#Show using gtsummary
df %>%
tbl_summary(
by = group,
statistic = list(all_continuous() ~ "{median} ({p25}, {p75})",
all_categorical() ~ "{n} / {N} ({p}%)"),
digits = list(all_continuous() ~ c(1,1)),
) %>%
bold_labels() %>%
add_p(pvalue_fun = ~style_pvalue(.x, digits = 2))
#Run a quantile regression, for the median (tau=0.5)
qr <- rq(value ~ group, data=df, tau = 0.5)
qr
#Calculate standard errors using bootstrapping with 100 replications, and respective p-values
set.seed(123)
summary(qr, se="boot", R=100)
Maybe I'm misunderstanding the syntax for add_p? For example, when I attempt to replace wilcoxons test with kolmorov-smirnov, which is called by "ks.test" with identical syntax, the p-value is omitted in the output
df %>%
tbl_summary(
by = group,
statistic = list(all_continuous() ~ "{median} ({p25}, {p75})",
all_categorical() ~ "{n} / {N} ({p}%)"),
digits = list(all_continuous() ~ c(1,1)),
) %>%
bold_labels() %>%
add_p(pvalue_fun = ~style_pvalue(.x, digits = 2),
test = list(all_continuous() ~ "ks.test"))
Similarily, I don't see where you could add parameters to the test-call for example, which would be needed in the case for quantreg (specifying the bootstrap estimator and number of replications)
Thanks @spiralparagon for the additional context!
Frist, regarding adding a quantile regression test: unfortunately, I don't think we will add it natively in the package as it depends on the {quantreg} package and I don't want to take on more dependencies than we already have. But that doesn't mean you can't include these results in your gtsummary tables. You can include any p-value by creating a custom function: https://www.danieldsjoberg.com/gtsummary/reference/tests.html#custom-functions-1
here's an example:
library(gtsummary)
#> #BlackLivesMatter
add_p_median_reg <- function(data, variable, by, ...) {
# build regression model
quantreg::rq(
reformulate(variable, response = by),
data = data |>
tidyr::drop_na(all_of(c(variable, by))) |>
dplyr::mutate("{by}" := factor(.data[[by]]) |> as.numeric()),
tau = 0.5
) |>
# use summary to get p-value
summary(se = "boot") |>
# format results like you'd see from `broom::tidy()`
purrr::pluck(coefficients) |>
as.data.frame() |>
tail(n = 1) |>
setNames(c("estimate", "std.error", "statistic", "p.value")) |>
dplyr::mutate(method = "Median Regression")
}
add_p_median_reg(mtcars, variable = "mpg", by = "am")
#> estimate std.error statistic p.value method
#> mpg 0.05649718 0.01887302 2.993542 0.005478198 Median Regression
add_p_median_reg(trial, variable = "age", by = "trt")
#> estimate std.error statistic p.value method
#> age 0 0.007789238 0 1 Median Regression
tbl <-
trial |>
tbl_summary(
by = trt,
missing = "no",
include = c(age, marker)
) |>
add_p(test = all_continuous() ~ add_p_median_reg)
Created on 2024-04-06 with reprex v2.1.0
Second, we have not yet included support for the ks.test
(no one has requested it before. Since it's included in the base R, we can include it. To keep things clean, if you'd like, can you request support in a separate issue?
Hi Daniel!
Thank you so much for the help. I understand not wanting to increase dependencies and I appreciate your thorough example on how to implement this despite the fact.
Strangely, I don't see the expected behaviour with my previous code sample - the output of add_p_median_reg is correct on all accounts except for the p-value where it gives a 0.
library(tidyverse)
library(gtsummary)
library(quantreg)
#Create a reproducible dataset
set.seed(123)
y1 = as_tibble(rexp(10000))
y2 = as_tibble(rnorm(10000) + log(2))
y1$group = "Y1"
y2$group = "Y2"
df = rbind(y1,y2)
qr <- rq(value ~ group, data=df, tau = 0.5)
set.seed(123)
summary(qr, se="boot", R=100)
add_p_median_reg <- function(data, variable, by, ...) {
# build regression model
quantreg::rq(
reformulate(variable, response = by),
data = data |>
tidyr::drop_na(all_of(c(variable, by))) |>
dplyr::mutate("{by}" := factor(.data[[by]]) |> as.numeric()),
tau = 0.5
) |>
# use summary to get p-value, added 200 reps
summary(se = "boot", R=200) |>
# format results like you'd see from `broom::tidy()`
purrr::pluck(coefficients) |>
as.data.frame() |>
tail(n = 1) |>
setNames(c("estimate", "std.error", "statistic", "p.value")) |>
dplyr::mutate(method = "Median Regression")
}
set.seed(123)
add_p_median_reg(df, variable = "value", by = "group")
I can't see what's wrong in the code - even running a subset of the function directly gives expected result
set.seed(123)
summary(qr, se = "boot", R=200) |>
# format results like you'd see from `broom::tidy()`
purrr::pluck(coefficients) |>
as.data.frame() |>
tail(n = 1) |>
setNames(c("estimate", "std.error", "statistic", "p.value"))
I'm comparing the output of age to trt in the trial dataset
qr <- rq(age ~ trt, data=trial, tau = 0.5)
set.seed(123)
summary(qr, se="boot", R=100)
vs the function output
I'll create a seperate thread for ks.test. Thank you again!
After exploring this a bit looking at the model, the problem seems to be in reformulate not giving expected behaviour to the rq function.
Attempted to use paste instead, aided with the magic of chatgpt (which also had an issue with the factor to numeric conversion, not sure how accurate it is). Caveat emptor as I'm not experienced :(
add_p_median_reg <- function(data, variable, by, ...) {
# build regression model
data <- data |>
tidyr::drop_na(all_of(c(variable, by))) |>
dplyr::mutate(!!sym(by) := factor(.data[[by]]))
# Construct the formula directly
formula_str <- paste(variable, "~", by)
formula <- as.formula(formula_str)
# Build regression model using the constructed formula
quantreg::rq(
formula,
data = data,
tau = 0.5
) |>
# use summary to get p-value
summary(se = "boot", R=200) |>
# format results like you'd see from `broom::tidy()`
purrr::pluck(coefficients) |>
as.data.frame() |>
tail(n = 1) |>
setNames(c("estimate", "std.error", "statistic", "p.value")) |>
dplyr::mutate(method = "Median Regression")
}
seems to give expected behaviour
Ahh, i had flipped the dependent and independent variables in the reformulate 😱 glad you have a solution
Is your feature request related to a problem? Please describe. Quantile regression is a better solution to test if a quantile (such as the median) is different between two groups, vs the default Wilcoxon-rank test (which assumes equal distributions). the quantreg package has the rq command which syntactically looks identical to wilcoxon.rank, but for some reason add_p with ~ rq, and variations thereof doesn't seem to work. I can't seem to find anyone that has implemented it successfully code-wise.
Describe the solution you'd like A way to use quantile regression for p-values for statistics testing the median. I'd like to be able to specify the estimator for standard errors and n. replications. Usually the rq model is fed into a summary such as summary(model, se="boot", R=NN) to get bootstrapped SEs and p-values.