imbi-heidelberg / DescrTab2

This package provides functions to create descriptive statistics tables for continuous and categorical variables.
https://imbi-heidelberg.github.io/DescrTab2/
9 stars 7 forks source link

Confidence intervals from farrington.manning #23

Open renlund opened 1 month ago

renlund commented 1 month ago

I'm running DescrTab2 (2.1.16), using the 'farrington.manning' function. I get the same confidence intervals for different deltas, which is not what I expect.

x1 <- rep(c(T,F), c(20,100))
x2 <- rep(c(T,F), c(30,100))
DescrTab2::farrington.manning(group1 = x1, group2 = x2, delta = 0,
                              alternative = "less", alpha = 0.025)$conf.int
# [1] -0.16278681  0.03590955
# attr(,"conf.level")
# [1] 0.95

and I get the same values if I change delta, e.g. to 0.05, 0.1, 0.2.

E.g. SAS will give slightly different confidence intervals depending on the delta parameter - and this seems to be appropriate considering the test. Perhaps this is intended, I just wanted to bring this to your attention.

jan-imbi commented 1 month ago

Hey,

thanks for reporting this. I think I know why this happens, but I am not sure if SAS does it the better way.

In the Farrington-Manning test, the variance for the test statistic is calculated from the maximum likelihood estimates of p_1 and p_2 under the restriction p_1^tilde - p_2^tilde = s_0 (see the original paper). Changing delta (which in their notation is s_0) changes the variance estimate. Since a confidence interval should give the values of the null hypothesis space for which the test does not reject the null hypothesis, I think it makes sense to use these values for calculating the variance estimate to be used in the test statistic.

For your example, that means that for the interval [-0.16278681, 0.03590955], the respective variance estimates were calculated under the respective assumptions s_0 = -0.16278681 and s_0 = 0.03590955. As you noted, this means that whatever you supply for delta is irrelevant for the calculation of the confidence interval.

What I think SAS does is calculate only a single variance estimate under the assumption s_0 = whatever you supplied, and then construct a confidence interval as z + upper_limit / variance = 97.5% normal quantile, z - lower_limit / variance = 2.5% normal quantile.

I amended the code to what I think SAS does, and for the test cases I tried it seems to match up pretty well.

Here is what i think SAS does (changes to the original version are commented):

farrington.manning2 <- function(
  group1,
  group2,
  delta = 0, # note that the null is formulates in terms of -delta!
  alternative = "greater",
  alpha = 0.025
) {
  # remove na's
  if (any(is.na(group1)) | any(is.na(group2))) {
    warning("There are missing values in either of the groups, will be ignored.")
    group1 <- group1[!is.na(group1)]
    group2 <- group2[!is.na(group2)]
  }

  # check for logical input vectors
  if (!is.logical(group1) | !is.logical(group2)) {
    stop("Inputs group1 and group2 must be logical vectors!")
  }

  # check alternative
  if (!(alternative %in% c("two.sided", "less", "greater"))) {
    stop("alternative must be either two.sided, less or greater")
  }

  # check alpha range
  if (0 >= alpha | alpha >= 1) {
    stop("alpha must be in (0, 1)")
  }

  # check delta
  if (delta >= 1) {
    stop("A rate difference of < 1 is not possible.")
  }
  if (delta <= -1) {
    stop("A rate difference of > 1 is not possible.")
  }

  # construct results object
  res <- list()
  class(res) <- "htest"
  res$null.value  <- c(delta)
  names(res$null.value) <- c("rate difference (group 1 - group 2)")
  res$alternative <- alternative
  str <- "Farrington-Manning test for rate differences"
  if (alternative == "greater" & delta < 0) {
    str <- "Non-inferiority test for rates according to Farrington-Manning"
  }
  if (alternative == "greater" & delta >= 0) {
    str <- "Superiorty test for rates according to Farrington-Manning"
  }
  res$method      <- "Farrington-Manning test for non-inferiority of rates"
  res$data.name   <- sprintf("group 1: %s, group 2: %s", deparse(substitute(group1)), deparse(substitute(group2)))
  res$parameters  <- c(delta)
  names(res$parameters) <- c("noninferiority margin")

  # number of samples per group
  n1 <- length(group1)
  n2 <- length(group2)
  res$sample.size <- n1 + n2

  # compute maximum likelihood estimates
  p1_ML <- mean(group1)
  p2_ML <- mean(group2)

  # rate difference
  diff_ML <- p1_ML - p2_ML
  res$estimate <- c(diff_ML)
  names(res$estimate) <- c("rate difference (group 1 - group 2)")

  # standard deviation of the rate difference under the null hypothesis (risk difference = -delta)
  get_sd_diff_ML_null <- function(delta) {
    theta           <- n2/n1
    d               <- -p1_ML*delta*(1 + delta)
    c               <- delta^2 + delta*(2*p1_ML + theta + 1) + p1_ML + theta*p2_ML
    b               <- -(1 + theta + p1_ML + theta*p2_ML + delta*(theta + 2))
    a               <- 1 + theta
    v               <- b^3/(27*a^3) - b*c/(6*a^2) + d/(2*a)
    u               <- sign(v)*sqrt(b^2/(9*a^2) - c/(3*a))
    w               <- (pi + acos(   max(min(1, v/u^3), 0, na.rm = TRUE)  ))/3
    p1_ML_null      <- 2*u*cos(w) - b/(3*a)
    p2_ML_null      <- p1_ML_null - delta
    sd_diff_ML_null <- sqrt(p1_ML_null*(1 - p1_ML_null)/n1 + p2_ML_null*(1 - p2_ML_null)/n2)
    return(sd_diff_ML_null)
  }
  sd_diff_ML_null <- get_sd_diff_ML_null(delta)

  sd_ <- get_sd_diff_ML_null(delta) # This line changed (1/2)

  # test statistic
  get_z <- function(delta) {
    z <- (diff_ML - delta)/sd_ # This line changed (2/2)
    return(z)
  }
  z <- get_z(delta)
  res$statistic <- z
  names(res$statistic) <- "Z-statistic"

  # p-value, probability of Z > < == z
  p_value_greater <- 1 - pnorm(z)
  p_value_less <- pnorm(z)
  p_value_two.sided <- 2*min(p_value_less, p_value_greater)
  if (alternative == "greater") {
    res$p.value <- p_value_greater
  }
  if (alternative == "less") {
    res$p.value <- p_value_less
  }
  if (alternative == "two.sided") {
    res$p.value <- p_value_two.sided
  }

  # confidence interval by inversion of two-sided test
  p_value_two.sided <- function(delta) {
    if (abs(get_sd_diff_ML_null(delta)) < .Machine$double.eps) {
      return(1)
    } else{
      z <- get_z(delta)
      p_value_greater <- 1 - pnorm(z)
      p_value_less <- pnorm(z)
      2 * min(p_value_less, p_value_greater)
    }
  }

  alpha_mod <- ifelse(alternative == "two.sided", alpha, 2*alpha)
  ci_lo <- tryCatch(
    uniroot(
      function(delta) p_value_two.sided(delta) - alpha_mod, interval = c(-1+1e-6, res$estimate), tol = 1e-12
    )$root,
    error = function(cond){
      message("Error converted to warning:", cond)
      NA_real_
    }
  )
  ci_hi <- tryCatch(
    uniroot(
      function(delta) p_value_two.sided(delta) - alpha_mod, interval = c(res$estimate, 1 - 1e-6), tol = 1e-12
    )$root,
    error = function(cond){
      message("Error converted to warning:", cond)
      NA_real_
    }
  )
  # confidence interval
  res$conf.int <- c(ci_lo, ci_hi)
  attr(res$conf.int, "conf.level") <- 1 - alpha_mod

  return(res)
}

Here are the R test cases:

x1 <- rep(c(T,F), c(20,100))
x2 <- rep(c(T,F), c(30,100))

> farrington.manning2(group1 = x1, group2 = x2, delta = -0.05,
+                                                              alternative = "less", alpha = 0.025)$conf.int
[1] -0.16265359  0.03444846
attr(,"conf.level")
[1] 0.95
> farrington.manning2(group1 = x1, group2 = x2, delta = -0.1,
+                                                              alternative = "less", alpha = 0.025)$conf.int
[1] -0.16246026  0.03425513
attr(,"conf.level")
[1] 0.95
> farrington.manning2(group1 = x1, group2 = x2, delta = -0.2,
+                                                              alternative = "less", alpha = 0.025)$conf.int
[1] -0.16312627  0.03492114
attr(,"conf.level")
[1] 0.95

Here are the SAS test cases:

ods html;
proc format;
   value ExpFmt 1='High Cholesterol Diet'
                0='Low Cholesterol Diet';
   value RspFmt 1='Yes'
                0='No';
run;
data FatComp;
   input Exposure Response Count;
   label Response='Heart Disease';
   datalines;
0 0  100
0 1  30
1 0  100
1 1  20
;
proc sort data=FatComp;
   by descending Exposure descending Response;
run;

proc freq data=FatComp order=data;
format Exposure ExpFmt. Response RspFmt.;
tables Exposure*Response / alpha=0.025 RISKDIFF (CL=(FM) NONINF MARGIN=0.05 METHOD=FM);
weight count;
run;

proc freq data=FatComp order=data;
format Exposure ExpFmt. Response RspFmt.;
tables Exposure*Response / alpha=0.025 RISKDIFF (CL=(FM) NONINF MARGIN=0.1 METHOD=FM);
weight count;
run;

proc freq data=FatComp order=data;
format Exposure ExpFmt. Response RspFmt.;
tables Exposure*Response / alpha=0.025 RISKDIFF (CL=(FM) NONINF MARGIN=0.2 METHOD=FM);
weight count;
run;

Hope that helps. Since I'm not yet convinced that the SAS-way is actually better, I'm gonna leave it as it is for now.