Confidence interval for Cramer's V

easystats / effectsize

:dragon: Compute and work with indices of effect size and standardized parameters

https://easystats.github.io/effectsize/

Other

337 stars 24 forks source link

Confidence interval for Cramer's V #620

Closed Daiki-Nakamura-git closed 11 months ago

Daiki-Nakamura-git commented 11 months ago

Hello, thank you for your continued development of effectsize package. I have a question about how to calculate confidence intervals for Cramer's V.

It is my understanding that there is the following relationship between the population of Cramer's V and the parameter λ of the non-central chi-square distribution. Where df is the degrees of freedom, λ is the non-centrality parameter of the non-central chi-square distribution, r is the number of rows, c is the number of columns, and N is the sample size.

Using this relationship, the 95% confidence interval for Cramer's V should be obtained as follows

library(effectsize)
data("Music_preferences")

res <- chisq.test(Music_preferences, correct = F)
chi2 <- res$statistic[[1]]
df <- res$parameter[[1]]
N <- sum(Music_preferences)
r <- dim(Music_preferences)[1]
c <- dim(Music_preferences)[2]

library(MBESS)
#> Warning: package 'MBESS' was built under R version 4.3.2
lambda.ci <- conf.limits.nc.chisq(Chi.Square = chi2, df = df)
sqrt((df + lambda.ci$Lower.Limit) / (min((r-1), (c-1)) * N)) # upper
#> [1] 0.1945033
sqrt((df + lambda.ci$Upper.Limit) / (min((r-1), (c-1)) * N)) # lower
#> [1] 0.2890337

The results are consistent with other existing packages.

library(confintr)
confintr::ci_cramersv(Music_preferences)
#> 
#>  Two-sided 95% chi-squared confidence interval for the population
#>  Cramer's V
#> 
#> Sample estimate: 0.2402985 
#> Confidence interval:
#>      2.5%     97.5% 
#> 0.1945034 0.2890337

However, when using the effectsize package, the results do not match.

effectsize::cramers_v(Music_preferences, adjust = F, alternative = "two.sided")|> print(digits = 6)
#> Cramer's V |               95% CI
#> ---------------------------------
#> 0.240298   | [0.184943, 0.282689]

Why does this happen? Could you please tell me why?

mattansb commented 11 months ago

I'm not quite sure what you mean by "population of Cramer's V". The formula you gave is the what is often given as the expectation of the sampling distribution of V ($E[V]$): since $E[\chi^2_{non-central}] = df + \lambda$, and $\hat{V} = \sqrt{\frac{\hat{\chi^2}}{N\times min(n-1,r-1)}}$. The denominator being a constant, means that the expectation of the sampling distribution of V is given by the formula you gave, which is not equal to the true V (it is biased, see Bergsma & Wicher, 2013).

Anyway, this formulation seems odd to me, since a similar claim can be made for non-central F distribution as well, where $E[F_{non-central}] = \frac{df_2(df_1+\lambda)}{df_1(df_2-2)}$, yet formulas for converting upper and lower $\lambda$ -s to CI limits do not account for this. That is, using the same logic of the CI for V, the CI for $\eta^2_p$ should be:

$$ \frac{\frac{df_2(df_1+\lambda)}{df_1(df_2-2)}}{\frac{df_2(df_1+\lambda)}{df_1(df_2-2)} + N} $$

instead of $\frac{\lambda}{\lambda + N}$ (including in Smithson, 2003, where the "adjusted" formulation for V's CIs seems to come from).

Maybe @bwiernik has some thoughts on this?

mattansb commented 11 months ago

After some investigating (in the code and documentation of the DescTools::CramerV(), for which Smithson contributed the code for the CIs), it seems like the method you posted is called the "adjusted noncentral chisquare" method, whereas the method used is is the standard "noncentral chisquare" method:

data("Music_preferences", package = "effectsize")

Standard noncentral method:

DescTools::CramerV(Music_preferences, conf.level = 0.95)
#>  Cramer V    lwr.ci    upr.ci 
#> 0.2402985 0.1849434 0.2826865
effectsize::cramers_v(Music_preferences, adjust = F, alt = "two") |> print(digits = 6)
#> Cramer's V |               95% CI
#> ---------------------------------
#> 0.240298   | [0.184943, 0.282689]

"Adjusted" noncentral method:

DescTools::CramerV(Music_preferences, conf.level = 0.95, method = "ncchisqadj")
#>  Cramer V    lwr.ci    upr.ci 
#> 0.2402985 0.1945035 0.2890315
confintr::ci_cramersv(Music_preferences)
#> 
#>  Two-sided 95% chi-squared confidence interval for the population
#>  Cramer's V
#> 
#> Sample estimate: 0.2402985 
#> Confidence interval:
#>      2.5%     97.5% 
#> 0.1945034 0.2890337

^{Created on 2023-11-23 with reprex v2.0.2}

So that seems to answer @Daiki-Nakamura-git 's original question of "However, when using the effectsize package, the results do not match. Why does this happen?"

bwiernik commented 11 months ago

Looks like the adjusted method is correcting for the bias, we should add that as an option like we do other small sample bias corrections

Daiki-Nakamura-git commented 11 months ago

@mattansb @bwiernik Thank you for your kind and detailed reply. I now understand that there are multiple types of methods for calculating confidence intervals for Cramer's V. It is attractive for users to have a choice of different options. Please consider adding an option for the calculation method of confidence intervals.

mattansb commented 11 months ago

@bwiernik we already have a method for adjusting for small sample bias - also note that (1) this method only applies to the CI, and (2) it actually increases the magnitude of the estimate effect, not shrinks it.

This has the result of actually reducing the CIs coverage:

data("Music_preferences", package = "effectsize")

Music_preferences_pop <- (Music_preferences * 12092) |> 
  as.data.frame() |> 
  tidyr::uncount(weights = Freq)

Lets assume this is big enough to be concidered the population:

nrow(Music_preferences_pop)
#> [1] 10000084

(true_V <- effectsize::cramers_v(Music_preferences_pop[[1]], Music_preferences_pop[[2]], ci = NULL)[[1]])
#> [1] 0.2402979

# sample from the population
sample_xtab_from_pop <- function(N) {
  df <- dplyr::slice_sample(Music_preferences_pop, n = N)
  table(df[[1]], df[[2]])
}

# test if true value is in CI
test_ci <- function(bounds) {
  bounds[2] < true_V && true_V < bounds[3]
}

Lets simulate:

N <- 100

set.seed(123)
res_ncp <- replicate(599, {
  sample_xtab_from_pop(N) |> 
    DescTools::CramerV(conf.level = 0.90) |> 
    test_ci()
})

res_adj_ncp <- replicate(599, {
  sample_xtab_from_pop(N) |> 
    DescTools::CramerV(conf.level = 0.90, method = "ncchisqadj") |> 
    test_ci()
})

# Should be 0.9
mean(res_ncp) # 5% more
#> [1] 0.9532554
mean(res_adj_ncp) # 7% less
#> [1] 0.8330551

^{Created on 2023-11-29 with reprex v2.0.2}

I don't see the benefit of this method over the standard NCP method.

Daiki-Nakamura-git commented 11 months ago

I ran a Monte Carlo simulation and both methods show results close to 95%.

standard <- NULL  # empty object
adjusted <- NULL  # empty object
n <- 200   # sample size
r <- 3     # number of rows
c <- 3     # number of columns
prob.vec <- c(2/9,1/18,1/18,
              1/18,2/9,1/18,
              1/18,1/18,2/9)  # Observation probability based on Bergsma (2012, p.327). True value of V is 0.5, 
k <- 10000 # number of simulations

set.seed(123) 
for (i in 1:k) {
  dat <- matrix(rmultinom(n = 1,size = n, prob = prob.vec), 
                nrow = r, ncol = c, byrow = TRUE)  # random numbers following a multinomial distribution
  res <- chisq.test(dat, correct = F) # chi-square test
  chi2 <- res$statistic[[1]] # chi-square value
  df <- res$parameter[[1]]   # degree of freedom
  lambda.ci <- conf.limits.nc.chisq(Chi.Square = chi2, df = df) # λ based on non-central distribution
  standard.lower <- sqrt((lambda.ci$Lower.Limit) / (min(r - 1, c - 1) * n)) # 95% CI by Standard noncentral method
  standard.upper <- sqrt((lambda.ci$Upper.Limit) / (min(r - 1, c - 1) * n)) # 95% CI by Standard noncentral method
  adjusted.lower <- sqrt((df + lambda.ci$Lower.Limit) / (min(r - 1, c - 1) * n)) # 95% CI by "Adjusted" noncentral method
  adjusted.upper <- sqrt((df + lambda.ci$Upper.Limit) / (min(r - 1, c - 1) * n)) # 95% CI by "Adjusted" noncentral method
  standard[i] <- ifelse(standard.lower <= 0.5 && 0.5 <= standard.upper, 1, 0) # If true value is included 1, else 0 
  adjusted[i] <- ifelse(adjusted.lower <= 0.5 && 0.5 <= adjusted.upper, 1, 0) # If true value is included 1, else 0
}

# Percentage of 95% confidence intervals containing true values
mean(standard) # 0.9491
mean(adjusted) # 0.9487

It may be possible that changing the conditions would reveal which method is better.

mattansb commented 11 months ago

At larger V's they have similar coverage because the NCP is much larger than the df. Here is the same example, with V reduced to ~ 0.12:

standard <- NULL  # empty object
adjusted <- NULL  # empty object
n <- 200   # sample size
r <- 3     # number of rows
c <- 3     # number of columns
prob.vec <- c(1/9, 1/18,1/18,
              1/18,1/18,1/18,
              1/18,1/18,1/18)  # Observation probability based on Bergsma (2012, p.327). True value of V is 0.5, 
k <- 599 # number of simulations

(true <- effectsize::cramers_v(matrix(prob.vec, 3), adjust = FALSE, ci = NULL)[[1]])
#> [1] 0.1178511

set.seed(123) 
for (i in 1:k) {
  dat <- matrix(rmultinom(n = 1,size = n, prob = prob.vec), 
                nrow = r, ncol = c, byrow = TRUE)  # random numbers following a multinomial distribution
  res <- chisq.test(dat, correct = F) # chi-square test
  chi2 <- res$statistic[[1]] # chi-square value
  df <- res$parameter[[1]]   # degree of freedom
  lambda.ci <- MBESS::conf.limits.nc.chisq(Chi.Square = chi2, df = df) # λ based on non-central distribution
  standard.lower <- sqrt((lambda.ci$Lower.Limit) / (min(r - 1, c - 1) * n)) # 95% CI by Standard noncentral method
  standard.upper <- sqrt((lambda.ci$Upper.Limit) / (min(r - 1, c - 1) * n)) # 95% CI by Standard noncentral method
  adjusted.lower <- sqrt((df + lambda.ci$Lower.Limit) / (min(r - 1, c - 1) * n)) # 95% CI by "Adjusted" noncentral method
  adjusted.upper <- sqrt((df + lambda.ci$Upper.Limit) / (min(r - 1, c - 1) * n)) # 95% CI by "Adjusted" noncentral method
  standard[i] <- ifelse(standard.lower <= true && true <= standard.upper, 1, 0) # If true value is included 1, else 0 
  adjusted[i] <- ifelse(adjusted.lower <= true && true <= adjusted.upper, 1, 0) # If true value is included 1, else 0
}

# Percentage of 95% confidence intervals containing true values
mean(standard) 
#> [1] 0.9632721
mean(adjusted)
#> [1] 0.8681135

Daiki-Nakamura-git commented 11 months ago

I think your calculation of the true value is incorrect. I modified the code so that the true value of V can be any value.

standard <- NULL  # empty object
adjusted <- NULL  # empty object
n <- 3000   # sample size
r <- 3     # number of rows
c <- 3     # number of columns
v <- 0.1   # setting true value of V. 
prob.vec <- c(1/9 + 2*v/9, 1/9 - v/9, 1/9 - v/9,
              1/9 - v/9, 1/9 + 2*v/9, 1/9 - v/9,
              1/9 - v/9, 1/9 - v/9 ,1/9+2*v/9)  # Observation probability based on Bergsma (2012, p.327).
k <- 1000 # number of simulations

# True V (verification)
prob.mat <- matrix(prob.vec, nrow = r, ncol = c, byrow = TRUE) # probability matrix
prob.expect <- outer(apply(prob.mat, 1, sum), apply(prob.mat, 2, sum)) # expected probability
phi <- sqrt(sum((prob.mat - prob.expect)^2 / prob.expect)) # population effect size phi
rho.v <- sqrt(phi ^ 2 / min(r - 1, c - 1)) # population effect size v
rho.v

set.seed(123) 
for (i in 1:k) {
  dat <- matrix(rmultinom(n = 1,size = n, prob = prob.vec), 
                nrow = r, ncol = c, byrow = TRUE)  # random numbers following a multinomial distribution
  res <- chisq.test(dat, correct = F) # chi-square test
  chi2 <- res$statistic[[1]] # chi-square value
  df <- res$parameter[[1]]   # degree of freedom
  lambda.ci <- MBESS::conf.limits.nc.chisq(Chi.Square = chi2, df = df) # λ based on non-central distribution
  standard.lower <- sqrt((lambda.ci$Lower.Limit) / (min(r - 1, c - 1) * n)) # 95% CI by Standard noncentral method
  standard.upper <- sqrt((lambda.ci$Upper.Limit) / (min(r - 1, c - 1) * n)) # 95% CI by Standard noncentral method
  adjusted.lower <- sqrt((df + lambda.ci$Lower.Limit) / (min(r - 1, c - 1) * n)) # 95% CI by "Adjusted" noncentral method
  adjusted.upper <- sqrt((df + lambda.ci$Upper.Limit) / (min(r - 1, c - 1) * n)) # 95% CI by "Adjusted" noncentral method
  standard[i] <- ifelse(standard.lower <= rho.v && rho.v <= standard.upper, 1, 0) # If true value is included 1, else 0 
  adjusted[i] <- ifelse(adjusted.lower <= rho.v && rho.v <= adjusted.upper, 1, 0) # If true value is included 1, else 0
}

# Percentage of 95% confidence intervals containing true values
mean(standard) 
mean(adjusted)

mattansb commented 11 months ago

Yes, my calculation was off of V was off.

Anyway you've now increased N by 15, which reduces estimation bias. All these biases are largest (proportionally) in smaller samples with smaller effect sizes. So bringing N back to 200:

standard <- NULL  # empty object
adjusted <- NULL  # empty object
n <- 3000   # sample size
r <- 3     # number of rows
c <- 3     # number of columns
v <- 0.1   # setting true value of V. 
prob.vec <- c(1/9 + 2*v/9, 1/9 - v/9, 1/9 - v/9,
              1/9 - v/9, 1/9 + 2*v/9, 1/9 - v/9,
              1/9 - v/9, 1/9 - v/9 ,1/9+2*v/9)  # Observation probability based on Bergsma (2012, p.327).
k <- 1000 # number of simulations

# True V (verification)
prob.mat <- matrix(prob.vec, nrow = r, ncol = c, byrow = TRUE) # probability matrix
prob.expect <- outer(apply(prob.mat, 1, sum), apply(prob.mat, 2, sum)) # expected probability
phi <- sqrt(sum((prob.mat - prob.expect)^2 / prob.expect)) # population effect size phi
rho.v <- sqrt(phi ^ 2 / min(r - 1, c - 1)) # population effect size v
rho.v
#> [1] 0.1

set.seed(123) 
for (i in 1:k) {
  dat <- matrix(rmultinom(n = 1,size = n, prob = prob.vec), 
                nrow = r, ncol = c, byrow = TRUE)  # random numbers following a multinomial distribution
  res <- chisq.test(dat, correct = F) # chi-square test
  chi2 <- res$statistic[[1]] # chi-square value
  df <- res$parameter[[1]]   # degree of freedom
  lambda.ci <- MBESS::conf.limits.nc.chisq(Chi.Square = chi2, df = df) # λ based on non-central distribution
  standard.lower <- sqrt((lambda.ci$Lower.Limit) / (min(r - 1, c - 1) * n)) # 95% CI by Standard noncentral method
  standard.upper <- sqrt((lambda.ci$Upper.Limit) / (min(r - 1, c - 1) * n)) # 95% CI by Standard noncentral method
  adjusted.lower <- sqrt((df + lambda.ci$Lower.Limit) / (min(r - 1, c - 1) * n)) # 95% CI by "Adjusted" noncentral method
  adjusted.upper <- sqrt((df + lambda.ci$Upper.Limit) / (min(r - 1, c - 1) * n)) # 95% CI by "Adjusted" noncentral method
  standard[i] <- ifelse(standard.lower <= rho.v && rho.v <= standard.upper, 1, 0) # If true value is included 1, else 0 
  adjusted[i] <- ifelse(adjusted.lower <= rho.v && rho.v <= adjusted.upper, 1, 0) # If true value is included 1, else 0
}

# Percentage of 95% confidence intervals containing true values
mean(standard, na.rm = TRUE) 
#> [1] 0.9498495
mean(adjusted, na.rm = TRUE)
#> [1] 0.7683049

Differences are not due to number of missing-values:

mean(is.na(standard))
#> [1] 0.003
mean(is.na(adjusted))
#> [1] 0.003

I think we can consider this topic closed - I've shown mathematically it doesn't make sense and with simulations using my own code and yours that it produces bad coverage due to worsening the bias of the CI.

Daiki-Nakamura-git commented 11 months ago

To address the missing values in the estimation of non-central parameters when the effect size is small, I modified the code to correct the missing values to zero. The simulations gave me the same results as you, and I agree with you that the standard non-central method is superior. Thank you very much for your discussion.

standard <- NULL  # empty object
adjusted <- NULL  # empty object
n <- 200   # sample size
r <- 3     # number of rows
c <- 3     # number of columns
v <- 0.1   # setting true value of V. 
prob.vec <- c(1/9 + 2*v/9, 1/9 - v/9, 1/9 - v/9,
              1/9 - v/9, 1/9 + 2*v/9, 1/9 - v/9,
              1/9 - v/9, 1/9 - v/9 ,1/9+2*v/9)  # Observation probability based on Bergsma (2012, p.327).
k <- 10000 # number of simulations

# True V (verification)
prob.mat <- matrix(prob.vec, nrow = r, ncol = c, byrow = TRUE) # probability matrix
prob.expect <- outer(apply(prob.mat, 1, sum), apply(prob.mat, 2, sum)) # expected probability
phi <- sqrt(sum((prob.mat - prob.expect)^2 / prob.expect)) # population effect size phi
rho.v <- sqrt(phi ^ 2 / min(r - 1, c - 1)) # population effect size v
rho.v
#> [1] 0.1

set.seed(123) 
for (i in 1:k) {
  dat <- matrix(rmultinom(n = 1,size = n, prob = prob.vec), 
                nrow = r, ncol = c, byrow = TRUE)  # random numbers following a multinomial distribution
  res <- chisq.test(dat, correct = F) # chi-square test
  chi2 <- res$statistic[[1]] # chi-square value
  df <- res$parameter[[1]]   # degree of freedom
  lambda.ci <- MBESS::conf.limits.nc.chisq(Chi.Square = chi2, df = df) # λ based on non-central distribution
  standard.lower <- sqrt((lambda.ci$Lower.Limit) / (min(r - 1, c - 1) * n)) # 95% CI by Standard noncentral method
  standard.lower <- ifelse(is.na(standard.lower), 0, standard.lower)
  standard.upper <- sqrt((lambda.ci$Upper.Limit) / (min(r - 1, c - 1) * n)) # 95% CI by Standard noncentral method
  standard.upper <- ifelse(is.na(standard.upper), 0, standard.upper)
  adjusted.lower <- sqrt((df + lambda.ci$Lower.Limit) / (min(r - 1, c - 1) * n)) # 95% CI by "Adjusted" noncentral method
  adjusted.lower <- ifelse(is.na(adjusted.lower), 0, adjusted.lower)
  adjusted.upper <- sqrt((df + lambda.ci$Upper.Limit) / (min(r - 1, c - 1) * n)) # 95% CI by "Adjusted" noncentral method
  adjusted.upper <- ifelse(is.na(adjusted.upper), 0, adjusted.upper)
  standard[i] <- ifelse(standard.lower <= rho.v && rho.v <= standard.upper, 1, 0) # If true value is included 1, else 0 
  adjusted[i] <- ifelse(adjusted.lower <= rho.v && rho.v <= adjusted.upper, 1, 0) # If true value is included 1, else 0
}

# Percentage of 95% confidence intervals containing true values
mean(standard) 
#> [1] 0.9472
mean(adjusted)
#> [1] 0.7728

mattansb commented 11 months ago

🙏