Closed victor-moreno closed 3 years ago
Hi, I discovered today that the vcd package, which is used to calculate log odds ratios and from them the OR, gives wrong results when there is a zero in the 2x2 table. The OR should be zero or Inf, but the result is some number close to zero or very large, not the correct answer.
# 2x2 table with one zero cell (A = 1, B = 1)
df <- data.frame(
  A = factor(c(0, 0, 1, 1)),
  B = factor(c(0, 1, 0, 1)),
  count = c(34, 76, 12, 0)
)
# Expand the counts to one row per observation
data <- df[rep(seq_len(nrow(df)), df$count), 1:2]
jmv::contTables(data = data, formula = ~ A:B, logOdds = TRUE, odds = TRUE)
 ──────────────────────────────
   A        0     1     Total
 ──────────────────────────────
   0        34    76      110
   1        12     0       12
   Total    46    76      122
 ──────────────────────────────

                     Value          Lower           Upper
 ────────────────────────────────────────────────────────────
   Log odds ratio    -4.015207      -6.870342       -1.160072
   Odds ratio         0.01803922     0.001038122     0.3134635
 ────────────────────────────────────────────────────────────
I propose changing the code to use the fisher.test function, which doesn't require any additional library and provides an exact confidence interval. From that we can calculate the log-OR.
If you agree, I can prepare the changes and use this PR for them.
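A minimal sketch of what the fisher.test approach could look like for the table in this report. It uses only base R; the variable names are illustrative and this is not the actual jmv code. Note that fisher.test reports the conditional maximum likelihood estimate of the OR rather than the sample cross-product ratio, so for tables without zeros the value can differ slightly from ad/bc.

```r
# Same counts as in the example: rows are A (0, 1), columns are B (0, 1)
tab <- matrix(c(34, 12, 76, 0), nrow = 2,
              dimnames = list(A = c("0", "1"), B = c("0", "1")))

ft <- fisher.test(tab)

# Conditional MLE of the odds ratio and its exact confidence interval
or    <- unname(ft$estimate)
or_ci <- as.numeric(ft$conf.int)

# Log-OR and its interval follow by taking logs; log(0) is -Inf in R
log_or    <- log(or)
log_or_ci <- log(or_ci)

print(c(OR = or, lower = or_ci[1], upper = or_ci[2]))
print(c(logOR = log_or, lower = log_or_ci[1], upper = log_or_ci[2]))
```

With the zero cell, the point estimate and the lower limit are exactly 0 (so the log-OR and its lower limit are -Inf), while the upper limit stays finite.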
So it looks like vcd applies a continuity correction when there are zero cells:
correct: logical or numeric. Should a continuity correction be applied before computing odds? If TRUE, 0.5 is added to all cells; if numeric (or an array conforming to the data) that value is added to all cells. By default, this not employed unless there are any zero cells in the table, but this correction is often recommended to reduce bias when some frequencies are small (Fleiss, 1981).
Wouldn't this be an adequate solution to the problem of zero count cells?
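To make the effect of the correction concrete, here is a base R sketch that adds 0.5 to every cell and then computes the log-OR with a Wald interval. It does not call vcd and is not a copy of its implementation; it only assumes the usual standard error formula for the log odds ratio. Under that assumption it reproduces the values in the output above.

```r
# Counts from the example table: rows are A (0, 1), columns are B (0, 1)
tab <- matrix(c(34, 12, 76, 0), nrow = 2)

# Add 0.5 to all cells because the table contains a zero
adj <- tab + 0.5

# Log odds ratio and its standard error on the corrected table
log_or <- log((adj[1, 1] * adj[2, 2]) / (adj[1, 2] * adj[2, 1]))
se     <- sqrt(sum(1 / adj))

# 95% Wald confidence interval on the log scale
ci <- log_or + c(-1, 1) * qnorm(0.975) * se

print(round(c(logOR = log_or, lower = ci[1], upper = ci[2]), 6))
print(exp(c(OR = log_or, lower = ci[1], upper = ci[2])))
```

So the reported -4.015207 is the log-OR of the table after the correction, not of the observed counts.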
I have been reading about this, and the best answer is not an easy decision. If you need a standard error of the OR or log-OR for meta-analysis, then adding 0.5 is required. However, if you only need the point estimate and 95% CI, the exact method can be used. This paper recommends it whenever possible: https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/j.2041-210x.2012.00250.x
I suppose we can leave the code as it is. Perhaps a footnote could be added when there is a zero in the table to warn that the correction was applied.
I have added the footnote. The text is 'Haldane-Anscombe correction applied'.
I found the reference to the original suggestion of adding 0.5 in the next-to-last paragraph here: https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/j.2041-210x.2012.00250.x I don't think adding the reference is needed. If you search for "Haldane-Anscombe correction" on Google, this reference comes up in second position.
Also, I have added a condition to avoid computing the estimators for a 2x2 table when the table is in fact 2x1 or 1x2 because a row or column is all zeros. Because 0.5 is added to each cell, without this check you get an artificial result for the OR and log-OR even though the chi-square test is NaN.
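A rough sketch of the kind of check described here, assuming the table is available as a 2x2 numeric matrix. The helper names has_empty_margin and safe_log_or are hypothetical and are not the names used in the PR.

```r
# Hypothetical helper (not the actual jmv code): TRUE when a 2x2 table
# has a row or column that sums to zero, i.e. it is effectively 2x1 or 1x2
has_empty_margin <- function(tab) {
  any(rowSums(tab) == 0) || any(colSums(tab) == 0)
}

# Hypothetical helper: corrected log-OR, or NA for a degenerate table
safe_log_or <- function(tab, correction = 0.5) {
  if (has_empty_margin(tab)) return(NA_real_)
  # Apply the 0.5 correction only when some cell is zero
  if (any(tab == 0)) tab <- tab + correction
  log((tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1]))
}

degenerate <- matrix(c(34, 12, 0, 0), nrow = 2)   # second column all zeros
one_zero   <- matrix(c(34, 12, 76, 0), nrow = 2)  # table from this thread

print(safe_log_or(degenerate))  # NA: no estimate is reported
print(safe_log_or(one_zero))    # finite value after adding 0.5 to each cell
```

Checking the margins before applying the correction matters, because once 0.5 has been added an empty row or column can no longer be detected from the cell values.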
Done
This contains only the Mantel-Haenszel trend test for conttables. Victor