Bug in ordered multinomial models with missing categories

dmphillippo commented 10 months ago

In ordered multinomial models where studies do not observe all categories, data points could be assigned the wrong category in a couple of scenarios:

Where a study does not observe the first category; absolute estimates in this study would be wrong
Where a study does not observe a middle category; relative and absolute estimates in this study would be wrong

For example, consider the HTA psoriasis data with categories removed from some studies:

Code

```r library(multinma) library(dplyr) library(tidyr) library(ggplot2) options(mc.cores = parallel::detectCores()) # Remove some categories hta_dat <- hta_psoriasis %>% mutate(PASI50 = ifelse(studyc == "Elewski", NA, PASI50), PASI75 = ifelse(studyc == "Leonardi", NA, PASI75)) hta_net <- set_agd_arm(hta_dat, study = paste(studyc, year), trt = trtc, r = multi(r0 = sample_size - rowSums(cbind(PASI50, PASI75, PASI90), na.rm = TRUE), PASI50, PASI75, PASI90, inclusive = FALSE, type = "ordered")) hta_net hta_fit_FE <- nma(hta_net, trt_effects = "fixed", link = "probit", prior_intercept = normal(scale = 100), prior_trt = normal(scale = 10), prior_aux = flat()) hta_fit_FE predict(hta_fit_FE, baseline = distr(qnorm, mean = -1.097, sd = 123^-0.5), type = "response") # Plot predictions against observed data hta_pred_FE <- predict(hta_fit_FE, type = "response") hta_pred_mod <- hta_pred_FE %>% as_tibble() %>% mutate(method = "NMA", outcomef = recode_factor(gsub("^pred\\[.+: (.+), PASI(50|75|90)\\]$", "\\2", parameter), "50" = "PASI 50", "75" = "PASI 75", "90" = "PASI 90"), .trt = factor(gsub("^pred\\[.+: (.+?)(, PASI(50|75|90))?\\]$", "\\1", parameter), levels = levels(hta_net$treatments)), estimate = mean, conf.low = `2.5%`, conf.high = `97.5%`) # Get observed proportions get_binom_ci <- function(r, n) { if (is.na(r)) return(tibble(estimate = NA_real_, conf.low = NA_real_, conf.high = NA_real_)) bt <- binom.test(r, n) tibble(estimate = bt$estimate, conf.low = bt$conf.int[1], conf.high = bt$conf.int[2]) } hta_pred_obs <- hta_dat %>% mutate(studyc = paste(studyc, year)) %>% pivot_longer(matches("PASI(50|75|90)"), names_to = "outcome", values_to = "value", names_pattern = "(PASI(?:50|75|90))") %>% rowwise() %>% #group_by(studyc, trtc, outcome) %>% summarise(studyc, trtc, outcome, get_binom_ci(value, sample_size))%>% mutate(method = "Observed", outcomef = recode_factor(outcome, PASI50 = "PASI 50", PASI75 = "PASI 75", PASI90 = "PASI 90"), .trt = factor(trtc, levels = levels(hta_net$treatments)), .study = factor(studyc, levels = levels(hta_net$studies))) hta_pred_all <- bind_rows(hta_pred_mod, hta_pred_obs) %>% mutate(methodf = factor(method, levels = c("NMA", "Observed"))) # Plot ggplot(hta_pred_all, aes(x = estimate*100, xmin = conf.low*100, xmax = conf.high*100, y = forcats::fct_rev(.trt), shape = methodf, colour = methodf)) + geom_errorbar(width = 0, position = position_dodge(width = 0.75), size = 0.3) + geom_point(position = position_dodge(width = 0.75), fill = "white") + xlab("Percent achieving PASI outcome") + ylab("Treatment") + coord_cartesian(xlim = c(0, 100)) + facet_grid(.study~outcomef) + scale_shape_manual("Method", values = c(`NMA` = 16, Observed = 21)) + scale_colour_manual("Method", values = c(`NMA` = "#00468B", Observed = "#8B0000")) + theme_multinma() ```

The output plot with the current package version (0.5.1) with the issue looks like this, whereas the fixed version with the correct categories looks like this.

The issue is particularly noticeable in this example for studies omitting the first category (PASI 50).

dmphillippo commented 10 months ago

Fix merged into development branch, for next CRAN release.

dmphillippo commented 10 months ago

Now on CRAN, with the release of version 0.6.0

dmphillippo / multinma

Bug in ordered multinomial models with missing categories #28