dmphillippo / multinma

Network meta-analysis of individual and aggregate data in Stan
https://dmphillippo.github.io/multinma
33 stars 15 forks source link

Bug in ordered multinomial models with missing categories #28

Closed dmphillippo closed 7 months ago

dmphillippo commented 8 months ago

In ordered multinomial models where studies do not observe all categories, data points could be assigned the wrong category in a couple of scenarios:

For example, consider the HTA psoriasis data with categories removed from some studies:

Code ```r library(multinma) library(dplyr) library(tidyr) library(ggplot2) options(mc.cores = parallel::detectCores()) # Remove some categories hta_dat <- hta_psoriasis %>% mutate(PASI50 = ifelse(studyc == "Elewski", NA, PASI50), PASI75 = ifelse(studyc == "Leonardi", NA, PASI75)) hta_net <- set_agd_arm(hta_dat, study = paste(studyc, year), trt = trtc, r = multi(r0 = sample_size - rowSums(cbind(PASI50, PASI75, PASI90), na.rm = TRUE), PASI50, PASI75, PASI90, inclusive = FALSE, type = "ordered")) hta_net hta_fit_FE <- nma(hta_net, trt_effects = "fixed", link = "probit", prior_intercept = normal(scale = 100), prior_trt = normal(scale = 10), prior_aux = flat()) hta_fit_FE predict(hta_fit_FE, baseline = distr(qnorm, mean = -1.097, sd = 123^-0.5), type = "response") # Plot predictions against observed data hta_pred_FE <- predict(hta_fit_FE, type = "response") hta_pred_mod <- hta_pred_FE %>% as_tibble() %>% mutate(method = "NMA", outcomef = recode_factor(gsub("^pred\\[.+: (.+), PASI(50|75|90)\\]$", "\\2", parameter), "50" = "PASI 50", "75" = "PASI 75", "90" = "PASI 90"), .trt = factor(gsub("^pred\\[.+: (.+?)(, PASI(50|75|90))?\\]$", "\\1", parameter), levels = levels(hta_net$treatments)), estimate = mean, conf.low = `2.5%`, conf.high = `97.5%`) # Get observed proportions get_binom_ci <- function(r, n) { if (is.na(r)) return(tibble(estimate = NA_real_, conf.low = NA_real_, conf.high = NA_real_)) bt <- binom.test(r, n) tibble(estimate = bt$estimate, conf.low = bt$conf.int[1], conf.high = bt$conf.int[2]) } hta_pred_obs <- hta_dat %>% mutate(studyc = paste(studyc, year)) %>% pivot_longer(matches("PASI(50|75|90)"), names_to = "outcome", values_to = "value", names_pattern = "(PASI(?:50|75|90))") %>% rowwise() %>% #group_by(studyc, trtc, outcome) %>% summarise(studyc, trtc, outcome, get_binom_ci(value, sample_size))%>% mutate(method = "Observed", outcomef = recode_factor(outcome, PASI50 = "PASI 50", PASI75 = "PASI 75", PASI90 = "PASI 90"), .trt = factor(trtc, levels = levels(hta_net$treatments)), .study = factor(studyc, levels = levels(hta_net$studies))) hta_pred_all <- bind_rows(hta_pred_mod, hta_pred_obs) %>% mutate(methodf = factor(method, levels = c("NMA", "Observed"))) # Plot ggplot(hta_pred_all, aes(x = estimate*100, xmin = conf.low*100, xmax = conf.high*100, y = forcats::fct_rev(.trt), shape = methodf, colour = methodf)) + geom_errorbar(width = 0, position = position_dodge(width = 0.75), size = 0.3) + geom_point(position = position_dodge(width = 0.75), fill = "white") + xlab("Percent achieving PASI outcome") + ylab("Treatment") + coord_cartesian(xlim = c(0, 100)) + facet_grid(.study~outcomef) + scale_shape_manual("Method", values = c(`NMA` = 16, Observed = 21)) + scale_colour_manual("Method", values = c(`NMA` = "#00468B", Observed = "#8B0000")) + theme_multinma() ```

The output plot with the current package version (0.5.1) with the issue looks like this, whereas the fixed version with the correct categories looks like this.

The issue is particularly noticeable in this example for studies omitting the first category (PASI 50).

dmphillippo commented 8 months ago

Fix merged into development branch, for next CRAN release.

dmphillippo commented 7 months ago

Now on CRAN, with the release of version 0.6.0