markmfredrickson / optmatch

Functions for optimal matching in R
https://markmfredrickson.github.io/optmatch
Other
47 stars 14 forks source link

Possible Plot Ideas for Optmatch Objects #207

Open jwbowers opened 3 years ago

jwbowers commented 3 years ago

Hi Y'all,

This is just a prototype of what graphs of optmatch objects could look like (obviously nicer if using ggplot etc..)

Rplot

## perhaps try this https://briatte.github.io/ggnet/#example-2-bipartite-network next time
library(igraph)
blah0 <- outer(fm0, fm0, FUN = function(x, y) {
  as.numeric(x == y)
})
blah1 <- outer(fm1, fm1, FUN = function(x, y) {
  as.numeric(x == y)
})
blah2 <- outer(fm2, fm2, FUN = function(x, y) {
  as.numeric(x == y)
})
par(mfrow = c(2, 2), mar = c(3, 3, 3, 1))
plot(graph_from_adjacency_matrix(blah0, mode = "undirected", diag = FALSE),
  vertex.color = c("white", "green")[meddat$nhTrt + 1], main = "Min Ctrls=0, Max Ctrls=Inf"
)
plot(graph_from_adjacency_matrix(blah1, mode = "undirected", diag = FALSE),
  vertex.color = c("white", "green")[meddat$nhTrt + 1], main = "Min Ctrls=1, Max Ctrls=Inf"
)
plot(graph_from_adjacency_matrix(blah2, mode = "undirected", diag = FALSE),
  vertex.color = c("white", "green")[meddat$nhTrt + 1], main = "Penalties,Min Ctrls=1, Max Ctrls=Inf"
)
josherrickson commented 2 years ago

Another idea I came up with.

Screen Shot 2022-06-05 at 1 59 23 PM

Distance is created with mean, but offers other options. X-axis gives a good visual of how far from 1:1 a match is; y-axis can help identify poor matches.

Messy code below. One downside is that the match is needed for distance calculations; perhaps revisit our choice to store only a hash of the distance matrix in an optmatch object?

library(ggplot2)

plot.optmatch <- function(optm, match, distance_function = mean) {

  # Calculate matched distances and apply `distance_function` to them
  mtchdists <- matched.distances(optm, match)
  mtchdists <- as.data.frame(do.call(rbind,
                                     lapply(mtchdists, distance_function)))
  names(mtchdists) <- "dist"
  mtchdists$names <- row.names(mtchdists)

  # Calculate table of # treatment and # control
  txtctl <- tapply(names(optm), optm, function(x) x)
  txtctl <- lapply(txtctl, function(x) {
    as.numeric(x %in% row.names(match))
  })
  txtctl <- data.frame(txt = vapply(txtctl, sum, numeric(1)),
                       ctl = vapply(txtctl, function(x) sum(1-x), numeric(1)))
  txtctl$names <- row.names(txtctl)

  # Merge to single data set
  alldata <- merge(txtctl, mtchdists, by.x = "names")

  # Generate X position. 1:1 is at 0, 1:k is at 1, 2, etc, and j:1 is at -1, -2,
  # etc
  alldata$x <- alldata$ctl - 1
  revdir <- alldata$ctl < alldata$txt
  alldata$x[revdir] <- -1*(alldata$txt[revdir] - 1)

  # Breaks and labels for x axis
  brks <- c(-1*rev(seq_len(max(alldata$txt) - 1)),
            0,
            seq_len(max(alldata$ctl) - 1))
  lbls <- c(paste0(rev(seq_len(max(alldata$txt))), ":1"),
            paste0("1:", seq_len(max(alldata$ctl)))[-1])

  ggplot(alldata, aes(x = x, y = dist)) +
    geom_vline(aes(xintercept = 0), size = 2, alpha = .2) +
    geom_point() + 
    scale_x_continuous(breaks = brks,
                       labels = lbls, name = "Match Sizes") +
    scale_y_continuous(name = "Distance in Match")

}

match <- match_on(ct ~ cost, data = nuclearplants)
fm <- fullmatch(match, data = nuclearplants)

plot(fm, match, mean)
plot(fm, match, max)