jkcshea / ivmte

An R package for implementing the method in Mogstad, Santos, and Torgovitsky (2018, Econometrica).
GNU General Public License v3.0
18 stars 2 forks source link

Having a problem in reproducing the bounds graph. #200

Closed jongohkim91 closed 3 years ago

jongohkim91 commented 3 years ago

Dear Mr. Shea,

Hello my name is Jongoh Kim and I am trying to use the ivmte package for a project. To be familiar with the package, I have read your paper: "ivmte: An R Package for Implementing Marginal Treatment Effect Methods" and tried to reproduce the Bounds graph on page 12(Figure 1). To plot the graph, I copy-and-pasted the codes as written in the paper however the shape of the graph I got was slightly different. The upper bounds were a bit flatter than the original graph and I don't know why I am getting a different result.

My R version is 4.0.3 and I updated all the packages using the "Tools" bar in RStudio. I am using Gurobi 9.1.1 for the optimization. The code I used to produce the graph is as below.

if(!is.null(dev.list())) dev.off()
cat("\014") 
rm(list=ls())

library(dplyr)
library(stringr)
library(stringi)
library(haven)
library(ivmte)
library(gurobi)
library(ggplot2)

results <- ivmte(data = AE,
                 ivlike = worked ~ morekids + samesex + morekids*samesex,
                 target = "ate",
                 m0 = ~ u + I(u^2),
                 m1 = ~ u + I(u^2),
                 mte.inc = TRUE,
                 mte.ub = 0,
                 propensity = morekids ~ samesex,
                 noisy = FALSE)
results$bounds

p <- predict(results$propensity$model,
             newdata = data.frame(samesex = c(0,1)),
             type = "response")

loopivmte <- function(args, alphalist) {
  plotmat <- matrix(ncol = 3, nrow = length(alphalist))
  for (i in 1:length(alphalist)) {
    args[["genlate.lb"]] <- max(p[1] - alphalist[i], 0)
    args[["genlate.ub"]] <- min(p[2] + alphalist[i], 1)
    r <- do.call(ivmte, args)
    plotmat[i,] = c(alphalist[i], r$bound)
  }
  plotdf <- data.frame(plotmat)
  colnames(plotdf) <- c("a", "lb", "ub")
  return(plotdf)
}

args <- list(data = AE,
             ivlike = worked ~ morekids + samesex + morekids*samesex + yob,
             target = "genlate",
             m0 = ~ u + I(u^2),
             m1 = ~ u + I(u^2),
             propensity = morekids ~ samesex,
             noisy = FALSE)
alphalist <- seq(from = 0, to = max(p[1], (1-p[2])), by = .01)

plotquadratic <- loopivmte(args, alphalist)

args[["m0"]] <- ~ u + I(u^2) + I(u^3) + I(u^4)
args[["m1"]] <- args[["m0"]]
plotquartic <- loopivmte(args, alphalist)
plotdf <- merge(plotquadratic, plotquartic,
                by="a", suffixes = c(".quad", ".quartic"))

args[["m0"]] <- ~ uSplines(degree = 2,
                           knots = seq(from = .1, to = .9, by = .1))
args[["m1"]] <- args[["m0"]]
plotspline <- loopivmte(args, alphalist)
colnames(plotspline)[2:3] <- c("lb.spline", "ub.spline")
plotdf <- merge(plotdf, plotspline, by="a")

pl <- ggplot() +
  xlab(expression(paste("Extrapolation distance (", alpha, ")"))) +
  ylab("Bounds") +
  ggtitle("Bounds on Extrapolated LATEs") +
  theme_bw() +
  theme(plot.title = element_text(hjust = 0.5)) +
  geom_line(data = plotdf,
            aes(x = a, y = lb.quad, linetype="quadratic")) +
  geom_line(data = plotdf,
            aes(x = a, y = ub.quad, linetype="quadratic")) +
  geom_line(data = plotdf,
            aes(x = a, y = lb.quartic, linetype="quartic")) +
  geom_line(data = plotdf,
            aes(x = a, y = ub.quartic, linetype="quartic")) +
  geom_line(data = plotdf,
            aes(x = a, y = lb.spline, linetype="spline")) +
  geom_line(data = plotdf,
            aes(x = a, y = ub.spline, linetype="spline")) +
  labs(linetype = "MTRs") +
  scale_linetype_manual(breaks = c("quadratic", "quartic", "spline"),
                        values = c("solid", "twodash", "dotted")) +
  theme(legend.position = c(0.01, 0.01),
        legend.justification = c(0,0),
        legend.background = element_rect(fill = alpha('white', 0)),
        legend.key = element_rect(fill = alpha('white', 1)),
        legend.text = element_text(size=6),
        legend.title = element_text(size=8),
        legend.margin = margin(.05, .05, .05, .05, "cm"),
        legend.key.height = unit(.3, "cm"))
ggsave("extrapolation.pdf", width = 5, height = 3, units = "in")

Also, I have attached the pdf file of the graph to this post. extrapolation.pdf

Thank you in advance for your help.

Kind regards, Jongoh Kim

jkcshea commented 3 years ago

Hi Jongoh, Thanks for posting this. I'll see if I can figure out what's going on.

jkcshea commented 3 years ago

It looks like a very simple mistake, actually. You included yob when declaring the set of ivlike estimands. But yob is not in the specification used in the paper. That's why your bounds are different and tighter---you have an additional restriction. Once you remove yob from the ivlike formula, you can replicate the figure from the R paper.

jongohkim91 commented 3 years ago

Sorry for bothering you with a silly mistake. Thank you very much for taking time looking into the matter.

jkcshea commented 3 years ago

Nothing to apologize for, happy to have been able to help.