jkcshea / ivmte

An R package for implementing the method in Mogstad, Santos, and Torgovitsky (2018, Econometrica).
GNU General Public License v3.0

Giving the user more information on problematic IV-like estimands #53

Closed by a-torgovitsky 5 years ago

a-torgovitsky commented 5 years ago

Is there a way to give the user some more info? Simple example:

rm(list = ls())
library("AER")
devtools::load_all("../IVMTE/")

set.seed(5239348)
n <- 1000
u <- runif(n)
z1 <- rbinom(n, 1, .5)
z2 <- rbinom(n, 1, .3)
z <- z1 + z2
x <- rnorm(n)
d <- as.numeric(u < z*.25 + .01*x)

v0 <- rnorm(n)
m0 <- 0
y0 <- as.numeric(m0 + v0 + .1*x > 0)

v1 <- rnorm(n)
m1 <- .5
y1 <- as.numeric(m1 + v1 - .3*x > 0)

y <- d*y1 + (1-d)*y0

x2 <- as.numeric(x > .5)
x2 <- x2 + as.numeric(x > 1)

df <- data.frame(y,d,z,x,x2)

ivlike <- c(y ~ d + x2 | z + x2,
            y ~ d | z)

subsets <- l(x2 < 1, x2 < 1)

ivmte(data = df,
      ivlike = ivlike,
      subset = subsets,
      target = "ate",
      m0 = ~ u,
      m1 = ~ u,
      propensity = d ~ z
  )

The first specification is problematic because I am regressing y on d and x2 while also using a subsetting mask that restricts to the subpopulation with x2 < 1 -- which here is just x2 = 0. So there is no variation in x2 within that subset, and x2 is perfectly collinear with the intercept.

But as the user I receive this:

Error in solve.default(ezx) : 
  Lapack routine dgesv: system is exactly singular: U[3,3] = 0

I could imagine that with many complicated specifications in an application, it is going to get really hard to track down the one at fault. Is there a way to just tell the user which specification is causing problems?
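To make the source of the singularity concrete, here is a minimal base-R sketch on toy data (hypothetical, not the data above). It assumes the failing object is a cross-moment matrix of instruments and regressors. The name `ezx` is borrowed from the error message, and how the package actually builds that matrix is an assumption.

```r
# Toy data (hypothetical, not the data from the example above).
set.seed(1)
n  <- 200
z  <- rbinom(n, 1, 0.5)
d  <- rbinom(n, 1, 0.3 + 0.4 * z)
x2 <- sample(0:2, n, replace = TRUE)

# Apply the subset condition x2 < 1, which leaves only x2 == 0.
keep <- x2 < 1
X <- cbind(1, d[keep], x2[keep])   # regressors: intercept, d, x2
Z <- cbind(1, z[keep], x2[keep])   # instruments: intercept, z, x2

# Within the subset x2 takes a single value, so its column is all zeros.
n_values <- length(unique(x2[keep]))

# Z'X then has a zero row and a zero column, so solve() fails.
# The matrix is built here by assumption; the package may differ.
ezx <- crossprod(Z, X) / sum(keep)
res <- tryCatch(solve(ezx), error = function(e) conditionMessage(e))
res
```

Counting the distinct values of each regressor within its subset is one cheap way to spot this kind of problem before any matrix is inverted.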

jkcshea commented 5 years ago

The error in the original example occurred because of collinearity. Now that #78 is resolved, the original example no longer fails.

Below is an alternative example where the first subset condition restricts the sample to 0 observations. I'm not sure what kinds of errors users might trigger, so I thought it would suffice to return the same error message that R would have returned when the regression fails, and also to tell the user which IV specification is causing the problem.

> subsets <- l(x2 < -Inf, x2 < 1)
> 
> devtools::load_all("../IVMTE/")
Loading ivmte
> test <- ivmte(data = df,
+               ivlike = ivlike,
+               subset = subsets,
+               target = "ate",
+               m0 = ~ u,
+               m1 = ~ u,
+               propensity = d ~ z
+               )
Obtaining propensity scores...
Generating target moments...
    Integrating terms for control group...
    Integrating terms for treated group...
Generating IV-like moments...
Error: IV-like specification 1 generates the following error: 
0 (non-NA) cases. Check the specification and its corresponding 
subset condition.

Let me know if you'd like any adjustments to that.
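The general pattern of catching R's error and re-raising it with the specification index can be sketched in base R as follows. This is only an illustration, not the package's implementation: `fit_specs` is a hypothetical helper, `lm()` stands in for the IV-like regression, and `quote()` replaces the package's `l()` so that the snippet runs without ivmte.

```r
# Hypothetical helper: fit each specification on its own subset and
# re-raise any failure with the index of the offending specification.
fit_specs <- function(formulas, subsets, data) {
  lapply(seq_along(formulas), function(i) {
    tryCatch({
      rows <- eval(subsets[[i]], data)   # evaluate the subset condition
      lm(formulas[[i]], data = data[rows, , drop = FALSE])
    }, error = function(e) {
      stop("IV-like specification ", i, " generates the following error: ",
           conditionMessage(e),
           ". Check the specification and its corresponding subset condition.",
           call. = FALSE)
    })
  })
}

# Toy data (hypothetical); the first subset condition selects no rows.
set.seed(1)
toy <- data.frame(y  = rnorm(50),
                  d  = rbinom(50, 1, 0.5),
                  x2 = sample(0:2, 50, replace = TRUE))
formulas <- list(y ~ d + x2, y ~ d)
subsets  <- list(quote(x2 < -Inf), quote(x2 < 1))

msg <- tryCatch(fit_specs(formulas, subsets, toy),
                error = function(e) conditionMessage(e))
msg
```

Passing R's own message through unchanged means no list of failure modes has to be anticipated. The wrapper only adds which specification to look at.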

a-torgovitsky commented 5 years ago

Perfect -- that's very informative!