jkcshea / ivmte

An R package for implementing the method in Mogstad, Santos, and Torgovitsky (2018, Econometrica).
GNU General Public License v3.0

How to interpret this "A" matrix of the model? #191

Closed a-torgovitsky closed 3 years ago

a-torgovitsky commented 3 years ago
library(ivmte)
library(data.table)  # provides fread()

data <- fread("data_example.csv")

r <- ivmte(data = data,
           target = "ate",
           ivlike = YD ~ Z1,
           propensity = D ~ Z1,
           m1 = ~ 0 + uSplines(degree = 0, knots = c(.5)),
           m0 = ~ 1,
           m0.lb = 0,
           m0.ub = 0,
           audit.nu = 200)

print(r[["lp.result"]][["model"]][["A"]])

gives

9 x 7 sparse Matrix of class "dgCMatrix"
      1 2  3 4  (Intercept)    u1S1.1:1 u1S1.2:1
avec  1 1  1 1  .           .                  .
avec -1 1  . .  0.526911076 0.473088924        .
avec  . . -1 1 -0.002352183 0.002352183        .
      . .  . .  1.000000000 .                  .
      . .  . .  .           1.000000000        .
      . .  . .  .           .                  1
      . .  . .  1.000000000 .                  .
      . .  . .  .           1.000000000        .
      . .  . .  .           .                  1

A few questions:

  1. Why are there 7 columns? I would think this model has 1 parameter for m0, and 2 parameters for m1 (Edited: earlier I had counted wrong)
  2. Why are the row names "avec"? Is this intentional? Can we have more descriptive row names?
  3. What does the first row correspond to?
a-torgovitsky commented 3 years ago

Here's the data in case that's helpful. data_example.csv.zip

jkcshea commented 3 years ago

To answer the question in the title of the issue, this is the constraint matrix that is used when estimating the bounds. Everything related to the IV-like estimands and the minimum criterion goes into this matrix.

1. Why are there 7 columns? I would think this model has 1 parameter for m0, and 2 parameters for m1 (Edited: earlier I had counted wrong)

The first four columns are the slack variables that allow for violations of observational equivalence. If there are k elements in the S-set, there are 2 * k slack variables; here YD ~ Z1 contributes k = 2 elements (the intercept and Z1), hence four slack columns. The remaining three columns are for the terms in the MTRs: one for m0 and two for m1.
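As a quick sanity check on the column count, here is a minimal base R sketch. The helper `expected_ncol` and its argument names are hypothetical and not part of ivmte; the counts are read off the specification in the question.

```r
# Hypothetical helper (not part of ivmte): expected number of columns in A,
# given k elements in the S-set and the total number of MTR terms.
expected_ncol <- function(k, n_mtr_terms) {
    # Two slack variables per S-set element, followed by the MTR terms.
    2 * k + n_mtr_terms
}

# Specification from the question: ivlike = YD ~ Z1 gives k = 2
# (intercept and Z1); m0 has 1 term and m1 has 2 terms.
ncol_first <- expected_ncol(k = 2, n_mtr_terms = 1 + 2)
print(ncol_first)
```

The result agrees with the 9 x 7 dimensions printed in the question.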

2. Why are the row names "avec"? Is this intentional? Can we have more descriptive row names?

Ah, this was my mistake. @johnnybonney had the same questions before. But we did that overhaul of the audit procedure, and I must have removed the code that labeled all the rows and columns. I've rewritten it; here's an example of the matrix.

Below is the specification I used. I've added more constraints and another IV-like specification, so you can see the conventions I'm using to label things.

r <- ivmte(data = data,
           target = "ate",
           ivlike = c(YD ~ Z1,
                      YD ~ 0 + Z1),
           propensity = D ~ Z1,
           m1 = ~ 0 + uSplines(degree = 0, knots = .5),
           m0 = ~ 0 + uSplines(degree = 0, knots = .5),
           mte.lb = -3,
           mte.ub = 3,
           m0.dec = TRUE,
           m1.inc = TRUE,
           mte.dec = TRUE,
           initgrid.nu = 2,
           audit.max = 5,
           audit.nu = 200)

Here is the A matrix you get now.

> library(Matrix)
> tmpMat <- Matrix(r$lp.result$model$A, sparse = FALSE)
> round(tmpMat, digits = 2)
28 x 10 Matrix of class "dgeMatrix"
                slack1- slack1+ slack2- slack2+ slack3- slack3+ u0S1.1:1
criterion             1       1       1       1       1       1     0.00
iv1.(Intercept)      -1       1       0       0       0       0     0.03
iv1.Z1                0       0      -1       1       0       0     0.00
iv2.Z1                0       0       0       0      -1       1     0.02
m0.lb                 0       0       0       0       0       0     1.00
m0.lb                 0       0       0       0       0       0     0.00
m0.lb                 0       0       0       0       0       0     0.00
m1.lb                 0       0       0       0       0       0     0.00
m1.lb                 0       0       0       0       0       0     0.00
m1.lb                 0       0       0       0       0       0     0.00
mte.lb                0       0       0       0       0       0    -1.00
mte.lb                0       0       0       0       0       0     0.00
mte.lb                0       0       0       0       0       0     0.00
m0.ub                 0       0       0       0       0       0     1.00
m0.ub                 0       0       0       0       0       0     0.00
m0.ub                 0       0       0       0       0       0     0.00
m1.ub                 0       0       0       0       0       0     0.00
m1.ub                 0       0       0       0       0       0     0.00
m1.ub                 0       0       0       0       0       0     0.00
mte.ub                0       0       0       0       0       0    -1.00
mte.ub                0       0       0       0       0       0     0.00
mte.ub                0       0       0       0       0       0     0.00
m0.dec                0       0       0       0       0       0    -1.00
m0.dec                0       0       0       0       0       0     0.00
m1.inc                0       0       0       0       0       0     0.00
m1.inc                0       0       0       0       0       0     0.00
mte.dec               0       0       0       0       0       0     1.00
mte.dec               0       0       0       0       0       0     0.00
                u0S1.2:1 u1S1.1:1 u1S1.2:1
criterion            0.0     0.00        0
iv1.(Intercept)      0.5     0.47        0
iv1.Z1               0.0     0.00        0
iv2.Z1               0.5     0.48        0
m0.lb                0.0     0.00        0
m0.lb                1.0     0.00        0
m0.lb                0.0     0.00        0
m1.lb                0.0     1.00        0
m1.lb                0.0     0.00        1
m1.lb                0.0     0.00        0
mte.lb               0.0     1.00        0
mte.lb              -1.0     0.00        1
mte.lb               0.0     0.00        0
m0.ub                0.0     0.00        0
m0.ub                1.0     0.00        0
m0.ub                0.0     0.00        0
m1.ub                0.0     1.00        0
m1.ub                0.0     0.00        1
m1.ub                0.0     0.00        0
mte.ub               0.0     1.00        0
mte.ub              -1.0     0.00        1
mte.ub               0.0     0.00        0
m0.dec               1.0     0.00        0
m0.dec              -1.0     0.00        0
m1.inc               0.0    -1.00        1
m1.inc               0.0     0.00       -1
mte.dec             -1.0    -1.00        1
mte.dec              1.0     0.00       -1

Column names

The first 2 * k columns are the slack variables. There are now 3 IV-like estimands, so we get 6 slack columns. The number in each name indicates which element of the S-set the slack corresponds to (the indexing is the same as in names(r$s.set)). The +/- indicates the sign of the slack.

The remaining columns are the terms of the MTRs.
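If you want to separate the two groups of columns programmatically, something like the following base R sketch should work. It assumes the naming convention described above stays as shown; the toy matrix only reuses the column names from the printout, and its entries are placeholders rather than ivmte output.

```r
# Toy matrix reusing only the column names shown above; the zero entries are
# placeholders, not output from ivmte.
col_names <- c("slack1-", "slack1+", "slack2-", "slack2+", "slack3-", "slack3+",
               "u0S1.1:1", "u0S1.2:1", "u1S1.1:1", "u1S1.2:1")
A <- matrix(0, nrow = 1, ncol = length(col_names),
            dimnames = list("criterion", col_names))

# Slack columns start with "slack"; everything else is an MTR term.
is_slack <- grepl("^slack", colnames(A))
slack_cols <- colnames(A)[is_slack]
mtr_cols <- colnames(A)[!is_slack]

# S-set index and sign encoded in each slack column name.
slack_index <- as.integer(sub("^slack([0-9]+)[+-]$", "\\1", slack_cols))
slack_sign <- sub("^slack[0-9]+", "", slack_cols)

print(data.frame(column = slack_cols, s = slack_index, sign = slack_sign))
print(mtr_cols)
```

With the real object, `A` would be replaced by r$lp.result$model$A, and `slack_index` could then be matched against names(r$s.set).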

Row names

The first row pertains to the constraint generated by the minimum criterion, and is now labeled as such.

The next 3 rows are for the 3 IV-like estimands, as indicated by the prefix iv. The number before the period indicates which IV-like specification the row comes from (so there are two components from the first specification and one component from the second). What comes after the period is the component name.

And the remaining rows are for the shape constraints, and are labeled as such.
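The row labels can be grouped the same way. This is a sketch in base R that assumes the labels keep the form shown in the printout; the vector `row_names` is typed in by hand from that printout rather than taken from a fitted object, and the group labels are my own.

```r
# Row names copied from the 28 x 10 printout above.
row_names <- c("criterion", "iv1.(Intercept)", "iv1.Z1", "iv2.Z1",
               rep(c("m0.lb", "m1.lb", "mte.lb",
                     "m0.ub", "m1.ub", "mte.ub"), each = 3),
               rep(c("m0.dec", "m1.inc", "mte.dec"), each = 2))

# Classify each row: the criterion row, IV-like rows, or shape constraints.
row_type <- ifelse(row_names == "criterion", "criterion",
            ifelse(grepl("^iv[0-9]+\\.", row_names), "iv-like", "shape"))
print(table(row_type))

# For IV-like rows, split the label into specification number and component.
iv_rows <- row_names[row_type == "iv-like"]
iv_spec <- as.integer(sub("^iv([0-9]+)\\..*$", "\\1", iv_rows))
iv_component <- sub("^iv[0-9]+\\.", "", iv_rows)
print(data.frame(row = iv_rows, spec = iv_spec, component = iv_component))
```

With the real object, `row_names` would be rownames(r$lp.result$model$A).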

3. What does the first row correspond to?

The constraint derived from the minimum criterion.
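Read against the matrix above, the criterion row and the IV-like rows can be written out roughly as follows. The notation is my own assumption, not ivmte's: $w_s^-$ and $w_s^+$ are the slack variables for element $s$ of the S-set, $\theta_j$ are the MTR coefficients, $\gamma_{sj}$ are the entries in the MTR columns, $\beta_s$ is the IV-like estimand, $\hat{Q}$ is the minimum criterion, and $\tau$ is a tolerance. Only the left-hand-side coefficients appear in A; the constraint senses and right-hand sides are not shown in this thread, so treat them as a sketch.

```latex
% Criterion row: coefficient 1 on every slack column, 0 on every MTR column.
% The sense (\le) and the right-hand side are assumptions.
\sum_{s=1}^{k} \left( w_s^- + w_s^+ \right) \le \hat{Q}\,(1 + \tau)

% IV-like row s: -1 on slack s-, +1 on slack s+, gamma_{sj} on the MTR columns.
% The sense (=) and the right-hand side are assumptions.
-\,w_s^- + w_s^+ + \sum_{j} \gamma_{sj}\,\theta_j = \beta_s,
\qquad s = 1, \dots, k
```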

a-torgovitsky commented 3 years ago

Thanks for the explanation! This is a great improvement with the labeling.