Different (but equivalent) MTR specifications give different results

It seems to me that the following are all equivalent ways of specifying the same regression:

y ~ factor(x) + factor(x):u
y ~ 0 + factor(x) + factor(x):u
y ~ factor(x) * u
y ~ 0 + factor(x) * u

If I run lm with these type of specifications, they will all generate the same fitted values for y (though the coefficients will be different, since different variables will be omitted).

However, when I use these types of formulas for m1 and m0, I get different bounds:

library(ivmte)

# error doesn't reproduce if I use all years
dt <- AE[AE$yob < 55,]

args <- list(data = dt,
             outcome = "worked",
             propensity = morekids ~ factor(samesex)*factor(yob),
             target = 'late',
             late.to = list(samesex = 1),
             late.from = list(samesex = 0),
             solver = "gurobi",
             direct = "lp",
             initgrid.nu = 100,
             audit.nu = 200)  

# different types of MTR specifications - equivalent?
mtr_v1 <- ~factor(yob) + factor(yob):u + factor(yob):I(u^2)
mtr_v2 <- ~0 + factor(yob) + factor(yob):u + factor(yob):I(u^2)
mtr_v3 <- ~factor(yob)*u + factor(yob):I(u^2)
mtr_v4 <- ~0 + factor(yob)*u + factor(yob):I(u^2)

args$m1 <- mtr_v1
args$m0 <- mtr_v1
est1 <- do.call(ivmte, args)

gives an error,

    Audit count: 1
    Minimum criterion: 2.681385e-06
    Obtaining bounds...
    Model was infeasible or unbounded.
Error:  The minimization problem was proven to be infeasible or unbounded. Since a minimum criterion was found, the model is most likely feasible but unbounded. This can happen if the initial grid is too small. Try increasing the parameters 'initgrid.nx' and 'initgrid.nu'. If the model is indeed infeasible, consider exporting the model and passing it to the solver for more details.

(Perhaps related to #223?) The other specifications give bounds, but the bounds differ:

args$m1 <- mtr_v2
args$m0 <- mtr_v2
est2 <- do.call(ivmte, args)
print(est2)

Bounds on the target parameter: [-0.08715786, -0.08761145]
Audit terminated successfully after 1 round

args$m1 <- mtr_v3
args$m0 <- mtr_v3
est3 <- do.call(ivmte, args)
print(est3)

Bounds on the target parameter: [-0.08306298, -0.0857325]
Audit terminated successfully after 2 rounds

args$m1 <- mtr_v4
args$m0 <- mtr_v4
est4 <- do.call(ivmte, args)
print(est4)

Bounds on the target parameter: [-0.08651154, -0.08750122]
Audit terminated successfully after 1 round

Also, the lower bounds are greater than the upper bounds? I'll post a separate issue for this...

The bounds are all different. Is this something that we should expect, given the regression-based approach? We have discussed something similar before, about how equivalent information sets can yield different results (see the discussion around https://github.com/jkcshea/ivmte/issues/153#issuecomment-556086409).

If this is to be expected, is there any sort of best practice? Given that I was looking to point-identify the LATE here (using direct = 'lp'), the second one is closest to what I want? But it seems kind of arbitrary.

What's the minimum criterion across these different approaches?

In each of these approaches, the minimum criterion is the same, 2.681385e-06. It surprises me that it is not exactly zero, since we have 4 moments per yob and 6 parameters per yob (so should be able to match the moments perfectly, I believe).

If you run the regression manually and check the fit (like the exercise I did in #221) how close to get to matching $E[Y|D,X,Z]$ for all values of $D, X, Z$?

Do you know you have "support" for this type of saturated regression? I would expect yes given the data, but I get weird things when changing your sample selection criteria:

library(ivmte)

dt <- AE[AE$yob < 40,]

args <- list(data = dt,
             outcome = "worked",
             propensity = morekids ~ factor(samesex)*factor(yob),
             target = 'late',
             late.to = list(samesex = 1),
             late.from = list(samesex = 0),
             solver = "gurobi",
             direct = "lp",
             initgrid.nu = 100,
             audit.nu = 200)

# different types of MTR specifications - equivalent?
mtr_v1 <- ~factor(yob) + factor(yob):u + factor(yob):I(u^2)

args$m1 <- mtr_v1
args$m0 <- mtr_v1
est1 <- do.call(ivmte, args)

yields the cryptic error

Error in if (var(data[, treat]) == 0) { : 
  missing value where TRUE/FALSE needed

while

dt <- AE[AE$yob < 45,]

args <- list(data = dt,
             outcome = "worked",
             propensity = morekids ~ factor(samesex)*factor(yob),
             target = 'late',
             late.to = list(samesex = 1),
             late.from = list(samesex = 0),
             solver = "gurobi",
             direct = "lp",
             initgrid.nu = 100,
             audit.nu = 200)

# different types of MTR specifications - equivalent?
mtr_v1 <- ~factor(yob) + factor(yob):u + factor(yob):I(u^2)

args$m1 <- mtr_v1
args$m0 <- mtr_v1
est1 <- do.call(ivmte, args)

yields the cryptic error

Obtaining propensity scores...
Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : 
  contrasts can be applied only to factors with 2 or more levels

You're getting the cryptic error because the minimum yob is 44, so ivmte first gets upset that you are asking for estimation with a dataset that has 0 observations :wink: And then in your second example, it seems that ivmte requires a factor to have multiple levels (since there is no variation in yob). You'll get the same error if you try to use a factor with only one level in, e.g., lm.

Do you know you have "support" for this type of saturated regression?

Yep, there is support for this kind of regression---at least, there are 400+ observations for each (X, D, Z) bin, and there is variation in Y in each of those bins.

For the three regressions that worked initially, I reran things manually. They all seem to reproduce the moments within 1e-11. (Let me know if you were expecting me to do something differently.)

setDT(dt)
ey <- dt[, .(mean(worked)), by = .(yob, morekids, samesex)]$V1

B2 <- est2$X
B3 <- est3$X
B4 <- est4$X

# manually produce B2
dt$p <- est2$propensity$phat
Bm2 <- matrix(data = NA, nrow = nrow(B2), ncol = ncol(B2))

for (i in 1:11) {
  Bm2[,i] <- (1 - dt$morekids)*(dt$yob == (i + 43))
  Bm2[,i+11] <- (1/2)*(1 + dt$p)*(1 - dt$morekids)*(dt$yob == (i + 43))
  Bm2[,i+22] <- (1/3)*(1 - dt$p^3)/(1 - dt$p)*(1 - dt$morekids)*(dt$yob == (i + 43))
  Bm2[,i+33] <- dt$morekids*(dt$yob == (i + 43))
  Bm2[,i+44] <- dt$morekids*dt$p/2*(dt$yob == (i + 43))
  Bm2[,i+55] <- dt$morekids*dt$p^2/3*(dt$yob == (i + 43))
}

# manually produce B3
Bm3 <- matrix(data = NA, nrow = nrow(B3), ncol = ncol(B3))

Bm3[,1] <- (1 - dt$morekids)
Bm3[,12] <- (1/2)*(1 + dt$p)*(1 - dt$morekids)
Bm3[,34] <- dt$morekids
Bm3[,45] <- dt$morekids*dt$p/2
for (i in 1:11) {
  if (i != 1) {
    Bm3[,i] <- (1 - dt$morekids)*(dt$yob == (i + 43))
    Bm3[,i+11] <- (1/2)*(1 + dt$p)*(1 - dt$morekids)*(dt$yob == (i + 43))
    Bm3[,i+33] <- dt$morekids*(dt$yob == (i + 43))
    Bm3[,i+44] <- dt$morekids*dt$p/2*(dt$yob == (i + 43))
  }
  Bm3[,i+22] <- (1/3)*(1 - dt$p^3)/(1 - dt$p)*(1 - dt$morekids)*(dt$yob == (i + 43))
  Bm3[,i+55] <- dt$morekids*dt$p^2/3*(dt$yob == (i + 43))
}

# manually produce B4
Bm4 <- matrix(data = NA, nrow = nrow(B4), ncol = ncol(B4))
Bm4[,12] <- (1/2)*(1 + dt$p)*(1 - dt$morekids)
Bm4[,45] <- dt$morekids*dt$p/2
for (i in 1:11) {
  if (i != 1) {
    Bm4[,i+11] <- (1/2)*(1 + dt$p)*(1 - dt$morekids)*(dt$yob == (i + 43))
    Bm4[,i+44] <- dt$morekids*dt$p/2*(dt$yob == (i + 43))
  }
  Bm4[,i] <- (1 - dt$morekids)*(dt$yob == (i + 43))
  Bm4[,i+22] <- (1/3)*(1 - dt$p^3)/(1 - dt$p)*(1 - dt$morekids)*(dt$yob == (i + 43))
  Bm4[,i+33] <- dt$morekids*(dt$yob == (i + 43))
  Bm4[,i+55] <- dt$morekids*dt$p^2/3*(dt$yob == (i + 43))
}

### verify that I've reproduced B matrices, and check fit

# method 2
diff2 <- abs(B2 - Bm2)

dfB2 <- as.data.frame(B2)
dfB2 <- cbind(dt$worked, dfB2)
names(dfB2) <- c("worked", paste0("B", 1:66))
B2_form <- as.formula(paste0("worked ~ 0 + ", paste0("B", 1:66, collapse = "+")))

r2 <- lm(data = dfB2, B2_form)
fit2 <- predict(r2)
mdiff2 <- abs(unique(fit2) - ey)

# method 3
diff3 <- abs(B3 - Bm3)

dfB3 <- as.data.frame(B3)
dfB3 <- cbind(dt$worked, dfB3)
names(dfB3) <- c("worked", paste0("B", 1:66))
B3_form <- as.formula(paste0("worked ~ 0 + ", paste0("B", 1:66, collapse = "+")))

r3 <- lm(data = dfB3, B3_form)
fit3 <- predict(r3)
mdiff3 <- abs(unique(fit3) - ey)

# method 4
diff4 <- abs(B4 - Bm4)

dfB4 <- as.data.frame(B4)
dfB4 <- cbind(dt$worked, dfB4)
names(dfB4) <- c("worked", paste0("B", 1:66))
B4_form <- as.formula(paste0("worked ~ 0 + ", paste0("B", 1:66, collapse = "+")))

r4 <- lm(data = dfB4, B4_form)
fit4 <- predict(r4)
mdiff4 <- abs(unique(fit4) - ey)

# summary
message(paste("Method 2: Bmat maximum difference is", max(diff2), "."))
message(paste("Method 3: Bmat maximum difference is", max(diff3), "."))
message(paste("Method 4: Bmat maximum difference is", max(diff4), "."))

message(paste("Method 2: moment maximum difference is", max(mdiff2), "."))
message(paste("Method 3: moment maximum difference is", max(mdiff3), "."))
message(paste("Method 4: moment maximum difference is", max(mdiff4), "."))

Method 2: Bmat maximum difference is 1.11022302462516e-16 .
Method 3: Bmat maximum difference is 1.11022302462516e-16 .
Method 4: Bmat maximum difference is 1.11022302462516e-16 .

Method 2: moment maximum difference is 8.44213587924969e-13 .
Method 3: moment maximum difference is 1.50346401994739e-12 .
Method 4: moment maximum difference is 8.46211989369294e-13 .

Whoops! My mistake, I had thought of yob as age.

Anyway, I would say let's put this down until resolving #223. That's a bug with a clear issue, and as you said the issues here might be related to that.

Although, what happens if you make the example simpler with fewer values of x? Leaving a more minimal MWE for ourselves would be useful

This is very related to #225, I think. For example, if I run my initial example but with AE[AE$yob < 54,](removing the pesky yob = 54 data), I get bounds of [-0.1035615, -0.1035615] (min. criterion of 0) for all 4 ways of specifying the MTRs.

So I agree that we should set this aside---until #223 and #225 are resolved.

But here is a more minimal (though still long) MWE of the original issue, limiting to just two values of the covariates (including yob = 54):

library(ivmte)

dt <- AE[AE$yob %in% 53:54,]

args <- list(data = dt,
             outcome = "worked",
             propensity = morekids ~ factor(samesex)*factor(yob),
             target = 'late',
             late.to = list(samesex = 1),
             late.from = list(samesex = 0),
             solver = "gurobi",
             direct = "lp",
             initgrid.nu = 100,
             audit.nu = 200)  

# different types of MTR specifications - equivalent?
mtr_v1 <- ~factor(yob) + factor(yob):u + factor(yob):I(u^2)
mtr_v2 <- ~0 + factor(yob) + factor(yob):u + factor(yob):I(u^2)
mtr_v3 <- ~factor(yob)*u + factor(yob):I(u^2)
mtr_v4 <- ~0 + factor(yob)*u + factor(yob):I(u^2)

args$m1 <- mtr_v1
args$m0 <- mtr_v1
est1 <- do.call(ivmte, args)

gives output

Obtaining propensity scores...

Generating target moments...
    Integrating terms for control group...
    Integrating terms for treated group...

Performing direct MTR regression...
    MTR is not point identified.

Performing audit procedure...
    Solver: Gurobi ('gurobi')
    Generating initial constraint grid...

    Audit count: 1
    Minimum criterion: 4.460409e-05
    Obtaining bounds...
    Model was infeasible or unbounded.
Error:  The minimization problem was proven to be infeasible or unbounded. Since a minimum criterion was found, the model is most likely feasible but unbounded. This can happen if the initial grid is too small. Try increasing the parameters 'initgrid.nx' and 'initgrid.nu'. If the model is indeed infeasible, consider exporting the model and passing it to the solver for more details.

and

args$m1 <- mtr_v2
args$m0 <- mtr_v2
est2 <- do.call(ivmte, args)

gives output

Obtaining propensity scores...

Generating target moments...
    Integrating terms for control group...
    Integrating terms for treated group...

Performing direct MTR regression...
    MTR is not point identified.

Performing audit procedure...
    Solver: Gurobi ('gurobi')
    Generating initial constraint grid...

    Audit count: 1
    Minimum criterion: 4.343841e-05
    Obtaining bounds...
    Violations: 0
    Audit finished.

Bounds on the target parameter: [0.04038411, 0.045664]

and

args$m1 <- mtr_v3
args$m0 <- mtr_v3
est3 <- do.call(ivmte, args)

gives output

Obtaining propensity scores...

Generating target moments...
    Integrating terms for control group...
    Integrating terms for treated group...

Performing direct MTR regression...
    MTR is not point identified.

Performing audit procedure...
    Solver: Gurobi ('gurobi')
    Generating initial constraint grid...

    Audit count: 1
    Minimum criterion: 5.00063e-05
    Obtaining bounds...
    Violations: 0
    Audit finished.

Bounds on the target parameter: [0.08569463, 0.1012207]

and

args$m1 <- mtr_v4
args$m0 <- mtr_v4
est4 <- do.call(ivmte, args)

gives output

Obtaining propensity scores...

Generating target moments...
    Integrating terms for control group...
    Integrating terms for treated group...

Performing direct MTR regression...
    MTR is not point identified.

Performing audit procedure...
    Solver: Gurobi ('gurobi')
    Generating initial constraint grid...

    Audit count: 1
    Minimum criterion: 4.394548e-05
    Obtaining bounds...
    Model was infeasible or unbounded.
Error:  The minimization problem was proven to be infeasible or unbounded. Since a minimum criterion was found, the model is most likely feasible but unbounded. This can happen if the initial grid is too small. Try increasing the parameters 'initgrid.nx' and 'initgrid.nu'. If the model is indeed infeasible, consider exporting the model and passing it to the solver for more details.

The point-identified target parameter is given by

late_54 <- with(dt[dt$yob == 54], cov(worked, samesex) / cov(morekids, samesex))
late_53 <- with(dt[dt$yob == 53], cov(worked, samesex) / cov(morekids, samesex))
pr_54 <- mean(dt$yob == 54)
poplate <- pr_54 * late_54 + (1 - pr_54) * late_53
message(paste("True population-averaged LATE:", round(poplate, 5)))

True population-averaged LATE: 0.20645

jkcshea / ivmte

Different (but equivalent) MTR specifications give different results #224