jkcshea / ivmte

An R package for implementing the method in Mogstad, Santos, and Torgovitsky (2018, Econometrica).
GNU General Public License v3.0

Stochastic audits #207

Closed · jkcshea closed this issue 3 years ago

jkcshea commented 3 years ago

Here is the original example illustrating the problem. audit-stochastic.pdf

I am still working on this, but in case you have any immediate hunches:

> ## Use a subset of the data (faster)
> set.seed(10)
> AE.small <- AE[sample(seq(nrow(AE)), size = 1000, replace = FALSE), ]
> ## Set up the arguments
> args <- list(data = AE.small,
+              outcome = "worked",
+              target = "ate",
+              m0 = ~ uSplines(degree = 3, knots = seq(from = .1, to = .9, by = .1)) + yob +
+                      black + hisp + other,
+              m1 = ~ uSplines(degree = 3, knots = seq(from = .1, to = .9, by = .1)) + yob +
+                      black + hisp + other,
+              propensity = morekids ~ samesex + yob + black + hisp + other,
+              audit.nu = 50,
+              initgrid.nx = 20,
+              initgrid.nu = 20,
+              solver = 'gurobi')
> ## Estimate using 20 x 20 initial grid.
> set.seed(10)
> do.call(ivmte, args)

Bounds on the target parameter: [-0.2734294, 0.07024355]
Audit terminated successfully after 3 rounds 

> ## Estimate using 21 x 20 initial grid.
> set.seed(10)
> args$initgrid.nx <- 21
> do.call(ivmte, args)

Bounds on the target parameter: [-0.2738942, 0.07031611]
Audit terminated successfully after 3 rounds 
a-torgovitsky commented 3 years ago

This is the key clue I think:

This does not happen for the non-regression approaches.

That suggests to me it's a numerical issue concerning QPs and QCQPs.

My guess would be that both initial grids are leading to different solutions that pass the audit grid and lead to criteria that are ever-so-slightly different (and below Gurobi's optimality tolerance). Then these slight differences in criteria lead to noticeable differences in the bounds in step 2.

jkcshea commented 3 years ago

My guess would be that both initial grids are leading to different solutions that pass the audit grid and lead to criteria that are ever-so-slightly different (and below Gurobi's optimality tolerance). Then these slight differences in criteria lead to noticeable differences in the bounds in step 2.

This indeed seems to be the case. I ran a simulation comparing the bounds generated from two different initial grids. There are no controls, so the differences are only because initgrid.nu is different for the two grids. git-example.zip

The different initial grids result in different criterion values and different solutions. If I adjust the criterion of one problem to match the other, then the differences in the bounds usually shrink (68% of the time for the lower bound, 64% of the time for the upper bound). But the bounds are not identical because the initial grids still differ. Making the initial grids the same results in the same QP/QCQP problems, and thus the same bounds.

Some additional notes:

a-torgovitsky commented 3 years ago

Sounds like this is just another instance of the QP/QCQP stability issue we discussed a while back. I don't think we need to revisit that problem, so I'm going to close this issue.

I lowered criterion.tol to 1e-4 a while back because 1e-2 was giving implausibly wide bounds in the AE example (wider than the Manski bounds, for example). The "right" value clearly depends on features of the problem, so without theory to suggest a good value, I would rather err on the side of numerical problems than useless bounds.