jkcshea / ivmte

An R package for implementing the method in Mogstad, Santos, and Torgovitsky (2018, Econometrica).
GNU General Public License v3.0
18 stars 2 forks source link

Unable to satisfy optimality tolerance #196

Closed a-torgovitsky closed 3 years ago

a-torgovitsky commented 3 years ago
library("ivmte")

results <- ivmte(data = AE,
                 target = "att",
                 outcome = "worked",
                 m0 = ~ u + I(u^2) + yob + u*yob,
                 m1 = ~ u + I(u^2) + I(u^3) + yob + u*yob,
                 propensity = morekids ~ samesex,
                 noisy = TRUE)

produces

LP solver: Gurobi ('gurobi')

Obtaining propensity scores...

Generating target moments...
    Integrating terms for control group...
    Integrating terms for treated group...

Performing direct MTR regression...
    MTR is not point identified.

Performing audit procedure...
    Generating initial constraint grid...

    Audit count: 1
    Obtaining bounds...
Warning:  The LP solver was unable to satisfy the optimality tolerance for the maximization problem, so a suboptimal solution is returned. Tolerance parameters for the LP solver can be passed through the argument 'lpsolver.options'. 
    Violations:  1 
    Expanding constraint grid to include 1 additional point...

    Audit count: 2
    Obtaining bounds...
Warning:  The LP solver was unable to satisfy the optimality tolerance for the minimization problem, so a suboptimal solution is returned. The LP solver was unable to satisfy the optimality tolerance for the maximization problem, so a suboptimal solution is returned. Tolerance parameters for the LP solver can be passed through the argument 'lpsolver.options'. 
    Violations: 0
    Audit finished.

Bounds on the target parameter: [-0.2750446, 0.03653716]

Any idea what's going on here?

> print(c(results$lp.result$maxstatus, results$lp.result$minstatus))
[1] 6 6

Assuming I am looking at the right status code list, this is:

CUTOFF | 6 | Optimal objective for model was proven to be worse than the value specified in the Cutoff parameter.

But we are not setting the cutoff parameter, are we?

a-torgovitsky commented 3 years ago

Also, could you remind me how I can access the Gurobi output?

jkcshea commented 3 years ago

Assuming I am looking at the right status code list, this is:

Argh, I'm sorry, I had forgotten to fix the labeling of statuses. Since Gurobi and CPLEX use different status codes, I had to put them together. So the numbers I use won't align with either of their keys. I meant to correct the status labels so they aren't just numbers, I'll take care of that.

But the warning message correctly reports what those numbers mean. i.e. The Gurobi status was "SUBOPTIMAL", which their key describes as meaning "Unable to satisfy optimality tolerances; a sub-optimal solution is available." So no cutoff parameter is involved.

Any idea what's going on here?

I also came across that error while writing those examples in #194, but haven't yet been able to resolve it. Messing with the tolerances, (e.g. passing the argument lpsolver.options = list(OptimalityTol = 1e-2)) won't resolve it either. I haven't figured out where the problem is, though.

Also, could you remind me how I can access the Gurobi output?

If you set debug = TRUE, it will print out the Gurobi output. It will also save the model so you can pass it into Gurobi directly. You will have to tell Gurobi whether you want to maximize or minimize, though.

a-torgovitsky commented 3 years ago

As usual, it is probably related to the scaling:

Upper bound optimization statistics:
------------------------------------
Gurobi Optimizer version 9.0.2 build v9.0.2rc0 (linux64)
Optimize a model with 1321 rows, 11 columns and 7052 nonzeros
Model fingerprint: 0x1d4fa639
Model has 1 quadratic constraint
Coefficient statistics:
  Matrix range     [3e-05, 6e+01]
  **QMatrix range    [7e+00, 4e+08]**
  QLMatrix range   [6e+02, 8e+06]
  Objective range  [1e-02, 5e+01]
  Bounds range     [0e+00, 0e+00]
  RHS range        [1e+00, 1e+00]
  QRHS range       [6e+04, 6e+04]
**Warning: Quadratic constraint contains large coefficients**
         Consider reformulating model or setting NumericFocus parameter
         to avoid numerical issues.
Presolve removed 661 rows and 0 columns
Presolve time: 0.00s
Presolved: 670 rows, 681 columns, 4239 nonzeros
Presolved model has 1 second-order cone constraint
Ordering time: 0.00s

Barrier statistics:
 Dense cols : 11
 Free vars  : 11
 AA' NZ     : 3.613e+03
 Factor NZ  : 4.416e+03 (roughly 1 MByte of memory)
 Factor Ops : 3.036e+04 (less than 1 second per iteration)
 Threads    : 1

                  Objective                Residual
Iter       Primal          Dual         Primal    Dual     Compl     Time
   0  -8.85171531e-02  1.65000000e+01  4.68e+02 1.00e-01  2.95e+01     0s
   1   6.51936883e+00  1.28646612e+01  7.89e+01 2.44e-02  5.01e+00     0s
...
56   3.68213784e-02  4.36303763e-02  1.74e-05 1.34e-07  5.15e-06     0s
  57   3.68218289e-02  4.36303471e-02  3.98e-05 9.25e-06  5.15e-06     0s
  58   3.68413415e-02  4.36303182e-02  2.44e-05 2.57e-06  5.14e-06     0s

Barrier performed 58 iterations in 0.06 seconds
Sub-optimal termination - objective 3.65371631e-02

The quadratic coefficients should effectively be the X'X matrix from linear regression. (Except in the notation of my note, it is BB', since I wrote the problem in the population and didn't construct the data matrices for a finite sample.)

Trying something quick like scaling yob sort of helped -- dividing it by 1000 reduced the range of the coefficients, but at least for the maximization problem I still have "sub-optimal" termination.

a-torgovitsky commented 3 years ago

@jkcshea is there a relatively easy way for me to access the "B" matrix?

jkcshea commented 3 years ago

Ah, I just made a change to make this easier. If you update the package, then you can now get the B matrix from results$X (I can rename it to B if you prefer).

If you don't update, then results$X will still works for the point identified case. But the partially identified case is less organized:

a-torgovitsky commented 3 years ago

Something I don't understand in this example.

The output with debug = TRUE is attached. issue196.txt

The first minimum criterion looks like this:

Minimum criterion optimization statistics:
------------------------------------------
Gurobi Optimizer version 9.0.2 build v9.0.2rc0 (linux64)
Optimize a model with 1320 rows, 11 columns and 7050 nonzeros
Model fingerprint: 0x0df7f52e
Model has 36 quadratic objective terms
Coefficient statistics:
  Matrix range     [3e-05, 6e+01]
  Objective range  [6e+02, 8e+06]
  QObjective range [1e+01, 9e+08]
  Bounds range     [0e+00, 0e+00]
  RHS range        [1e+00, 1e+00]
Warning: Model contains large quadratic objective coefficients
         Consider reformulating model or setting NumericFocus parameter
         to avoid numerical issues.
Warning: diagonal adjustment of 2.4e-07 needed to make Q PSD
Warning: diagonal adjustment of 2.4e-07 performed to make Q PSD
Warning: diagonal adjustment of 2.4e-07 performed to make Q PSD
Presolve removed 660 rows and 0 columns
Presolve time: 0.00s
Presolved: 660 rows, 671 columns, 4185 nonzeros
Presolved model has 36 quadratic objective terms
Ordering time: 0.00s

Barrier statistics:
 Dense cols : 11
 Free vars  : 19
 AA' NZ     : 3.557e+03
 Factor NZ  : 4.261e+03 (roughly 1 MByte of memory)
 Factor Ops : 2.740e+04 (less than 1 second per iteration)
 Threads    : 1

Then the second one (the first in the audit iteration) looks like this:

Minimum criterion optimization statistics:
------------------------------------------
Warning: Q constraint 0 doesn't have a name
Warning: default Q constraint names used to write mps file
Gurobi Optimizer version 9.0.2 build v9.0.2rc0 (linux64)
Optimize a model with 1321 rows, 11 columns and 7052 nonzeros
Model fingerprint: 0xd686686f
Model has 36 quadratic objective terms
Model has 1 quadratic constraint
Coefficient statistics:
  Matrix range     [3e-05, 6e+01]
  QMatrix range    [7e+00, 4e+08]
  QLMatrix range   [6e+02, 8e+06]
  Objective range  [6e+02, 8e+06]
  QObjective range [1e+01, 9e+08]
  Bounds range     [0e+00, 0e+00]
  RHS range        [1e+00, 1e+00]
  QRHS range       [6e+04, 6e+04]
Warning: Quadratic constraint contains large coefficients
Warning: Model contains large quadratic objective coefficients
         Consider reformulating model or setting NumericFocus parameter
         to avoid numerical issues.
Presolve removed 661 rows and 0 columns
Presolve time: 0.00s
Presolved: 679 rows, 691 columns, 4281 nonzeros
Presolved model has 2 second-order cone constraints
Ordering time: 0.00s

Barrier statistics:
 Dense cols : 11
 Free vars  : 11
 AA' NZ     : 3.681e+03
 Factor NZ  : 4.560e+03 (roughly 1 MByte of memory)
 Factor Ops : 3.272e+04 (less than 1 second per iteration)
 Threads    : 1

Notice that the second one has a quadratic constraint whereas the first one does not. So is the second audit still imposing a quadratic constraint, perhaps leftover from the previous bounds iteration? Maybe you forgot to reset Gurobi between audits?

jkcshea commented 3 years ago

Ah, you're right, I did indeed forget to remove the quadratic constraint between each iteration of the audit. This has been fixed!