Pyomo Versions > 6.4.4 result in crashes for several ASL-based MINLP solvers (believed to be related to NL Writer)

grahamsparrow-8451 commented 3 months ago

Summary

When trying to solve the MINLP problem in the code below, Pyomo 6.4.4 succeeds with the default NL Writer, but fails when switched to use version 2 NL Writer. Newer versions that default to version 2 of NL Writer always fail.

Steps to reproduce the issue

The following code replicates the problem.

ipopt solves the relaxation of the MINLP problem - just for reference
scip does not crash, but it gives a different answer with the V2 NL Writer
bonmin and couenne both fail with V2 NL Writer

Pyomo 6.7.1 gives the same behavior as Pyomo 6.4.4 with V2 NL Writer

import pyomo.version  # explicit import so pyomo.version.version_info resolves below
from pyomo.environ import *
import pyomo.repn.plugins.nl_writer as nl_writer

def create_model():
    model = ConcreteModel()

    model.p1 = Var(within=PositiveReals, bounds=(0.85, 1.15))
    model.p2 = Var(within=PositiveReals, bounds=(0.68, 0.92))
    model.c1 = Var(within=NonNegativeReals, bounds=(-0.0, 0.7))
    model.c2 = Var(within=NonNegativeReals, bounds=(-0.0, 0.7))
    model.t1 = Var(within=Boolean, bounds=(0, 1))
    model.t2 = Var(within=Boolean, bounds=(0, 1))
    model.const = Constraint(expr=((0.7 - (model.c1*model.t1 + model.c2*model.t2)) <= (model.p1*model.t1 + model.p2*model.t2)))
    model.OBJ = Objective(expr=(model.p1*model.t1 + model.p2*model.t2))

    return model

def solve(solver, nl_version):
    if pyomo.version.version_info[0:2] == (6,4):
        nl_writer._activate_nl_writer_version(nl_version)
    else:
        if nl_version == 1:
            return
    model = create_model()
    opt = SolverFactory(solver)
    try:
        opt.solve(model, tee=False)
    except Exception as e:
        print(f"{opt.name}:{opt.options.solver}:nl_version_{nl_version} {e}")    
        return
    print(f"{opt.name}:{opt.options.solver}:nl_version_{nl_version} {value(model.OBJ)}")

for nl_version in [1, 2]:
    for solver in ["ipopt", "scip", "bonmin", "couenne"]:
        solve(solver, nl_version)

Error Message

Pyomo 6.4.4

ipopt:None:nl_version_1 0.34492753275643157
scip:None:nl_version_1 0.68
asl:bonmin:nl_version_1 0.68
asl:couenne:nl_version_1 0.68
ipopt:None:nl_version_2 0.34492753275643157
scip:None:nl_version_2 0.699999999
ERROR: Solver (asl) returned non-zero return code (-6)
ERROR: Solver log: Bonmin 1.8.9 using Cbc 2.10.8 and Ipopt 3.12.13 malloc():
    invalid size (unsorted) bonmin:
asl:bonmin:nl_version_2 Solver (asl) did not exit normally
ERROR: Solver (asl) returned non-zero return code (-6)
ERROR: Solver log: Couenne 0.5.8 -- an Open-Source solver for Mixed Integer
    Nonlinear Optimization Mailing list: couenne@list.coin-or.org
    Instructions: http://www.coin-or.org/Couenne corrupted size vs. prev_size
    couenne:
asl:couenne:nl_version_2 Solver (asl) did not exit normally

Pyomo 6.7.1

ipopt:None:nl_version_2 0.34492753275643157
scip:None:nl_version_2 0.699999999
ERROR: Solver (asl) returned non-zero return code (-6)
ERROR: Solver log: Bonmin 1.8.9 using Cbc 2.10.8 and Ipopt 3.12.13 malloc():
invalid size (unsorted) bonmin:
asl:bonmin:nl_version_2 Solver (asl) did not exit normally
ERROR: Solver (asl) returned non-zero return code (-6)
ERROR: Solver log: Couenne 0.5.8 -- an Open-Source solver for Mixed Integer
Nonlinear Optimization Mailing list: couenne@list.coin-or.org Instructions:
http://www.coin-or.org/Couenne corrupted size vs. prev_size couenne:
asl:couenne:nl_version_2 Solver (asl) did not exit normally

Information on your system

Pyomo version: 6.4.4 / 6.7.1
Python version: 3.11.6
Operating system: Red Hat Enterprise Linux release 8.8 (Ootpa)
How Pyomo was installed (PyPI, conda, source): PyPI
Solver (if applicable): see above, all solvers were downloaded from the AMPL Portal (https://portal.ampl.com/user/ampl/download)

grahamsparrow-8451 commented 3 months ago

@jsiirola Thanks for fixing this so quickly. I have been retesting and the original problem is resolved, but when I run my full set of unit tests, I still get some failures on the latest main. I don't know whether it is related to the NL Writer or not, but in order to investigate, I was trying to compare .nl files between 6.4.4 (old NL Writer which passes my tests) and latest main (using the new NL Writer with your fix). I am seeing that the files are very different, so not easy to compare for potential issues.

One thing I notice is that the two .nl file headers are identical except for the common expressions line.

before:

 0 0 0 0 0  # common exprs: b,c,o,c1,o1

after:

 26 0 0 12 6    # common exprs: b,c,o,c1,o1
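For reference, the two header lines above can be compared field by field with a few lines of standard-library Python. This is only a minimal sketch: the helper name `parse_header_line` is made up for illustration, and it assumes the header line carries the `# label: names` comment in the shape shown above.

```python
def parse_header_line(line):
    """Split an annotated NL header line into (label, {field_name: count}).

    Assumes the form '<ints>  # <label>: <comma-separated names>'.
    """
    numbers, _, comment = line.partition("#")
    label, _, names = comment.partition(":")
    values = [int(tok) for tok in numbers.split()]
    keys = [name.strip() for name in names.split(",")]
    return label.strip(), dict(zip(keys, values))


# The two header lines quoted in this comment.
before = parse_header_line(" 0 0 0 0 0  # common exprs: b,c,o,c1,o1")
after = parse_header_line(" 26 0 0 12 6    # common exprs: b,c,o,c1,o1")

# Keep only the fields whose counts differ, as (before, after) pairs.
changed = {
    key: (before[1][key], after[1][key])
    for key in after[1]
    if before[1][key] != after[1][key]
}
print(changed)
```

Here the counts for `b`, `c1`, and `o1` differ while `c` and `o` stay at zero.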

Do you have any tips for comparing these files so that I can diagnose the issue and potentially create a small problem that replicates it?

Thanks in advance!

jsiirola commented 3 months ago

One of the improvements in the NLv2 was support for Expression components. NLv2 will output these as AMPL "defined variables" (what is reported as "common exprs" in the header). This can be significantly more efficient - especially when the same Expression is used in multiple places - as the ASL will cache the function / Jacobian / Hessian evaluations and not re-evaluate them unnecessarily. The original NLv1 writer would just substitute the Expression into the objectives / constraints.

You can recover the old behavior by passing export_defined_variables=False as part of the call to solve().

Apart from that, reading NL files is challenging. The only documentation for the format is https://ampl.github.io/nlwrite.pdf, and it is not entirely complete. If you pass symbolic_solver_labels=True to solve(), we will annotate the NL file with additional information that will hopefully help humans parse the file (this works for both the v1 and v2 writers).
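As a rough sketch of how the two options above could be used to produce files for side-by-side inspection, the helper below writes the same model twice through `model.write()`. The helper name `write_nl_variants` and the file-name scheme are hypothetical. It also assumes that `io_options` is forwarded to the NL writer, and that `format="nl"` resolves to the v2 writer (the v1 writer may reject `export_defined_variables`).

```python
def write_nl_variants(model, stem):
    """Write one model to two annotated .nl files for comparison.

    `model` is expected to be a Pyomo model; only its write() method is used.
    Returns the list of file names that were requested.
    """
    variants = {
        # NLv2 default: Expression components are exported as defined variables.
        "defvars": {"symbolic_solver_labels": True},
        # Expressions substituted inline, as the NLv1 writer did.
        "inline": {
            "symbolic_solver_labels": True,
            "export_defined_variables": False,
        },
    }
    written = []
    for tag, io_options in variants.items():
        filename = f"{stem}_{tag}.nl"
        # Assumption: io_options is passed through to the NL writer.
        model.write(filename, format="nl", io_options=io_options)
        written.append(filename)
    return written
```

With the reproduction script from the top of this issue, a call might look like `write_nl_variants(create_model(), "repro")`. The two outputs should only differ for models that actually contain Expression components.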

grahamsparrow-8451 commented 3 months ago

Thanks for your pointers, I think the export_defined_variables=False will be very helpful. And the annotation is also very helpful in making sense of the rather cryptic NL format!
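Once both files are annotated, a plain text diff is one possible way to line them up. The sketch below uses only the standard library. The helper name `diff_nl_text` and the two tiny sample strings are illustrative rather than real writer output, and collapsing whitespace is an assumption that column alignment carries no meaning for the comparison.

```python
import difflib


def diff_nl_text(old_text, new_text, old_name="v1.nl", new_name="v2.nl"):
    """Return a unified diff of two NL file contents as a list of lines.

    Runs of whitespace are collapsed so alignment changes do not show up.
    """
    old_lines = [" ".join(line.split()) for line in old_text.splitlines()]
    new_lines = [" ".join(line.split()) for line in new_text.splitlines()]
    return list(
        difflib.unified_diff(
            old_lines, new_lines, fromfile=old_name, tofile=new_name, lineterm=""
        )
    )


# Illustrative two-line excerpts, not complete NL files.
old = "g3 1 1 0\n 0 0 0 0 0  # common exprs: b,c,o,c1,o1\n"
new = "g3 1 1 0\n 26 0 0 12 6    # common exprs: b,c,o,c1,o1\n"
diff = diff_nl_text(old, new)
for line in diff:
    print(line)
```

For real files, the text can be read with `pathlib.Path(name).read_text()` before calling the helper.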

jsiirola commented 3 months ago

I should have also said: please pass on / share inconsistencies as you encounter them. We are actually in the middle of working on the writer (the new writer has support for a basic presolve as well as suffix-based problem scaling, and we are prototyping some new solver interfaces that can make use of that functionality), so this is really timely.

grahamsparrow-8451 commented 3 months ago

Thanks for your work on continuing to improve Pyomo, it is great to see new capabilities and performance improvements. I will let you know what I find. If it turns out the issues I am currently seeing are not anything to do with Pyomo, I will update this issue, otherwise I will raise a new issue.

grahamsparrow-8451 commented 3 months ago

I have an issue that I believe is related to the new version of the NL writer, but I am still a few steps away from narrowing it down. Unfortunately, it involves the mindtpy contributed MINLP solver, which (I believe) uses the NL Writer only indirectly, to interface with ipopt and cbc in my setup.

My situation is that

using Pyomo 6.4.4, it succeeds using the original NL Writer,
using 6.4.4 with NL Writer V2, it fails
and it also fails in the same way in newer versions (including latest main).

The symptom is that it fails to satisfy a non-linear constraint, but returns optimal status. It does not completely ignore the constraint, as when I change the bound the solution changes. It is as if the upper bound that the solver sees is higher by some margin than the true bound.
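One way to quantify a symptom like this is to measure how far each constraint body lies outside its bounds after the solve. The helper below is a minimal, solver-independent sketch. The name `bound_violation` is hypothetical, and in Pyomo the inputs would presumably come from something like `value(con.body)`, `value(con.lower)`, and `value(con.upper)` (an assumption, not checked against this model).

```python
def bound_violation(body, lower=None, upper=None):
    """Return how far `body` lies outside [lower, upper]; 0.0 if inside.

    A bound of None means that side is unbounded.
    """
    below = (lower - body) if lower is not None else 0.0
    above = (body - upper) if upper is not None else 0.0
    return max(below, above, 0.0)


# Illustrative numbers only, not taken from the failing model.
print(bound_violation(1.25, upper=1.0))             # above the upper bound
print(bound_violation(0.5, lower=0.0, upper=1.0))   # inside the bounds
```

A solution reported as optimal while this returns a value well above the solver's feasibility tolerance would match the behavior described above.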

I will continue to investigate. Any pointers welcome :-)

jsiirola commented 3 months ago

This will be tricky. As a starting point, can you confirm that the test passes with mindtpy from the current main branch, and the default NL writer set back to nlv1?

grahamsparrow-8451 commented 3 months ago

I can confirm that the test passes with v1 on the latest main branch (but fails with v2). I will see if I can create a reasonably simple test case.

(edit) However, I also tried v2 with export_defined_variables set to False, and this succeeds.

So, it seems that it is related to the use of expressions. And this also fits with similar behavior I observed with couenne.

I was less confident about the couenne example previously, as the same generated .nl file works fine with bonmin so I was thinking this may be a couenne issue, however maybe it is a subtle issue with export_defined_variables. I will raise the couenne issue, as I can create a small problem that replicates that (which is not currently the case for the mindtpy problem)
