Open whart222 opened 5 years ago
I'm proposing this as a PEP because I'm not inclined to revisit this logic alone. I think the most productive path forward would be to review the COEK expression tree walker and leverage ideas developed for it.
Also, I think we should agree that these revisions are a high enough priority to make this a focus for Pyomo development. For example, we've discussed alternative strategies for accelerating this element of the Pyomo core.
I have been thinking along similar lines. I have motivation to look into this, as repn generation and lp/nl writes appear to be the performance bottleneck at the moment. I have been thinking about using a "bottom-up" walking strategy similar to what we use in evaluation and expression replacement - with the same goal (reduce the creation and destruction of intermediate objects).
I have a prototype visitor-based version of generate_standard_repn with competitive performance vs the current implementation available on my standard_repn_rework branch.
Archived on the master Performance Proposals Issue (#1430). Closing this performance proposal until active development has begun.
Re-opening as this was marked as a PEP. That said, I would vote to "unmarking" this as a PEP until we are ready to put together a formal proposal.
My proposal is to mimic the design of canonicalization in COEK, which I think creates many fewer temporary objects.
While rethinking the design of COEK expressions, I've realized that we can probably accelerate the generation of canonical representations in Pyomo.
Currently, the _collect* functions used in pyomo/repn/standard_repn.py return a "Results" object. (Actually, there are several, in an attempt to minimize the creation of unused data.)
The 'ast' branch of COEK uses a similar walker strategy, but the visit(*) functions accept a Results object, and the semantics are that the visiting function "adds" terms to the Results object.
This has two positive effects:
WRT (2), I think that Pyomo exhibits quadratic behavior when processing an expression like:
a - (b - (c - (d - (e ... - z) ... )))
What happens, is that the RHS objects are created, passed back, and then pulled into the Results object in the parent calling function. The proposed new logic avoids this.