IDAES / idaes-pse

The IDAES Process Systems Engineering Framework
https://idaes-pse.readthedocs.io/

Scaling factor calculation order #263

Closed eslickj closed 1 year ago

eslickj commented 3 years ago

Generally, scaling factors are calculated by following the tree structure of an IDAES model: starting from the leaf blocks, calling their calculate scaling factors methods, and proceeding up through their parents to the root model block. This means roughly going from the state and reaction blocks, to the control volumes, then the unit models, then the flowsheet, and finally the top-level Pyomo model.
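
A minimal sketch of that leaf-to-root pattern, just to illustrate the calling order (this is a simplified illustration, not the actual IDAES traversal code):

import pyomo.environ as pyo

def call_scaling_leaf_to_root(blk):
    # Recurse into child blocks first (post-order traversal), so state and
    # reaction blocks run before control volumes, control volumes before
    # unit models, and unit models before the flowsheet.
    for child in blk.component_data_objects(pyo.Block, descend_into=False):
        call_scaling_leaf_to_root(child)
    # Then let this block compute its own scaling factors.
    if hasattr(blk, "calculate_scaling_factors"):
        blk.calculate_scaling_factors()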

In some cases one may want to calculate scaling factors that don't fit that order. For example, in a reactor unit, some information needed to calculate a scaling factor for the reaction extent may be available at the unit model level but not at the control volume level, even though reaction extent is a quantity defined in the control volume and may be used to scale other things there.

I have several ideas about how this could be supported:

1) Allow callbacks to be supplied to the control volume's calculate scaling factors method so that some calculations can be done before the standard ones, based on things in other model blocks that come later in the calculation order. In the example above, a callback could be supplied to calculate the extent of reaction scale factor before proceeding with the rest of the calculations (a rough sketch of this follows the list).
2) Subclass the control volume, for use in unit models that require out-of-order scaling factor calculations, and override the default calculate scaling factors method.
3) Flag scaling factors as being calculated so they can be differentiated from user-supplied scaling factors. This would allow the unit model to call the control volume's scaling factor calculation method again and recalculate the calculated scaling factors after doing some unit model level calculations.
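
A rough sketch of the callback idea in option 1, assuming a hypothetical callback argument on the control volume's scaling method (the callback hook and the rate_reaction_extent name are illustrative assumptions, not the current API):

import idaes.core.util.scaling as iscale

def extent_scaling_callback(control_volume):
    # Hypothetical callback run by the control volume before its standard
    # scaling calculations; the unit model supplies it because only the
    # unit model has the information needed to scale the reaction extent.
    for ext in control_volume.rate_reaction_extent.values():
        iscale.set_scaling_factor(ext, 1e3)

# The unit model would then pass the callback down, e.g. (hypothetical API):
# self.control_volume.calculate_scaling_factors(callback=extent_scaling_callback)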

Robbybp commented 3 years ago

My approach here is to define a "dependency DAG" of scaling factors based on the expressions used to calculate them, then calculate scaling factors in a topological order of that DAG. I think there is an underlying point here, though, that while IDAES's block-hierarchical structure is very convenient for abstraction and model construction, it is not always the most convenient for, e.g., initialization.
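
A minimal sketch of that topological ordering, assuming each rule-calculated scaling factor comes with the list of components whose scaling factors its rule depends on (the data structure and function names here are illustrative, not an existing API):

from collections import defaultdict, deque

def topological_scaling_order(deps):
    # deps maps each scaling factor (keyed by, e.g., component name) to the
    # factors its scaling rule needs; user-set factors never appear as keys.
    indegree = {c: 0 for c in deps}
    users = defaultdict(list)
    for c, parents in deps.items():
        for p in parents:
            if p in deps:  # only rule-calculated factors create edges
                indegree[c] += 1
                users[p].append(c)
    queue = deque(c for c, n in indegree.items() if n == 0)
    order = []
    while queue:
        c = queue.popleft()
        order.append(c)
        for child in users[c]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    if len(order) != len(deps):
        raise RuntimeError("circular scaling factor dependency")
    return order

Evaluating each scaling rule in the order returned guarantees that every factor a rule needs has already been computed, regardless of which block it lives on.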

Robbybp commented 3 years ago

Did not mean to close. I'll be thinking about this more as I try to implement my scaling strategies on the CLC unit model classes.

eslickj commented 3 years ago

I think one issue is that there isn't much prescribed structure to the scaling factor calculations, other than that they are calculated from the leaves of the block structure up. Since you can really do whatever you want, in general it may be hard to build a dependency graph. We could write a second Pyomo problem to calculate scaling factors. The scaling factors would be variables, and they would either be fixed or have an associated constraint. It wouldn't necessarily need to be a square problem. We could maybe have some auto-scaling objective function options. At the very least that would solve the dependency issues. It's not particularly hard to implement, and the resulting problem should be easy to solve. I suggested this before as a joke, but maybe it's not a bad idea.

The problem would have one variable for each variable, constraint, and named expression that we want to scale. User specified scaling factors are fixed. The current expressions for calculating scaling factors would just get converted to constraints.
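
For concreteness, a tiny illustration of that conversion (the names are made up): a rule that currently computes sf(c1) = 5*sf(x) would just become an equality constraint between two scaling-factor variables, with the user-supplied factor fixed.

import pyomo.environ as pyo

sm = pyo.ConcreteModel()
sm.sf_x = pyo.Var(initialize=1)   # scaling factor for some variable x
sm.sf_c1 = pyo.Var(initialize=1)  # scaling factor for some constraint c1
sm.sf_x.fix(10)                   # user-supplied factor is fixed
sm.sf_c1_calc = pyo.Constraint(expr=sm.sf_c1 == 5 * sm.sf_x)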

Andresj89 commented 3 years ago

@eslickj I think this is a fantastic idea; it would make our lives so much easier. What objective function(s) would you suggest?

eslickj commented 3 years ago

@Andresj89 we don't necessarily need objective functions. Reproducing what we do now gives you a square problem. I do have some ideas for objectives, but they may not be good. One idea is to take the Jacobian at whatever point we have and then optimize the scale factors to get its entries as close to 1 as possible. We could look into it more.
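
One concrete way to pose "entries as close to 1 as possible" is a least-squares fit of log-scale factors to the Jacobian magnitudes, similar in spirit to Curtis-Reid scaling. A rough sketch, assuming a dense Jacobian J is already available as a numpy array (this is not an existing IDAES routine):

import numpy as np

def jacobian_based_scale_factors(J, tol=1e-12):
    m, n = J.shape
    rows, cols = np.nonzero(np.abs(J) > tol)
    logj = np.log10(np.abs(J[rows, cols]))
    # Choose row/column log-scales r_i, c_j to minimize the sum over
    # nonzeros of (r_i + c_j + log10|J_ij|)^2, so that each scaled entry
    # 10^r_i * |J_ij| * 10^c_j is as close to 1 as possible.
    nnz = len(rows)
    A = np.zeros((nnz, m + n))
    A[np.arange(nnz), rows] = 1.0
    A[np.arange(nnz), m + cols] = 1.0
    sol, *_ = np.linalg.lstsq(A, -logj, rcond=None)
    row_sf = 10.0 ** sol[:m]  # constraint (row) scaling factors
    col_sf = 10.0 ** sol[m:]  # variable (column) scaling factors
    return row_sf, col_sf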

eslickj commented 3 years ago

@andrewlee94, @Robbybp, @jsiirola, and @carldlaird, here's a prototype of an all-new scaling system. It's pretty simple and supports out-of-order calculation. It opens the door to using optimization to calculate scaling factors. The code below is pretty much all it takes, with a little polish. This is both simpler and more flexible than what we have. Unless you really go crazy with the complexity of the constraints used to calculate scaling factors, I'm pretty sure the scaling factor problem will readily solve.

import idaes
import pytest
import types
import pyomo.environ as pyo
from pyomo.common.collections import ComponentMap

def mirror_model(m, mm, cmap):
    for b in m.component_objects(pyo.Block, descend_into=False):
        _c = pyo.Block(b.index_set())
        setattr(mm, b.getname(), _c)
        cmap[b] = _c
        for i, bs in b.items():
            cmap[bs] = _c[i]
            mirror_model(bs, _c[i], cmap)
    for c in m.component_objects(
        (pyo.Var, pyo.Expression, pyo.Constraint), descend_into=False):
        _c = pyo.Var(c.index_set(), initialize=1)
        setattr(mm, c.getname(), _c)
        cmap[c] = _c
        for i, cs in c.items():
            cmap[cs] = _c[i]

def calculate_scale_factors(m):
    mm = pyo.ConcreteModel()
    cmap = ComponentMap()
    mirror_model(m, mm, cmap=cmap)
    for b in m.component_data_objects(pyo.Block, descend_into=True):
        if hasattr(b, "calculate_scale_factors"):
            b.calculate_scale_factors(cmap)
        if hasattr(b, "fixed_scale_factors"):
            for c, v in b.fixed_scale_factors.items():
                cmap[c].fix(v)
    solver = pyo.SolverFactory("ipopt")
    solver.solve(mm, tee=True)
    return mm, cmap

def set_scale_factor(c, v):
    m = c.parent_block()
    if not hasattr(m, "fixed_scale_factors"):
        m.fixed_scale_factors = pyo.Suffix()
    m.fixed_scale_factors[c] = v

def get_scale_factor(c, cmap):
    m = c.parent_block()
    if hasattr(m, "fixed_scale_factors") and c in m.fixed_scale_factors:
        return pyo.value(cmap[c])  # fixed by the user
    elif not cmap[c].stale:  # calculated by the scaling problem solve
        return pyo.value(cmap[c])
    else:  # neither fixed nor calculated
        return None

if __name__ == "__main__":
    #
    # Test it
    #
    m = pyo.ConcreteModel()
    @m.Block([1,2,3,4])
    def b1(b, i):
        b.b1a = pyo.Block()
        b.b1a.b1aa = pyo.Block()
        b.b1a.x = pyo.Var()
        b.b1a.e = pyo.Expression(expr=b.b1a.x*2)
    m.b2 = pyo.Block()
    @m.b2.Constraint([1,2,3,4])
    def c1(b, i):
        return m.b1[i].b1a.x == 1

    # In IDAES this wouldn't be done by hand: calculate_scale_factors would
    # be a method on the block data class, defined where the scaling
    # calculations live now. For this standalone demo, attach one to m.b2.
    def b2csf(self, cmap):
        blk = cmap[self]
        @blk.Constraint(self.c1.index_set())
        def c1_scale_constraint(b, i):
            return cmap[self.c1[i]] == cmap[m.b1[i].b1a.x]*5
    m.b2.calculate_scale_factors = types.MethodType(b2csf, m.b2)

    # Set scale factors
    for b in m.b1.values():
        set_scale_factor(b.b1a.x, 10)

    mm, cmap = calculate_scale_factors(m)

    # get some scale factors
    # a set one
    assert get_scale_factor(m.b1[1].b1a.x, cmap) == pytest.approx(10, rel=1e-5)
    # a calculated one
    assert get_scale_factor(m.b2.c1[1], cmap) == pytest.approx(50, rel=1e-5)
    # a not set one
    assert get_scale_factor(m.b1[1].b1a.e, cmap) is None

    for c in [m.b1[1].b1a.x, m.b2.c1[1], m.b1[1].b1a.e]:
        print(f"Scale factor for {c} is {get_scale_factor(c, cmap)}")

andrewlee94 commented 3 years ago

I think I see where this is going, but I have a lot of questions still.

  1. What are the constraints on the mirror model, and how are they constructed? From the code above, I am guessing that each model would need to provide a set of rules/relationships to be used to construct these constraints via the calculate_scale_factors method.
  2. What exactly are we trying to solve for? Is it to get a meaningful residual in every constraint? The example above would tend to imply that we are building a simple system where scaling factors must be equated across the system, which leads to the next question.
  3. What are the degrees of freedom in the problem? I.e., how many initial scaling factors does the user need to provide? What if they want to provide more than necessary to satisfy the problem (i.e. they have good initial knowledge); how do we deal with too few degrees of freedom? This could be really important for NLP problems where we might want to pick scaling factors that do not necessarily "match", but provide better scaling over the expected range of solutions.
  4. If we go to "optimizing" scaling factors, what is the objective and how do we ensure a meaningful solution? For an NLP I could imagine cases where there could be an "optimal" solution with a great objective value which is only meaningful in the immediate area around the current (initial) solution.
eslickj commented 3 years ago

@andrewlee94:

  1. Yes from the calculate_scale_factors method, basically the expressions we have now just become constraints.
  2. If you want a square problem, all the scaling factors known by the user are fixed, and there are constraints available for some scaling factors. Although not in the example, we can add logic that says if a scale factor is in the fixed list, we don't add its constraint. The bad thing is that you can end up with the wrong number of degrees of freedom if you didn't fix a scale factor you need. We can change the cmap mapping in the example into a function that also keeps track of which scaling factors are required and tells you exactly what you need. Yes, the first goal of constraint scaling is to get a meaningful residual relative to the solver tolerance.
  3. I think we can do something like we do now. If a scaling factor is used in a scaling factor constraint, but not provided, we can fix it at 1 or some other default value, and warn the user. We can know what's needed because everything in the constraints has to go through cmap and we can use that to keep track.
  4. The first implementation would always result in a square problem. Using optimization is future work I guess.
eslickj commented 3 years ago

To follow up on my response to @andrewlee94's questions, hopefully a little more clearly: scaling factors are either fixed, have a constraint, or are not set. If a scaling factor is fixed and has a constraint, the constraint can be dropped. If a scaling factor that is not set appears in a constraint, we can fix it at some default value and warn. You end up with a square problem equivalent to what we do currently. It's just simpler to implement and doesn't depend on calculation order or the block structure.
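
A sketch of the "fix at a default and warn" step, written against the prototype above; it assumes the cmap lookup has been wrapped so we know which scaling factors appear in constraints and which ones got a defining constraint (that bookkeeping is an assumption, not part of the prototype):

import logging

_log = logging.getLogger(__name__)

def fix_missing_scale_factors(referenced, defined, default=1.0):
    # referenced: scaling factor variables in the mirror model that appear
    #   anywhere in a scaling constraint (e.g. a ComponentSet).
    # defined: the subset that has its own defining constraint.
    for sf in referenced:
        if sf in defined or sf.fixed:
            continue  # calculated by a constraint or fixed by the user
        sf.fix(default)
        _log.warning("No scaling factor for %s; fixing at default %s", sf.name, default)

With that in place the scaling problem is square by construction, matching what we do now.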

A future direction could be to use some as-yet-undetermined objective function to calculate scaling factors. We'll be set up for it, but I don't really know a good method yet.

andrewlee94 commented 3 years ago

One other thought that crossed my mind whilst thinking about this: how well would it work with very large (e.g. dynamic) models? Seeing as this creates a second model with matching variables and constraints, there could be memory issues when dealing with big problems.

eslickj commented 3 years ago

@andrewlee94, yeah, I was a little worried about that. The scaling factor problem is only solved once, and we can save the results. If it does turn out to be an issue, we could probably devise something that would let us solve it on a single time step. That does add complexity, though.

andrewlee94 commented 1 year ago

This is old enough that I think it can be closed now. Plus, we are starting to look at rebuilding scaling anyway.