Auto-breakdowns - Githubissues

pgkirsch commented 4 years ago

A feature I discussed with @mtwichan.

Anecdotally, a lot of GP(kit) constraints take the form y > x1 + x2 + x3 + ... + xn (e.g. mass rollup, cost rollup, energy rollup etc.). What's more, those x1, x2 etc. are often themselves constrained by rollups.

I often want to see/understand/plot/fetch the values of the constituent parts that make up y, i.e. the values of x1...n.

To do this in a semi-automatic way, I currently access the (known) constraint's varkeys, with something like:

mass_rollup = constraints[4]
masses = mass_rollup.right.varkeys
some_plotting_function(sol, masses)

However, given that the pattern is relatively prevalent and that it would be nice to be able to visualize any arbitrary rollup, I think it could be useful (and not too tricky) to have a native breakdown generator built into GPkit that takes a model (or a solution!) and a variable (in this case y) as an input and returns the rollup.

I think there are ~three~ four possible tiers of functionality, maybe implementation stepping stones.

Bronze

Returns a list of varkeys given an input varkey

# for the constraint y >= x1 + x2 + x3
m.breakdown_search("y") --> [x1, x2, x3]

Silver

Returns a dict of lists given an input varkey

# y >= x1 + x2 + x3
# x1 >= v1 + v2
# x2 >= v3
# x3 >= b*t**3 + a*t**2
m.breakdown_search("y") --> {x1: [v1, v2], x2: [v3], x3: x3}

Gold

Returns a nested dict of dicts for a SolutionArray given an input varkey. This is particularly appealing because I often want to dig into a solution without needing to re-generate the model object from which to derive the rollup (which breaks if the model has rollup has diverged from the rollup used to generate the solution).

# y >= x1 + x2 + x3
# x1 >= v1 + v2           # v2 has optimal value 5
# v1 >= u2 + u2           # u1 has optimal value 1, u2 has optimal value 2
# x2 >= v3 + v4           # v3 has optimal value 2, v4 has optimal value 1
# x3 >= b*t**3 + a*t**2   # x3 has optimal value 10
sol.breakdown("y") --> {x1: {v1: {u1: 1, u2: 2}, v2: 5},
                        x2: {v3: 2, v4: 1},
                        x3: 10}

Platinum

Returns a JSON string for creating an icicle plot (the D3 syntax is just one option -- I'm biased because it's the one I use)

# y >= x1 + x2 + x3
# x1 >= v1 + v2           # v2 has optimal value 5
# v1 >= u2 + u2           # u1 has optimal value 1, u2 has optimal value 2
# x2 >= v3 + v4           # v3 has optimal value 2, v4 has optimal value 1
# x3 >= b*t**3 + a*t**2   # x3 has optimal value 10
sol.breakdown("y", units="MUSD") -->
"""
{"name": "y",                                                                    
 "children": [                                                                   
  {"name": "x1",                                                                 
   "children": [                                                                 
    {"name": "v1":                                                               
     "children": [                                                               
      {"name": "u1",                                                             
       "size": 1,                                                                
       "label": "1 MUSD",                                                        
      },                                                                         
      {"name": "u2",                                                             
       "size": 2,                                                                
       "label": "0.002 MUSD",                                                    
      },                                                                         
    },                                                                           
    {"name": v2"                                                                 
     "size": 5,                                                                  
     "label": "5 MUSD",                                                          
    },                                                                           
   ]                                                                             
  },                                                                             
  {"name": x2",                                                                  
   "children": [                                                                 
    {"name": "v3",                                                               
     "size": 2,                                                                  
     "label": "2 MUSD",                                                          
    },                                                                           
    {"name": v4,                                                                 
     "size": 1,                                                                  
     "label": "1 MUSD",                                                          
    },                                                                           
   ],                                                                            
  },                                                                             
  {"name": "x3",                                                                 
   "size": 10,                                                                   
   "label": "0.01 MUSD",                                                         
  },                                                                             
 ]                                                                               
} # I'm sure I made some mistake but you get the point
"""

@bqpd what do you think about these various proposals?

bqpd commented 4 years ago

is a roll-up defined by being single variables (exponent of 1, coefficient of 1) only? that seems vital on the greater-than side but I could see it being relaxed on the less-than (though you don't in your last example).

bqpd commented 4 years ago

if those constraints aren't tight should a "slack" term be included, or is it implied?

bqpd commented 4 years ago

also, how about having it be an attribute of y, e.g. y.breakdowns -> [list of constraints that have "y >= ..."] as well as a method of sol as above

pgkirsch commented 4 years ago

Love these questions/ideas!

Yes, I was wondering if it could be generalized to all posynomials, but ended up not including because I wasn't sure if there would be undesirable consequences (e.g. if I have to filter through every constraint that has y >= I maybe lose some of the benefit of the feature?). On the other hand, it would be useful, for example, with mass margins when instead of totalmass >= subsystem1mass + subsystem2mass + .. it's totalmass >= subsystem1mass*massmargin1 + subsystem2mass*massmargin2 + ....

I like the idea of a slack term!

+1 for it being an attribute of Variable as well! As long as that isn't too burdensome.

bqpd commented 4 years ago

haha none of this is any trouble to program, my only concern is making it an elegant feature!

bqpd commented 4 years ago

So the question of "which y >= constraints are appropriate for this" has me wondering 1) whether it should be a new type of constraintset BreakdownConstraintSet([y >= ...]) 2a) if we should only allow one breakdownconstraint per variable (as implicit in the tree diagram above) and err if the user tries to create a second, or 2b) we should allow multiple but the solutions will be in a sol["constraints"]["breakdowns"] dictionary by {key: [breakdown constraints]}

1ozturkbe commented 4 years ago

These are all cool ideas. I did want to point out that this does create a question about dealing with potential cyclical behavior in breakdowns, potentially creating infinite trees. Simple example here that shows up a lot : W >= sum(Ws), W_struct >= f*W + other... Whatever your final architecture is will need to recognize this possibility.

pgkirsch commented 4 years ago

@bqpd while I definitely don't want to burden you, I was more really talking about model build times and performance! :)

1) My preference is to not create a different type of constraint set. As you know, I like that GPkit models can look pretty close to lists of pure engineering relationships with as little extra stuff (python wizardry) as necessary (I already have to tell people to ignore the Tight(), Loose() constructs). I think this is a big part of making large models seem less scary.

2) I would lean towards the former but are those mutually exclusive? The latter would be helpful even if it only allowed one breakdown.

@1ozturkbe Yes good point. This could be an argument for only allowing breakdowns of c=1, exp=1? Alternatively, something that checks whether the variable has already been... "broken down".

whoburg commented 4 years ago

related to #507, #522

bqpd commented 4 years ago

@pgkirsch

I hear that argument, but debugging why your breakdown isn't working will be so much easier with that, because we can raise specific warnings about why something isn't a breakdown constraint even though it's expected to be. I'm all for non-wizardry, but generally think accessibility to a novice user is more important than accessibility to a novice onlooker...
I think that's my favorite option; if the user tries to create a tree with ambiguity, we can print a warning and provide a syntax to choose which breakdowns to use.

@1ozturkbe @pgkirsch yeah I think a naive stop-on-cycle plus keying breakdowns by the greater-than monomial would serve the intended purpose; we know elements in the lower levels have to be less-than-or-equal (that is, f has to be less than 1), so cycling would break the intention of a tree vis. (the reason for keying with monomials would be to avoid situations like W**2 >= v*w, w >= f*W, where W might be smaller than w, which would lead to a rather confusing tree in terms of W but still a sensible one in terms of W**2)

pgkirsch commented 4 years ago

@bqpd Ok. I tried to construct an argument for why this would add lots of "cruft" to models (based on the shear number of constraints in my models that match this pattern), but I can see the upside and I like the parallel with Tight/Loose, so I'm ok with it if needs to be an explicit user-declared classification. My only request is that we keep the class name short (e.g. Breakdown([])). We originally started with TightConstraintSet and it became Tight for the same reason.

The only major downside of explicit classification I can think of is not having backwards compatibility with older versions of the model that have fundamentally the same constraints but won't be able to identify/visualize breakdowns.

pgkirsch commented 4 years ago

Also, just so I understand, the intent is for the nesting of breakdowns to be implicit, right?

class Aircraft(Model):
    def setup(self):
        constraints = [
            Breakdown([aircraftmass >= wingmass + fuselagemass + ..]),
            Breakdown([aircraftcost >= wingcost + ...]),
            Breakdown([wingmass >= ...]),
            Breakdown([wingcost >= ...]),
            cruisemach >= 0.8, # some unrelated constraint
            Breakdown([fuselagemass >= ...]),
            ...
        ]
        return constraints

^will identify the appropriate nested breakdowns for aircraftmass and aircraftcost, right? Or are we not talking about nested breakdowns yet?

pgkirsch commented 4 years ago

Making that example more realistic and hopefully making my point clear:

class Aircraft(Model):
    def setup(self):
        self.wing = Wing()
        constraints = [
            Breakdown([aircraftmass >= self.wing.mass + fuselagemass + ..]),
            Breakdown([aircraftcost >= self.wing.cost + ...]),
            cruisemach >= 0.8, # some unrelated constraint
            Breakdown([fuselagemass >= ...]),
            self.wing,
            ...
        ]
        return constraints

class Wing(Model):
    def setup(self):
        constraints = [
            Breakdown([mass >= ...]),
            Breakdown([cost >= ...]),
        ]
        return constraints

bqpd commented 4 years ago

yup, nesting would still be implicit! Breakdown is a great name.

pgkirsch commented 4 years ago

I don't know how helpful this is but I just re-discovered some code I put together for this a while ago, so I figured I'd dump here. Somewhat embarrassing that I forgot about this...

def get_breakdown(var, model, sol, units=None):                                     
    breakdown = {}                                                               
    breakdown["name"] = var.name                                                 
    breakdown["label"] = "{0:.0f}".format(sol(var).to(units).magnitude)          
    if not units:                                                                
        units = var.units                                                        
    if var in model.substitutions:                                                  
        breakdown["size"] = sol(var).to(units).magnitude                         
    else:                                                                        
        for cstr in model.flat():                                               
            if hasattr(cstr.left, "key"):                                        
                if cstr.left.key is var:                                         
                    if len(cstr.right.varkeys) == len(cstr.right.exps):          
                        breakdown["children"] = []                               
                        for varkey in cstr.right.varkeys:                        
                            breakdown["children"] += [get_breakdown(varkey,      
                                                                    model, sol,     
                                                                    units)]      
                    else:                                                        
                        breakdown["size"] = sol(var).to(units).magnitude         
    return breakdown

I use this to generate jsons compatible with the D3 icicle plots with a few lines:

costbreakdown = get_breakdown(m.aircraftcost.key, m, sol, units="kUSD")   
with open("aircraftcost.json", "w") as f:                                             
    json.dump(costbreakdown, f, indent=2, sort_keys=True)

bqpd commented 1 year ago

Closed by https://github.com/convexengineering/gpkit/pull/1566

convexengineering / gpkit

Auto-breakdowns #1512

Bronze

Silver

Gold

Platinum