Managing unknown components in writers and transformations

qtothec commented 6 years ago

In a discussion with @jsiirola about other issues, we stumbled across the following:

An open discussion exists for how writers and transformations should deal with unknown components. This includes things like components that users or extensions may declare (i.e. outside of pyomo.core).

For components that can be activated/deactivated, e.g. Disjunction, the writer should raise an error when encountering unknown components in a model that are still marked active. For components that cannot be deactivated, the approach is less clear. A Connector is simply a grouping of variables which makes little sense to deactivate (and as such is not an ActiveComponent). But that also means that it cannot be deactivated to indicate to the writer that it is safe to ignore the unrecognized component. In this case, it should always be safe for writers to silently ignore Connectors.
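A minimal sketch of the rule proposed above, assuming toy stand-in classes rather than Pyomo's real ones. `KNOWN_TYPES` and `check_components` are hypothetical names used only for illustration. It errors on unknown components that are still active, skips deactivated ones, and silently ignores anything that is not an ActiveComponent:

```python
# Toy stand-ins, NOT Pyomo's real classes: just enough structure to show the rule.
class Component:
    def __init__(self, name):
        self.name = name


class ActiveComponent(Component):
    def __init__(self, name, active=True):
        super().__init__(name)
        self.active = active


class Constraint(ActiveComponent):
    pass


class Disjunction(ActiveComponent):  # unknown to this toy writer
    pass


class Connector(Component):  # a grouping of variables; cannot be deactivated
    pass


# Assumption: the component types this hypothetical writer knows how to emit.
KNOWN_TYPES = (Constraint,)


def check_components(components):
    """Return the components to write; raise on unknown components that are active."""
    to_write = []
    for comp in components:
        if isinstance(comp, KNOWN_TYPES):
            if comp.active:
                to_write.append(comp)
        elif isinstance(comp, ActiveComponent):
            if comp.active:
                raise ValueError(
                    f"unknown active component {comp.name!r} "
                    f"of type {type(comp).__name__}"
                )
            # Deactivated: the user has signalled that it is safe to skip.
        # Otherwise: not an ActiveComponent (e.g. Connector), silently ignored.
    return to_write
```

With this sketch, a model holding a Constraint, a Connector, and a deactivated Disjunction passes the check, while the same model with the Disjunction left active raises.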

PS @jsiirola wrote part of the above.

jsiirola commented 6 years ago

Some questions:

1. Is it true that a writer should always be able to ignore Components that do not inherit from ActiveComponent? The rationale is that if the component affected how the solver behaves, then it should also be possible to remove the component from the model (deactivate it).
2. What about Var? It is not an ActiveComponent; however, its presence can impact the model. For example, should a Var that is not used in any Constraint or Objective, but has infeasible bounds, cause a model to become infeasible? (see #308).
   - If the answer to the above is "yes", do things change when the Var in question is on a deactivated Block?

ghackebeil commented 6 years ago

I like to think of Blocks as special umbrella components whose active status applies to all components under them in their storage tree (depending on where you "view" those components from). As an example, in the kernel, the default behavior for blocks when activating/deactivating is to not touch the active status of anything below them, only their own flag (this can be overridden with a keyword). I.e., the active status of a variable is inferred from its ancestor blocks and from which block in the tree you are asking for "active" variables on.

Good point about the infeasible bounds for variables that are not used anywhere (but are on an active block). Assuming we stick with a well-defined method of pre-collecting all potentially used variables (e.g., iterating over active Block ctypes), maybe we could add a check like this to the OptSolver base class, and, in the case of infeasibility, return a results object with that status. Unfortunately, making something like this robust would involve the use of tolerances, which is not good.
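To make the tolerance concern concrete, here is a rough sketch of such a pre-solve bounds check. It is not the OptSolver API: `has_infeasible_bounds`, `precheck`, the tuple layout, and the default `tol` are all assumptions made for illustration.

```python
def has_infeasible_bounds(lb, ub, tol=1e-8):
    """True if the lower bound exceeds the upper bound by more than tol.

    None stands for "no bound on this side" in this sketch.
    """
    if lb is None or ub is None:
        return False
    return lb > ub + tol


def precheck(variables, tol=1e-8):
    """Return the names of variables whose bounds are infeasible.

    variables is an iterable of (name, lb, ub) tuples, assumed to have been
    pre-collected from the active blocks of the model.
    """
    return [
        name
        for name, lb, ub in variables
        if has_infeasible_bounds(lb, ub, tol)
    ]
```

The result depends directly on `tol`: bounds that cross by less than the tolerance are accepted, so two callers using different tolerances can disagree on whether the same model is infeasible.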

I feel like this is motivation for sending even the unused "active" variables to the solver, but this conflicts with the design intention of the "stale" flag.

qtothec commented 6 years ago

The slight issue with only treating vars as active if they are in an active block is that it then becomes possible to write a constraint in an active block containing a variable in a deactivated block. How should the variable be treated in this case?

ghackebeil commented 6 years ago

I would say that is an invalid model and we should raise a helpful exception, but I know there is currently some disagreement on that front. Whatever the case, the scenario you describe violates my precondition of there being "a well defined method of pre-collecting all potentially used variables" (e.g., collecting variables on all active blocks).

blnicho commented 6 years ago

No, writers should not always ignore Components that do not inherit from ActiveComponent. I can illustrate this with two examples: ContinuousSet and DerivativeVar. These are essentially identical to the Pyomo Set and Var components; however, it doesn't make sense for the writers to treat them as such unless a transformation has been applied to the model. Instead of telling the writer to ignore them, we play games with the ctype.

I suppose it is true that if a model contains these components and they aren't used elsewhere in the model, then it is technically safe to ignore them. But it seems a lot more challenging to check whether a component is used (e.g. as an index or in a Constraint) than to check for its existence on the model.

jsiirola commented 6 years ago

Here is an argument for allowing variables on deactivated blocks. Consider a Disjunction containing two disjuncts m.A and m.B. There are two binary (Boolean) indicator variables (m.A.indicator_var and m.B.indicator_var). When we relax the Disjunction, we create two *new* (active) Blocks (for simplicity let's call them m.A1 and m.B1) and copy over all the active contents of m.A and m.B, replacing variables in any Constraints we encounter with disaggregated variables. We then deactivate the original Disjuncts so that the writers don't complain about an unrecognized ctype. The problem arises with the disaggregation variables: there are constraints on the disaggregated variables that reference the original indicator_vars (which are now sitting on deactivated blocks).

Now, we could move the indicator variables along with the Constraints when we construct the new Blocks, but that gets messy quickly. First, it is confusing to the user: when a solution comes back from the solver, variables that the user thought they declared are no longer there! Making matters worse, the user would have to understand the details on how the disjunctions were relaxed in order to be able to find the new location of the indicator variables. Finally, there is nothing (mathematically) that prevents a single Disjunct from participating in multiple disjunctions. If relaxing one disjunction causes the indicator var to move, what does the other disjunction have to do?

Another example is the application of a Basic Step. Here we are basically taking two disjunctions and forming a single disjunction that is the cross product of the two sets of disjuncts. For example, taking a disjunction of (a, b, c) and another of (d, e) and forming a single disjunction with new disjuncts (ad, bd, cd, ae, be, ce). Obviously, the new disjuncts will all get new indicator variables. However, we will still need to form constraints that link the new indicator variables to the original ones (as there may be additional model logic imposed on the indicator_vars), and the original variables need "homes", with the original disjunct being the obvious location.
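The cross product in the example above can be enumerated directly. This is only a sketch on plain strings, not Pyomo objects, and the `parents` and `children_of` names are hypothetical. It shows which new disjuncts each original disjunct appears in, which is the information any linking constraints would need:

```python
# Disjunct labels from the example above.
first = ["a", "b", "c"]
second = ["d", "e"]

# Cross product, ordered as in the text: ad, bd, cd, ae, be, ce.
new_disjuncts = [x + y for y in second for x in first]

# Map each new disjunct back to the pair of original disjuncts it combines.
parents = {x + y: (x, y) for y in second for x in first}

# For each original disjunct, list the new disjuncts that contain it.
children_of = {
    orig: [new for new, pair in parents.items() if orig in pair]
    for orig in first + second
}
```

Here `children_of["a"]` lists `ad` and `ae`, and `children_of["d"]` lists `ad`, `bd`, and `cd`. The exact form of the linking constraints is left open; the sketch only records which indicator variables would have to be related.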

jsiirola commented 6 years ago

@blnicho: with (1), I want to draw a distinction between an error triggered because an unknown ctype is present on the model, and an error triggered because an unknown ctype is encountered when the writer is processing a known Component (like a Constraint). In that light, I don't see a problem with either ContinuousSet or DerivativeVar: the writer shouldn't complain just because those components are on the model, but it should complain (loudly) if it encounters a DerivativeVar when it is processing a Constraint.

gseastream commented 6 years ago

This has been a topic we've brought up a lot, but with the GAMS ctype checking PR now merged, do we want to consider whether or not this is the accepted model of how writers should handle ctypes? Here's a summary of what the GAMS writer now does:

First of all, there are now two importable sets of ctypes in pyomo.repn.util:

valid_expr_ctypes_minlp = {Var, Param, Expression, Objective}
valid_active_ctypes_minlp = {Block, Constraint, Objective, Suffix}

- The GAMS writer will check for all active ctypes on the model first, then it will error if any of those that are subclasses of ActiveComponent are not part of the valid_active_ctypes_minlp set above.
- The writer will check the ctype of every component it finds in active constraints/the objective, and will error if those are not part of the valid_expr_ctypes_minlp set above.
- Additionally, the writer will make sure that every component appearing in an active constraint or expression is on the same model tree as the model passed to write.
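The three checks can be sketched roughly as follows. This is an illustration with toy stand-in classes, not the GAMS writer's actual code: the class definitions, the `uses` and `model` attributes, and `check_model` are all hypothetical, and only the contents of the two sets follow the summary above.

```python
# Toy stand-ins for illustration only; these are NOT Pyomo's classes.
class Component:
    def __init__(self, name, model):
        self.name, self.model = name, model


class ActiveComponent(Component):
    active = True


class Var(Component): pass
class Param(Component): pass
class Expression(Component): pass
class Block(ActiveComponent): pass
class Suffix(ActiveComponent): pass
class Objective(ActiveComponent): pass
class Disjunction(ActiveComponent): pass  # not supported by this toy writer


class Constraint(ActiveComponent):
    def __init__(self, name, model, uses=()):
        super().__init__(name, model)
        self.uses = list(uses)  # components appearing in the expression


valid_expr_ctypes_minlp = {Var, Param, Expression, Objective}
valid_active_ctypes_minlp = {Block, Constraint, Objective, Suffix}


def check_model(model, components):
    """Raise if the toy model violates any of the three checks."""
    # Check 1: every active ActiveComponent must have a supported ctype.
    for c in components:
        if isinstance(c, ActiveComponent) and c.active:
            if type(c) not in valid_active_ctypes_minlp:
                raise RuntimeError(f"unallowable active component {c.name!r}")
    for c in components:
        if isinstance(c, Constraint) and c.active:
            for used in c.uses:
                # Check 2: only supported ctypes may appear in expressions.
                if type(used) not in valid_expr_ctypes_minlp:
                    raise RuntimeError(f"unallowable component {used.name!r} in {c.name!r}")
                # Check 3: everything used must live on the model being written.
                if used.model != model:
                    raise RuntimeError(f"{used.name!r} is not on model {model!r}")
```

Note that a non-active component such as a Var is never rejected merely for being on the model; it is only inspected when it shows up inside an active constraint.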

qtothec commented 6 years ago

What are the competing views on ctype handling that currently exist?

gseastream commented 6 years ago

I don't know where the others currently stand, but I know we've talked about the principles of components being activatable, or what to do about variables on deactivated blocks, if we should collect all variables before walking the constraints, what to do about unused variables with infeasible bounds, etc.

@jsiirola @ghackebeil @carldlaird @michaelbynum

mrmundt commented 6 months ago

@jsiirola - With the recent rewrites of the writers, was this addressed / what more needs to be done?

jsiirola commented 6 months ago

As of 6.7.0, this has been addressed in the LP and NL writers (following from #1032). We still need to update the BAR, GAMS, and APPSI writers for consistency.

Pyomo / pyomo

Managing unknown components in writers and transformations #309