Allow interdependence of components

tBuLi commented 5 years ago

Consider the following matrix equation:

from symfit import symbols, Parameter, MatrixSymbol, Inverse, Fit

N = symbols('N', integer=True)
M = MatrixSymbol('M', N, N)
I = MatrixSymbol('I', N, N)
y = MatrixSymbol('y', N, 1)
c = MatrixSymbol('c', N, 1)
a = Parameter('a')

model_dict = {c: - Inverse(I + M / a**2) * y}

The quantity I would like to fit is the scalar z = c.T * c = d**2, where d is some experimental value. Here is the issue: the value c contains a matrix inversion, which is expensive to compute, and in z this is performed twice. However, the usual solution of substituting W = Inverse(I + M / a**2) and calculating that separately is not available to us, because it is a function of the parameter a.

The most natural thing to do would be

model_dict = {
    W: Inverse(I + M / a**2),
    c: - W * y,
    z: c.T * c
}
model = Model(model_dict)
fit = Fit(model_dict, z=d**2, I=..., M=..., y=...)

This will not require any changes to Fit, because we already allow this for ODEModels. There we also sometimes have dummy components with no data, and they are ignored. In the model above, since no data is provided for the components c and W, they too will be ignored by Fit.

The problems are that Model's signature can not be generated, because of the double roles played by c and W as both dependent and independent variables, and when evaluating __call__ the components will have to be calculated in the correct order and substituted.

Analogously to ODEModel, I think c and W should be considered as dependent variables.

Then, the lambda functions per component should somehow be linked, preferably without destroying backwards compatibility to code which is calling numerical_components directly instead of through __call__.

pckroon commented 5 years ago

Is there always a unique way of doing the substitutions? How can we detect circular substitutions? Plus, if we want to get the computational advantage we need to be super clever in evaluating and caching. Because if dependent variables don't depend on parameters we should only calculate them once.

Step 1) Get values for all parameters (trivial, from call) Step 2) Get values for all independent variables (trivial, from data; possibly from evaluating parameter independent components of the model -- store those) Step 3) Get values for all intermediate dependent variables in the right order. Step 4) Get values for the final dependent variables of the model (trivial)

We know what variables we start with, and we should be able to figure out what variables we end with (everything for which we're given data). My current take would be to make a directed graph of variable dependencies (fairly easy; you wouldn't need to specify start and end points.) and try to linearize that (fairly hard I expect [1]). If that breaks your variables have a cyclic dependency. The downside is that we either 1) implement our own graph logic, or 2) get a dependency on networkx.

[1] https://networkx.github.io/documentation/stable/reference/algorithms/generated/networkx.algorithms.dag.topological_sort.html#networkx.algorithms.dag.topological_sort

pckroon commented 5 years ago

It might be worth to try an implementation that uses networkx, and if that works see how much functionality we'd need to reproduce.

tBuLi commented 5 years ago

Those are some good points to think about, especially the cyclic dependencies. But I'm not sure that those systems can even be built, intuitively I think that such a system would be singular?

tBuLi commented 5 years ago

After some more experimentation, I now have a working version for CallableModel's. However, for Model all hell breaks loose because the jacobian can not be determined since sympy does not know about these dependencies! That means that the above example will have to be changed to something like:

N = symbols('N', integer=True)
M = MatrixSymbol('M', N, N)
I = MatrixSymbol('I', N, N)
y = MatrixSymbol('y', N, 1)
c = Function('c')
W = Function('W')
a = Parameter('a')

model_dict = {
    W(I, M, a): Inverse(I + M / a**2),
    c(W, y): - W(I, M, a) * y,
    z: c(W, y).T * c(W, y)
}
model = Model(model_dict)
fit = Fit(model_dict, z=d**2, I=..., M=..., y=...)

No idea if this is valid syntax though, especially because I'm mixing MatrixExpr with normal Expr. Probably a lot of errors will follow.

pckroon commented 5 years ago

Those are some good points to think about, especially the cyclic dependencies. But I'm not sure that those systems can even be built, intuitively I think that such a system would be singular?

Could be, depends on what exact terminology you use. Some examples of what I would call cyclic systems: {a: a}, {a: 2*b, b: 2*a}, {a: f(b), b: g(a)}. Of course for linear systems we can detect when they're still valid (especially if I let you unleash some LinAlg), but we can't for the general non-linear case (which is also where the term 'singular' is no longer valid, I think).

Models will need a call_order attribute (or something) that describes in what order components should be evaluated; and if you assume models don't change this can be cached. __call__ and the eval_jacobian methods would use this to evaluate the components of the model in order, and update the arguments for the next component with the outcome.

Maybe SymPy can do this, I don't know. Do bear in mind it has to work for the general non-linear case. It seems to be quite good at telling us where a system of equations is 0, but evaluating seems less obvious.

See https://stackoverflow.com/questions/26788576/how-to-evaluate-system-of-equations-in-sympy and https://stackoverflow.com/questions/29244523/cant-get-sympy-to-numerically-evaluate-results-of-solving-a-system-of-equations

tBuLi commented 5 years ago

To your remark about cyclic systems, I'm still not sure how realistic those are. For ODEs a similar thing can happen, but in D(y, x): y the derivative and the variable are linearly independent. For those other cyclic systems I don't even know what that would mean. Or do you mean the previous evaluation, like

a_k = 2*b_{k - 1}
b_k = 2*a_{k - 1})

I was thinking to have a connectivity_mapping property on the models, which is a dict containing the model_dicts keys as keys, and a set of vars and params as values. These vars might themselves also feature as keys of the dict, if they are interdependent. Only vars which only appear in the keys are considered as dependent, and vars that only appear in the values are considered independent. In the example above:

>>> model_dict = {
>>>     W: Inverse(I + M / a**2),
>>>     c: - W * y,
>>>     z: c.T * c
>>> }
>>> model = Model(model_dict)
>>> print(model.connectivity_mapping)
{W: {I, M, a}, c: {W, y}, z: {c}}

This is essentially the connectivity matrix, but represented as a mapping which is more practical.

pckroon commented 5 years ago

Exactly my idea. A dictionary of neighbours is indeed a completely equivalent representation of a (directed) graph, and often the more convenient one. If we do this we can directly borrow the topological sorting code from networkx, which has a very similar data structure.

tBuLi commented 5 years ago

Fixed by #201

tBuLi / symfit

Allow interdependence of components #194