tBuLi / symfit

Symbolic Fitting; fitting as it should be.
http://symfit.readthedocs.org
MIT License
235 stars 17 forks source link

Wanted: examples of constrained fitting #78

Open tBuLi opened 8 years ago

tBuLi commented 8 years ago

Hi @Pitje06, @Jhsmit, @jbarnoud, and others,

I am looking for examples of fitting problems subject to constraints to write tests and examples with for the Constrained fitting milestone. If you know one from your daily experience or can think of one, can you post it here? As a simple example, I could imagine the following fit:

model = {y: a * x + b}
constraints = {Greater(a, b)}

fit = Fit(model, x=xdata, y=ydata, constraints=constraints)

But I would like some more realistic and challenging examples to really push the boundaries of what symfit can handle.

tBuLi commented 8 years ago

One example I have is from reaction kinetics. Suppose we have an equilibrium reaction between compound A and B with rate constants k_forward and k_reverse. Then these can be expressed in terms of activation energy through Arrhenius. But for the activation energy of the reaction, we might know that Ea_reverse > Ea_forward.

Therefore:

Ea_r, Ea_f, A_r, A_f = parameters('Ea_r, Ea_f, A_r, A_f')
R, T = 8.3145, 298

k_f = A_f * exp(- Ea_f / R * T)
k_r = A_r * exp(- Ea_r / R * T)

model_dict = {
    D(A, t): - k_f * A + k_r * B,
    D(B, t): k_f * A - k_r * B
}
constraints = {
    GreaterEq(Ea_r, Ea_f)
}

fit = Fit(model, ... , constraints=constraints)

This can then be extended to multiple equilibria.

pckroon commented 8 years ago

I only have one example so far (and you're not going to like it)

def boltzmann(T, V, *args, **kwargs):
    kbT = kb * T
    return exp(-V(*args, **kwargs)/kbT)

def periodic(x, x0, k, n):
    return k*(1+cos(n*x - x0))

distribution = functools.partial(boltzmann, T=298, V=periodic)
x = Variable()
y = Variable()
x0 = Parameter()
k = Parameter()
n = Parameter()
constraints = {Integer(n)}
model = Model({y: distribution(x=x, x0=x0, k=k, n=n)})
fit = Fit(model, ..., constraints=constraints)
tBuLi commented 8 years ago

Good one @Pitje06. You are right, I don't like it ;). Integer constraints in fitting are a pain. Do you have any ideas how to solve this? Any algorithms?

Naively I would do a float fit and then check wether floor or ceiling results in a smaller chi^2 and return that, though I can see some problems with this.

To adhere to standard nomenclature though, this is not a constraint but an assumption. And sympy already has an assumption syntax that I would like to use for this. In pure sympy:

n = Symbol('n', integer=True)
k, m = symbols('k m', integer=True)

print(n.is_integer) # Prints True
print(n._assumptions) # Contains all assumptions

In symfit we would have the equivalent syntax for Parameter and Variable.

So I think that assumptions are a separate problem, unless they can be expressed as a constraining function, which in the case of integers is not obvious to me. I can create a separate milestone for (integer) assumptions. General assumptions are another step entirely ;).

pckroon commented 8 years ago

Nope, no clue how to do this :) You are right that this might be more of an assumption than a constraint, but I'm not sure that's more than semantics. On a related topic, does symfit work in the complex domain? Is there any way to (assume|constrain) the parameters to the (Real|Complex|Integer) domain?

Jhsmit commented 8 years ago

I do a lot of double exponential fitting, in this case a use could be to constrain the decay constanstants to assure the first one is always the largest. However, its not a great example of pushing any boundaries. Also, in my experience 99/100 cases the first is always the largest.

a1, t1, a2, t2 = parameters('a1 t1 a2 t2 c')
t = Variable()

m = a1*exp(-t/t1) + a2*exp(-t/t2) + c

}
constraints = {
    Greater(t1, t2)
    # or Greater(a1, a2)
}

fit = Fit(model, ... , constraints=constraints)
tBuLi commented 8 years ago

Thanks @Jhsmit.

@Pitje06: Currently only reals are supported, since typically your fit will be performed using least_sq, which assumes real values. Thinking about it now, I think this can be extended by changing (f - y)^2 to |f - y|^2, where |f - y| denotes the magnitude of the complex residual. Should be plug and play.

I also saw the other day that scipy has complex_ode, which is good to have as well since those are quite common in physics. However, I'm boycotting this particular object because it was modeled on ode, which was designed by a madman. It's fine for a specific problem, but writing a general wrapper for it is literally impossible, which is why I switched to odeint. I think that in the future an ODE specific package needs to wrapped to make this feature way more powerful.

pckroon commented 8 years ago

Slightly off-topic: If you need an excuse reason not to use complex_ode: it has known, unfixed bugs. https://github.com/scipy/scipy/issues/1976 https://stackoverflow.com/questions/34577870/using-scipy-integrate-complex-ode-instead-of-scipy-integrate-ode

I'll make a separate issue concerning the parameter domains.