opencobra / optlang

optlang - sympy based mathematical programming language
http://optlang.readthedocs.org/
Apache License 2.0
253 stars 52 forks source link

symengine causes a linear constraint to be recognized as a quadratic one, causing an error #169

Open Midnighter opened 6 years ago

Midnighter commented 6 years ago

For future reference: With symengine installed the constraints added by the following function are evaluated to be quadratic, raising an error, without symengine they work perfectly. A sample error message:

NotImplementedError: Quadratic constraints (like forward_pos_2DHGLCK: 0.0 - dist_2DHGLCK - 100000.0*(1 - direction_2DHGLCK) - (1.0*2DHGLCK - 1.0*2DHGLCK_reverse_2bd8f) <= -0.72) are not supported yet.
def adjust_fluxes2model(model, observations, uncertainties=None, linear=True,
                        big_m=1E05):
    """
    Minimize the distance to observed fluxes accounting for multiple directions.

    If your observations include uncertainties the objective function, i.e.,
    minimizing the distance to the observations, is weighted by the inverse
    of the uncertainties.

    Parameters
    ----------
    model : cobra.Model
        The metabolic model
    observations : pandas.Series
        The observed fluxes. The index should contain reaction identifiers.
    uncertainties : pandas.Series, optional
        The uncertainties of the individual measurements, e.g., standard
        error. The index of the series should correspond at least partially
        to the ``observations``.
    linear : bool, optional
        Whether to minimize the linear or quadratic distance.
    big_m : float, optional
        Big M method value. This is used to resolve greater than inequalities
        and should be an adequately large number.

    Returns
    -------
    cobra.Solution

    """
    flux_col = "flux"
    weight_col = "weight"
    if uncertainties is None:
        data = observations.to_frame().join(Series([], name=weight_col))
    else:
        uncertainties.name = weight_col
        data = observations.to_frame().join(uncertainties)
    data.columns = [flux_col, weight_col]
    # replace missing and zero values
    data.loc[data[weight_col].isnull() | isinf(data[weight_col]) |
             (data[weight_col] == 0), weight_col] = 1
    prob = model.problem
    to_add = list()
    new_obj = list()
    with model:
        model.objective = prob.Objective(prob.Zero, direction="min", sloppy=True)
        for rxn_id, flux, weight in data[[flux_col, weight_col]].itertuples():
            try:
                rxn = model.reactions.get_by_id(rxn_id)
            except KeyError:
                LOGGER.warning("'%s' not found in the model. Ignored.", rxn_id)
                continue
            direction = prob.Variable("direction_" + rxn_id, type="binary")
            dist = prob.Variable("dist_" + rxn_id)
            forward_pos = prob.Constraint(
                flux - rxn.flux_expression - big_m * (1 - direction) - dist,
                ub=0, name="forward_pos_" + rxn_id)
            forward_neg = prob.Constraint(
                rxn.flux_expression - flux - big_m * (1 - direction) - dist,
                ub=0, name="forward_neg_" + rxn_id)
            reverse_pos = prob.Constraint(
                (-flux) - rxn.flux_expression - big_m * direction - dist,
                ub=0, name="reverse_pos_" + rxn_id)
            reverse_neg = prob.Constraint(
                rxn.flux_expression - (-flux) - big_m * direction - dist,
                ub=0, name="reverse_neg_" + rxn_id)
            if linear:
                new_obj.append((dist, 1 / weight))
            else:
                raise NotImplementedError("Need to set quadratic coefficients.")
#                new_obj.append((dist * dist, 1 / weight))
            to_add.extend([direction, dist, forward_pos, forward_neg,
                           reverse_pos, reverse_neg])
        model.add_cons_vars(to_add)
        model.objective.set_linear_coefficients({v: f for v, f in new_obj})
        solution = model.optimize(raise_error=True)
    return solution
KristianJensen commented 6 years ago

Sympy does automatic expansions for certain types of nested expressions, e.g. coef * (var1 - var2). Symengine does not appear to do this at all, instead expressions are structured exactly as they are formulated. The problem can be easily fixed by uncommenting line 466 in optlang.interface. However this will have a negative impact on general performance. At the very least it should only be done when optlang._USING_SYMENGINE is true - since symengine is so fast it may actually not have a huge effect there.

@Midnighter If you have some cobrapy benchmarks, can you maybe check how this would impact performance with symengine?

Midnighter commented 6 years ago

I will gather some stats.