opencobra / cobrapy

COBRApy is a package for constraint-based modeling of metabolic networks.
http://opencobra.github.io/cobrapy/
GNU General Public License v2.0

try lowering the integer_threshold #941

Closed PkiwiBird closed 3 years ago

PkiwiBird commented 4 years ago

I'm using the gapfill function after applying FASTCORE. On some models it gives the error:

'failed to validate gapfilled model, 
'try lowering the integer_threshold'

I opened the gapfilling script and changed the value to values below 1e-6, but I got the same error even after changing it to 1e-100. So I then removed the following from the script:

if not self.validate(solution):
                raise RuntimeError('failed to validate gapfilled model, '
                                   'try lowering the integer_threshold')

It worked fine, but after adding the reactions suggested by gapfill to the model, the model was still infeasible: model.optimize().status was 'infeasible'. This is weird because the complete universal model is feasible. Attached are the complete model and the incomplete model before gapfilling. Could you please help?

completeModel.zip IncompleteModel.zip

I'm calling gapfill with demand_reactions=False: essential = gapfill(IncompleteModel, completeModel, demand_reactions=False)[0]

Midnighter commented 4 years ago

I opened the gapfilling script and changed the value to values below 1e-6. But got same error even after changing to 1e-100.

Can you describe how exactly you did that? Typically, solvers have a limit at 10^-9 or 10^-10. Otherwise @cdiener might be able to provide more insights.

cdiener commented 4 years ago

Hmm, hard to tell without seeing more of what was done. Could be a number of things. The error means that it can't gapfill (no addition of reactions allows flux through the objective).

Did you impose a lower bound on your objective or any other flux that prohibits it from being zero? The infeasibility error suggests you may have, and that may be a problem for gapfilling as well. Does the universal model have a feasible solution when you block demand reactions?
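
For that last check, a minimal sketch assuming completeModel is the universal model from the original post: block every demand reaction and see whether the model still optimizes.

with completeModel as m:
    # Close all demand reactions temporarily; the context manager restores the bounds on exit.
    for rxn in m.demands:
        rxn.bounds = (0, 0)
    # A nan result means the model is no longer feasible without the demands.
    print(m.slim_optimize())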

gregmedlock commented 4 years ago

Also check for duplicate reaction IDs across the universal and model—duplicate IDs for reactions with different bounds can lead to this issue (although that should have raised warnings along the way).
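
A quick sketch of such a check, using the IncompleteModel and completeModel names from the original post:

shared_ids = set(r.id for r in IncompleteModel.reactions) & set(r.id for r in completeModel.reactions)
# Reactions present in both models but with different bounds are the suspicious ones.
mismatched = [rid for rid in shared_ids
              if IncompleteModel.reactions.get_by_id(rid).bounds != completeModel.reactions.get_by_id(rid).bounds]
print(len(shared_ids), "shared reaction ids;", len(mismatched), "with differing bounds")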

PkiwiBird commented 4 years ago

I opened the gapfilling script and changed the value to values below 1e-6. But got same error even after changing to 1e-100.

Can you describe how exactly you did that? Typically, solvers have a limit at 10^-9 or 10^-10. Otherwise @cdiener might be able to provide more insights.

I created a copy of the gapfilling script and replaced 1e-6 with 1e-100 in this line:

def __init__(self, model, universal=None, lower_bound=0.05,
             penalties=None, exchange_reactions=False,
             demand_reactions=True, integer_threshold=1e-100)

I was using CPLEX, although here I provided the models after doing model.solver = "glpk", because in a previous issue Christian Diener said he couldn't unpickle the model unless the solver was glpk.

PkiwiBird commented 4 years ago

Hmm, hard to tell without seeing more of what was done. Could be a number of things. The error means that it can't gapfill (no addition of reactions allows flux through the objective).

Did you impose a lower bound on your objective or any other flux that prohibits it from being zero? The infeasibility error suggests you may have, and that may be a problem for gapfilling as well. Does the universal model have a feasible solution when you block demand reactions?

The flux constraints are as shown in the pickle models I sent. I set all "EX_..." reactions to go out by default, with bounds (0, 1000), except for the bounds of the experimental medium, which I set as in the Excel file added below. With these constraints the complete model was optimal when doing model.optimize().status; the incomplete model (to which I applied FASTCORE) gave the problems explained above. RPMI_bounds_Recon3D3.01.xlsx

PkiwiBird commented 4 years ago

Also check for duplicate reaction IDs across the universal and model—duplicate IDs for reactions with different bounds can lead to this issue (although that should have raised warnings along the way).

I don't think that's the issue, because even if I try to add a reaction that already exists it says: 'Ignoring reaction .... since it already exists'

PkiwiBird commented 4 years ago

Do you have any suggestions, or do you need me to provide more info? If so, please tell me.

Midnighter commented 4 years ago

I haven't had a chance yet to try out your models.

gregmedlock commented 4 years ago

I'm giving the models a look now, but gap filling has been running for ~3 hours and hasn't completed yet. I don't think this issue can be debugged without having a fully reproducible example. Have you verified that after a solution is found, the model fails to validate for these exact models? In your initial post, you said this occurs for some models; this may be a wild goose chase (requiring >3 hours of effort per goose chase) if the models you shared don't actually have the problem. Can you provide a FASTCORE-constrained model that has a larger fraction of the original reactions (the current incomplete model has 1/2 the reactions)? This would make debugging much easier.

However, I'm not sure I see the purpose of your analysis--you've applied FASTCORE to reduce the model, and are now expanding the model to enable flux through the biomass objective function. I may be misunderstanding the problem formulations in FASTCORE, but it seems like you should be able to constrain flux through your objective function to be above some minimal value at that point, ensuring your FASTCORE-constrained model can produce biomass. At the very least, including the biomass reaction in your input reaction list for FASTCORE would guarantee that the constrained network can produce biomass.
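
As a minimal illustration of constraining the objective, assuming the biomass reaction in the FASTCORE-derived model is called "biomass_reaction" (a hypothetical id):

# Require a minimal biomass flux in the reduced model before gap filling.
IncompleteModel.reactions.get_by_id("biomass_reaction").lower_bound = 0.05
# slim_optimize() returns nan if the constrained model is infeasible.
print(IncompleteModel.slim_optimize())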

Midnighter commented 4 years ago

Thank you for digging into it Greg.

PkiwiBird commented 4 years ago

I'm giving the models a look now, but gap filling has been running for ~3 hours and hasn't completed yet. [...] Can you provide a FASTCORE-constrained model that has a larger fraction of the original reactions (the current incomplete model has 1/2 the reactions)? This would make debugging much easier.

However, I'm not sure I see the purpose of your analysis--you've applied FASTCORE to reduce the model, and are now expanding the model to enable flux through the biomass objective function. [...] At the very least, including the biomass reaction in your input reaction list for FASTCORE would guarantee that the constrained network can produce biomass.

Thank you for your time. The model I sent was the one that failed. However, I sent you the one stored on my laptop (which uses a different solver) rather than the one I'm working with on the server. Below is the one on the server: ‘model’ is the complete model, and ‘ModelFASTCORE’ is a dictionary with 4 models that had FASTCORE applied.

The ‘biomass_reaction’ id shows that all models have a biomass reaction with bounds (0.0001, 1000) before gapfilling. I am sorry, I don't understand your point.

I'm using gapfill(model=ModelFASTCORE[sample], universal=model, demand_reactions=False), and it at least seems to run faster with a higher integer_threshold like 1e-9...

models.zip

PkiwiBird commented 4 years ago

It is also happening when doing gapfilling for metabolic tasks. If integer_threshold is 1e-9 it gives the error 'try lowering the integer_threshold', and if it's 1e-12 it gives this error: cobra.exceptions.OptimizationError: gapfilling optimization failed (check_original_solver_status)

I leave here a reproducible example: ubuntu16.04_python3.5.2_cplex12.zip

-> test_1104.py is the script with the code for gapfill
-> Recon3D_3.01DrainAdapt is a pickle of a universal/complete model
-> SampleModelsGapFilled is a pickle of a dictionary of two models; the model that gives the error has the key 'GSM553727'
-> 56metTasks4Recon3D3.01.xlsx is a file with metabolic tasks to test
-> gapfilling.py is the cobrapy gapfilling script but with the integer_threshold value altered (1e-12)

ubuntu16.04_python3.5.2_cplex12.8 contains the files used to create the docker image (in case you want it):

-> Dockerfile contains the commands to install the programs: cplex12.8, python3.5.2, etc.
-> myresponse.properties is a file with the properties needed to install cplex
-> 'cplex_studio128.linux-x86-64.bin' is missing because it's the academic version of cplex (not allowed to share)

Please help!

PkiwiBird commented 4 years ago

I encountered another similar problem. This time:

  1. I adapted the universal model for the medium composition: I closed the entry of metabolites into the universal model except those of the medium. The model produced biomass.
  2. I applied FASTCORE to the universal model (not adapted for the medium composition) and then adapted that reconstructed model to the medium composition. The reconstructed model did not produce biomass.
  3. I gapfilled reconstructed model 2 using model 1 as the universal, to get a reconstructed model able to produce biomass: gapfill(model2, model1, demand_reactions=False). It gave the error "RuntimeError: failed to validate gapfilled model, try lowering the integer_threshold". I tried lowering the threshold to 1e-9, 1e-12 and 0 and got the same error. Below are the models. Any suggestion is welcome. models.zip

gregmedlock commented 4 years ago

Hi @PkiwiBird I'm still tracking down the problem. To simplify things, I am using a smaller universal model which only contains reactions that are missing from model2 and are active in a pFBA solution for model1. I am able to get feasible gap filling solutions when I do this, but the model does not pass the validate() function. One problem is that there are gap-filled reactions in the solution that have non-zero flux values yet have a primal value of 0.0 in the solution (and therefore aren't returned in the reaction list used in validate()). Decreasing the solver tolerances directly, rather than using the integer_threshold argument (which only adjusts integrality), corrects this and the gap fill solution passes the validate() function. For example, when I run the following after removing reactions from model1 that aren't active in a pFBA solution:

gapfiller = cobra.flux_analysis.gapfilling.GapFiller(model2, universal=model1, demand_reactions=False, integer_threshold=1e-10)
gapfiller.model.solver.configuration.tolerances.feasibility = 1e-10
gapfiller.model.solver.configuration.tolerances.integrality = 1e-10
gapfiller.model.solver.configuration.tolerances.optimality = 1e-10
gapfiller.fill()

This returns the following solution:

[[<Reaction HMR_4122 at 0x144596080>, <Reaction HMR_4354 at 0x144596128>, <Reaction HMR_4038 at 0x1445961d0>, <Reaction HMR_4040 at 0x144596208>, <Reaction HMR_4085 at 0x144596240>, <Reaction HMR_4406 at 0x144596278>, <Reaction HMR_4453 at 0x1445962b0>, <Reaction HMR_4617 at 0x1445962e8>, <Reaction HMR_4695 at 0x144596390>, <Reaction HMR_4799 at 0x1445963c8>, <Reaction HMR_4802 at 0x144596400>, <Reaction HMR_4804 at 0x144596438>, <Reaction HMR_4808 at 0x144596470>, <Reaction HMR_4810 at 0x1445964a8>, <Reaction HMR_4812 at 0x1445964e0>, <Reaction HMR_4814 at 0x144596518>, <Reaction HMR_4034 at 0x144596550>, <Reaction HMR_4036 at 0x144596588>, <Reaction HMR_4050 at 0x1445965c0>, <Reaction HMR_4608 at 0x1445965f8>, <Reaction HMR_4644 at 0x144596630>, <Reaction HMR_4026 at 0x144596668>, <Reaction HMR_4806 at 0x1445966a0>, <Reaction HMR_6974 at 0x1445967f0>, <Reaction HMR_6982 at 0x144596828>, <Reaction HMR_6725 at 0x144596860>, <Reaction HMR_6768 at 0x144596898>, <Reaction HMR_6782 at 0x1445968d0>, <Reaction HMR_4324 at 0x144596908>, <Reaction HMR_4326 at 0x144596940>, <Reaction HMR_4454 at 0x144596a20>, <Reaction HMR_2365 at 0x144596ac8>, <Reaction HMR_2558 at 0x144596b00>, <Reaction HMR_2559 at 0x144596b38>, <Reaction HMR_0582 at 0x144596cf8>, <Reaction HMR_0584 at 0x144596d30>, <Reaction HMR_7663 at 0x144596eb8>, <Reaction HMR_4160 at 0x144596ef0>, <Reaction HMR_4163 at 0x144596f28>, <Reaction HMR_4165 at 0x144596f60>, <Reaction HMR_4746 at 0x1445ac048>, <Reaction HMR_4748 at 0x1445ac080>, <Reaction HMR_4750 at 0x1445ac0b8>, <Reaction HMR_6659 at 0x1445ac128>, <Reaction HMR_6661 at 0x1445ac198>, <Reaction HMR_8697 at 0x1445ac1d0>, <Reaction HMR_6506 at 0x1445ac240>, <Reaction HMR_4071 at 0x1445ac278>, <Reaction HMR_5363 at 0x1445ac358>, <Reaction HMR_4062 at 0x1445ac390>, <Reaction HMR_4993 at 0x1445ac400>, <Reaction HMR_8696 at 0x1445ac588>, <Reaction HMR_3946 at 0x1445ac668>, <Reaction HMR_9144 at 0x1445ac908>, <Reaction 4HBZCOAFm at 0x1445ac9b0>, <Reaction 4HBZFm at 0x1445ac9e8>, <Reaction COUCOAFm at 0x1445acb00>, <Reaction T4HCINNMFM at 0x1445acc18>, <Reaction THMMPt4 at 0x1445acc50>, <Reaction r0915 at 0x1445acc88>, <Reaction RE3490C at 0x1445acd68>, <Reaction 34HPLtm at 0x145993080>, <Reaction PGLYCtm at 0x1459930b8>]]

I'm not 100% sure this will fix everything for you because I cannot get gapfiller.fill() to complete when I use your complete models. I recommend you use purely continuous gap filling rather than MIP.

@cdiener @Midnighter Any clue as to why primal values would be getting rounded to zero with default tolerances except for integrality? I am going in circles looking through optlang but cannot find the culprit, and this is hard to produce a more minimal example for... but I will share the reduced universal model I am using in a minute.
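
For illustration, one continuous alternative is a flux-minimization formulation: add the missing universal reactions, require a minimal growth rate, minimize the total flux carried by the added reactions, and keep the ones that stay active. This is only a sketch, reusing the model1/model2 names from above and assuming a hypothetical biomass reaction id:

from optlang.symbolics import Zero

def continuous_gapfill(draft, universal, biomass_id="biomass_reaction", min_growth=0.05):
    merged = draft.copy()
    # Add every universal reaction that the draft is missing as a candidate.
    candidates = [r.copy() for r in universal.reactions if r.id not in merged.reactions]
    merged.add_reactions(candidates)
    # Require some growth rather than maximizing it.
    merged.reactions.get_by_id(biomass_id).lower_bound = min_growth
    # Minimize the total flux through the candidate reactions -- an LP, not a MIP.
    merged.objective = merged.problem.Objective(Zero, direction="min", sloppy=True)
    merged.objective.set_linear_coefficients(
        {v: 1.0 for r in candidates for v in (r.forward_variable, r.reverse_variable)}
    )
    solution = merged.optimize()
    return [r.id for r in candidates if abs(solution.fluxes[r.id]) > merged.tolerance]

# e.g. reactions_to_add = continuous_gapfill(model2, model1)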

cdiener commented 4 years ago

Hi @gregmedlock, could you give an example for

One problem is that there are gap-filled reactions in the solution that have non-zero flux values yet have a primal value of 0.0 in the solution

...because I don't fully get that yet.

Also, for GLPK, integrality and optimality don't do anything anymore, since GLPK removed those from its interface and now uses some other convergence measures which differ between solution methods (LP, IP, MIP).

gregmedlock commented 4 years ago

Yes, will post the example early tomorrow. To clarify, there is flux through the reaction in the solution, but the indicator variable for the reaction has a primal of 0.0. I will see if I can use both GLPK and CPLEX in the example.

gregmedlock commented 4 years ago

I've attached a Jupyter notebook demonstrating the issue with indicator variables having 0.0 primal values even when they have flux in the gapfill solution. The only thing I could do to reproducibly resolve the problem was enabling presolve in GLPK (through model.solver.configuration._iocp.presolve = 1).

I guess this is a scaling issue that there won't be a generalizable solution to, but I would suggest the following changes:

1) Update GapFiller to use tolerances.feasibility rather than tolerances.integrality when GLPK is being used.
2) Make the error message for GapFiller.fill() slightly more verbose when validate() fails, and/or:
3) Make presolve=True the default. I can't think of any problems this would create, but admit I have blind spots here.
4) Prioritize some documentation describing gapfilling debugging strategies in doc revamps.

Thoughts? I'd be happy to tackle this, as I've been putting off some gap filling contributions for quite some time that would move flux-minimization-based gapfilling from Medusa into COBRApy (see _continuous_iterative_binary_gapfill in https://github.com/opencobra/Medusa/blob/development/medusa/reconstruct/expand.py if interested)

@PkiwiBird you can give the presolve approach a try, but I'll just emphasize again that I recommend a purely LP approach rather than MIP, especially with models this large. Happy to chat about how to implement that if you'd like.

mip_debug.html.zip
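
A small sketch of the presolve workaround with GLPK as the solver, reusing the model1/model2 names from above (note that _iocp is a private attribute of optlang's GLPK interface, so it may change between versions):

from cobra.flux_analysis.gapfilling import GapFiller

gapfiller = GapFiller(model2, universal=model1, demand_reactions=False)
# Turn on GLPK's MIP presolver before solving.
gapfiller.model.solver.configuration._iocp.presolve = 1
solutions = gapfiller.fill()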

Midnighter commented 4 years ago

Is there a good reason that speaks against using model.tolerance which sets the same value on all available tolerances?

gregmedlock commented 4 years ago

@Midnighter in practice I always see the integrality tolerance set 1-2 orders of magnitude higher than other tolerances, but I'm not sure why. Just using model.tolerance to set all tolerances based on the user-passed integer_threshold seems like a safe default approach to me (at least for gapfilling)... In the worst case, solve times would be longer, but I think finding a real solution is more important.

This does raise some issues about "silent" problems that might be desirable to catch. For example, the GLPK interface only accepts a feasibility tolerance:

    def _tolerance_functions(self):
        return {
            "feasibility": (self._get_feasibility, self._set_feasibility)
            }

but the model in the GapFiller still ends up with all three tolerance options, because when a model is copied it gets all three even if the solver is GLPK (not sure why this is), e.g.:

>>> gf.model.solver.configuration._tolerance_functions()
{'feasibility': (<function Configuration._tolerance_functions.<locals>.<lambda> at 0x123780ee0>, <function Configuration._tolerance_functions.<locals>.<lambda> at 0x121301ca0>),
 'optimality': (<function Configuration._tolerance_functions.<locals>.<lambda> at 0x121301ee0>, <function Configuration._tolerance_functions.<locals>.<lambda> at 0x1221fcee0>), 
'integrality': (<function Configuration._tolerance_functions.<locals>.<lambda> at 0x1221fc1f0>, <function Configuration._tolerance_functions.<locals>.<lambda> at 0x1221fc310>)}

So a user can set new values for each of these tolerances and expect a change, but they wouldn't be alerted that their solver doesn't actually use the parameter they are setting unless they dig into the docs for Optlang and their solver pretty deeply. Model.tolerance is currently trying to catch this but wouldn't log it when the fields actually do exist (like when the model is copied):

    @property
    def tolerance(self):
        return self._tolerance

    @tolerance.setter
    def tolerance(self, value):
        solver_tolerances = self._solver.configuration.tolerances

        try:
            solver_tolerances.feasibility = value
        except AttributeError:
            logger.info("The current solver doesn't allow setting"
                        "feasibility tolerance.")

        try:
            solver_tolerances.optimality = value
        except AttributeError:
            logger.info("The current solver doesn't allow setting"
                        "optimality tolerance.")

        try:
            solver_tolerances.integrality = value
        except AttributeError:
            logger.info("The current solver doesn't allow setting"
                        "integrality tolerance.")

        self._tolerance = value

I may also be missing something about optlang, but I think there is a mismatch between current GLPK parameters and the optlang interface--for MIP, GLPK has both an integrality tolerance and an optimality tolerance (tol_int and tol_obj, respectively). The feasibility tolerance does currently adjust both of those parameters, however.

So in sum I think 1) yes to just using Model.tolerance to adjust the integer threshold/other tolerances for gapfilling, and 2) some fix might be needed related to copy() to stop initializing solver tolerances that don't exist.
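
Concretely, the earlier gap filling call could then look something like this sketch (Model.tolerance sets every tolerance the active solver exposes and only logs the ones it does not support):

import cobra

gapfiller = cobra.flux_analysis.gapfilling.GapFiller(
    model2, universal=model1, demand_reactions=False, integer_threshold=1e-9
)
# One knob instead of three: feasibility, optimality and integrality (where supported).
gapfiller.model.tolerance = 1e-9
solutions = gapfiller.fill()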

cdiener commented 4 years ago

I think we should probably fix the optlang tolerances for MIP problems. This is what you could set on GLPK MIP problems:

double tol_int (default: 1e-5)
Absolute tolerance used to check if the optimal solution to the current LP relaxation is integer feasible. (Do not change this parameter without detailed understanding of its purpose.)

double tol_obj (default: 1e-7)
Relative tolerance used to check if the objective value in the optimal solution to the current LP relaxation is not better than in the best known integer feasible solution. (Do not change this parameter without detailed understanding of its purpose.)

double mip_gap (default: 0.0)
The relative MIP gap tolerance. If the relative MIP gap for the currently known best integer feasible solution falls below this tolerance, the solver terminates the search. This allows obtaining suboptimal integer feasible solutions if solving the problem to optimality takes too long.

I think always using the MIP presolver is fine. For LPs this can affect convergence negatively.

PkiwiBird commented 4 years ago

Decreasing the solver tolerances directly, rather than using the integer_threshold argument (which only adjusts integrality), corrects this and the gap fill solution passes the validate() function. For example, when I run the following after removing reactions from model1 that aren't active in a pFBA solution:

gapfiller = cobra.flux_analysis.gapfilling.GapFiller(model2, universal=model1, demand_reactions=False, integer_threshold=1e-10)
gapfiller.model.solver.configuration.tolerances.feasibility = 1e-10
gapfiller.model.solver.configuration.tolerances.integrality = 1e-10
gapfiller.model.solver.configuration.tolerances.optimality = 1e-10
gapfiller.fill()

@gregmedlock: I tried it. Running gapfiller.model.solver.configuration.tolerances.feasibility = 1e-10 gives this error:

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/home/tania/anaconda3/envs/py_3.5.6/lib/python3.5/site-packages/optlang/util.py", line 289, in __setattr__
    self._functions[key][1](value)
  File "/home/tania/anaconda3/envs/py_3.5.6/lib/python3.5/site-packages/cplex/_internal/_parameter_classes.py", line 66, in set
    self._env.parameters._set(self._id, value, self._type)
  File "/home/tania/anaconda3/envs/py_3.5.6/lib/python3.5/site-packages/cplex/_internal/_parameter_classes.py", line 306, in _set
    CPX_PROC.setdblparam(self._env._e, which_parameter, value)
  File "/home/tania/anaconda3/envs/py_3.5.6/lib/python3.5/site-packages/cplex/_internal/_procedural.py", line 438, in setdblparam
    check_status(env, status)
  File "/home/tania/anaconda3/envs/py_3.5.6/lib/python3.5/site-packages/cplex/_internal/_procedural.py", line 303, in __call__
    raise CplexSolverError(error_string, env, status)
cplex.exceptions.errors.CplexSolverError: CPLEX Error 1014: Parameter value too small.

Running gapfiller.model.solver.configuration.tolerances.optimality = 1e-10 gives the same error:

cplex.exceptions.errors.CplexSolverError: CPLEX Error 1014: Parameter value too small.

gregmedlock commented 4 years ago

@PkiwiBird try with GLPK as your solver, or increase the values to within the acceptable range for CPLEX (don't know these off the top of my head, but we will update the warning messages to return the acceptable range).
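
For example (a sketch; if I recall correctly, CPLEX's feasibility and optimality tolerances bottom out at 1e-9, which would explain the "Parameter value too small" error for 1e-10):

# Option 1: switch the model to GLPK before building the GapFiller.
model2.solver = "glpk"

# Option 2: keep CPLEX but stay within its accepted range (>= 1e-9).
gapfiller.model.solver.configuration.tolerances.feasibility = 1e-9
gapfiller.model.solver.configuration.tolerances.optimality = 1e-9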

PkiwiBird commented 4 years ago

@PkiwiBird try with GLPK as your solver, or increase the values to within the acceptable range for CPLEX (don't know these off the top of my head, but we will update the warning messages to return the acceptable range).

It worked! Thanks!

PkiwiBird commented 4 years ago

Hi, now I'm having another error when I try to gapfill for tasks:

Traceback (most recent call last):
  File "<input>", line 72, in <module>
  File "/home/tania/anaconda3/envs/py_3.5.6/lib/python3.5/site-packages/cobra/flux_analysis/gapfilling.py", line 234, in fill
    message='gapfilling optimization failed')
  File "/home/tania/anaconda3/envs/py_3.5.6/lib/python3.5/site-packages/cobra/core/model.py", line 1055, in slim_optimize
    assert_optimal(self, message)
  File "/home/tania/anaconda3/envs/py_3.5.6/lib/python3.5/site-packages/cobra/util/solver.py", line 441, in assert_optimal
    raise exception_cls("{} ({})".format(message, status))
cobra.exceptions.Infeasible: gapfilling optimization failed (infeasible)

models with task added are here (pickle format):
[models2507.zip](https://github.com/opencobra/cobrapy/files/4975440/models2507.zip)

I tried to save them as SBML but it gave another error:

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/home/tania/anaconda3/envs/py_3.5.6/lib/python3.5/site-packages/cobra/io/sbml.py", line 888, in write_sbml_model
    doc = _model_to_sbml(cobra_model, f_replace=f_replace, **kwargs)
  File "/home/tania/anaconda3/envs/py_3.5.6/lib/python3.5/site-packages/cobra/io/sbml.py", line 1034, in _model_to_sbml
    specie.setCompartment(metabolite.compartment)
  File "/home/tania/anaconda3/envs/py_3.5.6/lib/python3.5/site-packages/libsbml/__init__.py", line 25822, in setCompartment
    return _libsbml.Species_setCompartment(self, sid)
ValueError: invalid null reference in method 'Species_setCompartment', argument 2 of type 'std::string const &'

matthiaskoenig commented 4 years ago

The problem is that you have metabolites which don't have a compartment set. This gives a null reference when trying to write them to SBML. The error message could be better here, but the underlying problem is metabolites with no metabolite.compartment set in your results. Probably the gap filling is not setting the compartments?
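
A small sketch of that check and fix for one of your models; the compartment id "c" below is only a placeholder and should be replaced with each metabolite's real compartment:

# Find metabolites that ended up without a compartment.
missing = [m for m in model.metabolites if not m.compartment]
print(len(missing), "metabolites without a compartment")
for met in missing:
    met.compartment = "c"  # placeholder; use the correct compartment id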

PkiwiBird commented 4 years ago

Hi, setting the compartment for all metabolites as @matthiaskoenig suggested corrected the problem of saving to SBML (models below), but the gapfill error still shows up:

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/home/tania/anaconda3/envs/py_3.5.6/lib/python3.5/site-packages/cobra/flux_analysis/gapfilling.py", line 234, in fill
    message='gapfilling optimization failed')
  File "/home/tania/anaconda3/envs/py_3.5.6/lib/python3.5/site-packages/cobra/core/model.py", line 1055, in slim_optimize
    assert_optimal(self, message)
  File "/home/tania/anaconda3/envs/py_3.5.6/lib/python3.5/site-packages/cobra/util/solver.py", line 441, in assert_optimal
    raise exception_cls("{} ({})".format(message, status))
cobra.exceptions.Infeasible: gapfilling optimization failed (infeasible)

Could someone please help? models2607.zip

cdiener commented 4 years ago

Have you checked that the universal model does not need demand reactions to be feasible? It may also be a problem with the bounds. What happens if you lower the lower_bound argument?

PkiwiBird commented 4 years ago

Lowering lower_bound to 0 worked, but I thought there should be some flux through the objective function. So is using lower_bound = 0 valid? The universal model is feasible and the reconstructed one is not:

print(modelMedAdapCopy.optimize().status) # important
optimal
print(reconstModel.optimize().status)
infeasible

But the objective value of the universal model is 0, while that of the reconstructed model is positive:

print(modelMedAdapCopy.optimize().objective_value)
0.0
print(reconstModel.optimize().objective_value)
1.4576241916181945

cdiener commented 4 years ago

Yeah, I meant something non-zero but small like 1e-3 or less to see if the different bounds may be a problem.

PkiwiBird commented 4 years ago

With a low value, 1e-3, it doesn't work. Maybe that is because the objective value of the universal model with the task (and the remaining boundaries closed) is 0, which is lower than 1e-3. From the definition of a passed task in the literature: "a model successfully passes a task if the associated LP problem is still solvable when the sole exchange reactions allowed carrying flux in the model are temporary sink reactions associated with each of the inputs and outputs listed in the task." (Richelle et al. 2019). It seems that lower_bound can be 0, as what matters is whether it is solvable (model.status = optimal) and not so much whether the biomass reaction has flux.

cdiener commented 4 years ago

The lower_bound does not define the lower bound on the biomass flux but the lower bound on your task, and that must have positive flux. See the definition above the phrase you quoted:

[a] metabolic task [is defined] as a nonzero flux through a reaction or through a pathway leading to the production of a metabolite B from a metabolite A [https://doi.org/10.1371/journal.pcbi.1006867]

So if the gapfilling is only feasible with a lower bound of 0, that means that your task cannot carry flux. However, that may also be the case in your universal model, and in that case there is no gapfilling that would allow flux through the task, so gapfilling is infeasible. Have you checked that your universal model can carry flux for the task?
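
One quick way to check that, sketched with the modelMedAdapCopy universal from above and a hypothetical task reaction id "task_rxn":

with modelMedAdapCopy as m:
    # Temporarily make the task reaction the objective and see whether it can carry flux.
    m.objective = m.reactions.get_by_id("task_rxn")
    print(m.slim_optimize())  # should be > 0 if the universal model supports the task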

PkiwiBird commented 4 years ago

I checked, and the universal model has flux on all reactions of that task, both when setting the objective to biomass and when setting the objective to a reaction of the task. To do it the way you explained, how do we let the gapfill function know which reactions are part of the task? Do we set model.objective = task and then gapfill? For the task above that would be OK, because there's only one reaction where the lower bound is above 0, and I could set model.objective equal to that reaction. But consider a task like this:

--> succinate[m] (bounds: 1, 1)
--> NADH[m] (bounds: 1, 1)
--> H+[m] (bounds: 1, 1)
--> O2[s] (bounds: 0, 1000)
fumarate[m] --> (bounds: 1, 1)
NAD+[m] --> (bounds: 1, 1)
H2O[s] --> (bounds: 0, 1000)
ATP[m] + H2O[m] => ADP[m] + Pi[m] + H+[m] (bounds: 1, 1000)

For this task there are inflow, outflow and equation reactions that require flux (lower bound > 0), so which one should we set equal to model.objective?

cdiener commented 4 years ago

To do the way you explained, how do we set the gapfill function to know which reactions are part of the task, do we set model.objective = task and then gapfill?

Yes, exactly. This is how you would do this.

For multiple tasks this will be more involved. Basically you can run gapfilling consecutively on all your tasks, but other packages may provide support for that out of the box. I think @gregmedlock may have some suggestions, and you may also find something in driven. CORDA also allows you to specify tasks.

PkiwiBird commented 4 years ago

The above reactions are all part of the same task, oxidative phosphorylation, in this file. If I understood correctly, the suggestion is to run gapfill for each of the reactions; or is it to add all reactions of a task together and then gapfill several times, each time setting the model objective to a different reaction of that task?

gregmedlock commented 4 years ago

The multi-reaction task makes a good case for separating fix_objective_as_constraint() from the initialization of GapFiller. @PkiwiBird, the "hacky" way to do this without modifying COBRApy source code would be to construct the Gapfiller object as you have been doing, then just remove the lower bound constraint on your biomass reaction and make sure the bounds are set on your task reactions such that any feasible solution passes the task, e.g. lower bound > 0 for all task reactions if the forward direction is required to pass the task.

The alternative would be to construct an objective function which requires simultaneous flux through all reactions in your task as @cdiener recommended. But this objective would look a bit odd and would require adding the lower bound constraints I mentioned above to your reactions in the task anyway.
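
A rough sketch of that workaround, with hypothetical task reaction ids and reusing the reconstModel/modelMedAdapCopy names from above; the idea is to put the flux requirement on the task reactions themselves and drop the requirement on the objective (lower_bound=0):

from cobra.flux_analysis.gapfilling import GapFiller

task_reaction_ids = ["task_rxn_1", "task_rxn_2"]  # hypothetical ids
draft = reconstModel.copy()
for rxn_id in task_reaction_ids:
    # Force each task reaction to carry flux in any feasible solution.
    draft.reactions.get_by_id(rxn_id).lower_bound = 1e-3
# No separate requirement on the biomass/objective reaction itself.
gapfiller = GapFiller(draft, universal=modelMedAdapCopy, demand_reactions=False, lower_bound=0)
solutions = gapfiller.fill()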

cdiener commented 4 years ago

Oh I meant gapfill each task separately. So something like:

  1. set objective as task 1
  2. gapfill and create new model 2
  3. set task 2 as objective for model 2
  4. gapfill and get model 3 ...

Setting them simultaneously has the caveat that one task might be impossible and that would fail the entire gapfill. Usually you want to just enable as many tasks as possible even though all of them may not be possible.
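
A sketch of that loop, assuming a hypothetical tasks dict that maps each task name to the id of one reaction that must carry flux for that task, and reusing the reconstModel/modelMedAdapCopy names from above:

from cobra.flux_analysis import gapfill
from cobra.exceptions import OptimizationError

current = reconstModel.copy()
for name, reaction_id in tasks.items():
    current.objective = current.reactions.get_by_id(reaction_id)
    try:
        added = gapfill(current, modelMedAdapCopy, demand_reactions=False)[0]
    except (RuntimeError, OptimizationError):
        # This task cannot be enabled; skip it instead of failing the whole run.
        continue
    # Carry the newly enabled reactions over into the next round.
    current.add_reactions([r.copy() for r in added])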

cdiener commented 3 years ago

Seems to be stale. Integer performance has also improved with https://github.com/opencobra/optlang/pull/231.