opencobra / cobrapy

COBRApy is a package for constraint-based modeling of metabolic networks.
http://opencobra.github.io/cobrapy/
GNU General Public License v2.0
467 stars 218 forks source link

Results change if I don't optimize() before constraining #1233

Closed erynes closed 2 years ago

erynes commented 2 years ago

I loaded the model from iCHOv1_K1_final.xml (https://chogenome.org/data.php, third reference, .zip file), in which glucose uptake (EX_glc_E_e) is constrained to be -0.1979 (bounds = (-0.1979, -0.1979)). The model's irreversible reaction PDHm is unconstrained (bounds = (0, 1000)). If I change the glucose uptake to -0.3 and call optimize() with the default objective, the flux through PDHm is very different (0!) if I don't optimize before changing this constraint. And if I then restore that constraint to its original value and re-optimize, I still get 0 flux through PDHm, instead of recovering the correct value.

k1 = cobra.io.sbml.read_sbml_model('path/to/iCHOv1_K1_final.xml')
k1.optimize()
print(f'Glc = {k1.reactions.EX_glc_e_.flux}, PDHm = {k1.reactions.PDHm.flux:.4f}')
k1.reactions.EX_glc_e_.bounds = (-.3, -.3)
k1.optimize()
print(f'Glc = {k1.reactions.EX_glc_e_.flux}, PDHm = {k1.reactions.PDHm.flux:.4f}')

Glc = -0.1979, PDHm = 0.2718 Glc = -0.3, PDHm = 0.3159

k1 = cobra.io.sbml.read_sbml_model('path/to/iCHOv1_K1_final.xml')
k1.reactions.EX_glc_e_.bounds = (-.3, -.3)
k1.optimize()
print(f'Glc = {k1.reactions.EX_glc_e_.flux}, PDHm = {k1.reactions.PDHm.flux:.4f}')
k1.reactions.EX_glc_e_.bounds = (-.1979, -.1979)
k1.optimize()
print(f'Glc = {k1.reactions.EX_glc_e_.flux}, PDHm = {k1.reactions.PDHm.flux:.4f}')

Glc = -0.3, PDHm = 0.0000 Glc = -0.1979, PDHm = 0.0000

Maybe I wind up in a different region of "solution space" in the second example, but the difference seems too extreme to me... and shouldn't I be able to recover the original flux if I reset the constraint to its original value?

I'm using cobra 0.24.0 with glpk 1.5.2 and python 3.9.7 on OSX 12.4.

If this isn't a bug, could you please advise, and help me understand this behavior? I'm very new to FBA and CobraPy. Thanks!

Eric

cdiener commented 2 years ago

Like you said, the only thing guaranteed to be unique is the objective under optimality. There may be a huge range of fluxes that yield this optimum. You can use Flux Variability Analysis to see the flexibility each flux has under an optimal objective. If you are rather interested in a method that gives you very fixed flux estimates you can use Parsimonious FBA instead.

erynes commented 2 years ago

Thanks Christian! Just now I took your advice and experimented with pFBA on the same model, with the default objective of maximizing biomass (as before). When sweeping across a range of increasing glucose uptakes, I get very different biological behaviors from FBA and pFBA.
Using FBA, biomass flux increases and asymptotes at ~0.0229 while pyruvate dehydrogenase (PDHm) flux continues increasing after growth has stalled and then decreases linearly.
Using pFBA, PDHm increases and asymptotes while biomass rapidly increases linearly to ~20.1 (>880x its FBA max) and then briefly drops and then resumes steadily (but slowly) increasing linearly. This linear growth increase persists when I increase the glucose 10-fold.
I'm skeptical of PDHm decreasing linearly with excessive glucose, and I'm also skeptical of the biomass flux increasing linearly with excessive glucose, so I'm unsure whether to trust the FBA result, the pFBA result, or neither. (I'm new to biochemistry too, cough cough.) Thoughts?
Eric

cdiener commented 2 years ago

Hmmm, I don't think I understand. With the same import flux bounds FBA and pFBA have to return the same growth rate. Maybe you are showing the pFBA objective instead?

erynes commented 2 years ago

Ah, oops, thanks, that's exactly what I was doing wrong, monitoring the pFBA objective instead of pFBA's flux through the biomass reaction. It's looking good to me now, so I think I'll keep using pFBA. Thanks again for your very helpful help!

It still seems odd to me that, with FBA, if I start with GeM-wide boundary conditions X, tweak one uptake value to get X', optimize, restore to X, and re-optimize, I don't recover a flux I get from optimizing X straightaway. Do you think this behavior is coming from glpk or from CobraPy, or is a common occurrence in FBA?

cdiener commented 2 years ago

So there is two things going on. (1) For low glucose/low biomass import you don't really need PDHm and (2) the solver recycles previous solution bases so consecutive solutions tend to be close to a previous one. So if you optimize before with high glucose import it finds a solution it gives you higher PDHm because you need it there. When you know lower the glucose import it starts solving from the previous solution which gives you a new solution where PDHm is not required but also not prohibited so you get something close to the previous PDHm. In the second case you start with a low glucose import in the first optimization and it drops PDHm to zero as solvers often initialize close to the trivial solution (S * 0 = 0) and PDHm is not required there. The second solution again comes from the basis recycling.

In our courses I usually say that FBA aims to shrink the possible flux space to a biological relevant one which may still have a lot of flexibility like in the figure below.

Getting to a truly unique set of flux values usually requires additional information. You can still use that uncertainty. For instance you can use flux variability analysis to get the min and max PDHm flux for each glucose import. That would give you a PDHm envelope which may still be informative and can be probed experimentally.

erynes commented 2 years ago

Huge thanks, Christian. The one clarification I'd like to make re: my standard-FBA observations is that when I initialize with glucose=0.1979 and then optimize for biomass, I get flux=0.2718 through PDHm, and that when I instead initialize with higher glucose (0.3) and optimize for biomass, flux through PDHm decreases to 0. I went back and looked at the top 20 reactions by |flux| in each of the e, c, and m compartments, and found that the solver indeed lands in a qualitatively different area of solution space when I initialize with 0.3 instead of 0.1979: fumrase flux in the cytosol drops by >60%, pyruvate metabolism in the cytosol shoots up (and pyruvate is suddenly secreted), tryptophan metabolism in the cytosol disappears, glycolosis/gluconeogenesis reactions join the top 20 fluxes in cytosol, aldose reductase flux almost triples, etc.
Lessons learned. Thanks again, so much!
Eric