opencobra / cobrapy

COBRApy is a package for constraint-based modeling of metabolic networks.
http://opencobra.github.io/cobrapy/
GNU General Public License v2.0
460 stars 216 forks source link

Model objective after adding two models different between 0.5.11 and 0.6 #505

Closed mmundy42 closed 7 years ago

mmundy42 commented 7 years ago

I ran across this while getting mminte running on cobra 0.6. With cobra 0.5.11, when two models are added with the "+=" operator, the objective is updated too because of the way reaction objective coefficients are handled. For example, using the BT and FP models from Thiele, I can add a prefix to the reaction IDs, add the models together and both objective reactions are set in the combined model:

Original BT objective: {<Reaction Biomass_BT_v2 at 0x110953fd0>: 1.0}
Original BT objective value: 0.440738418826
Original FP objective: {<Reaction Biomass_FP at 0x110967c10>: 1.0}
Original FP objective value: 0.169337957302
Combined objective: {<Reaction FP_Biomass_FP at 0x1110f8310>: 1.0, <Reaction BT_Biomass_BT_v2 at 0x110953fd0>: 1.0}
Combined objective value: 1.10532735493

With cobra 0.6, when two models are added with the "+=" operator, the objective in the combined model is unchanged. For example:

Original BT objective: {<Reaction Biomass_BT_v2 at 0x110f2d510>: 1.0}
Original BT objective value: 0.440738418827
Original FP objective: {<Reaction Biomass_FP at 0x111d6c7d0>: 1.0}
Original FP objective value: 0.169337957301
Combined objective: {<Reaction BT_Biomass_BT_v2 at 0x110f2d510>: 1.0}
Combined objective value: 1.07958792212

Would it make sense for the "+" and "+=" operators to explicitly set the objective in cobra 0.6?

hredestig commented 7 years ago

I must say I find __add__ and __iadd__ a bit confusing altogether.. They also lack unit-tests I notice. I guess you preferred having the objective become the sum of the two objectives? I find that somewhat surprising, or at least not obvious behavior.. Also no direct feeling what the obvious behavior would be..

I think I'd favor removing them altogether and replace with a utility function combine_models which would allow arguments controlling what should happen with reactions that are already present in the other model and what should happen with the objective. What do you think? Is modela + modelb very useful? Is modela += modelb also important?

pstjohn commented 7 years ago

We should figure out what we want to do with the operator overloading methods. We talked about them in #311 and #400 as well.

For reactions, __imul__ and __add__ is sometimes useful, i.e. if I'm writing a reaction thats comprised of many individual parts. Like a biomass function, you can do something like

biomass_reaction = 5*atp_part + 2*nadh_part + precursor_part

For models, I honestly never use that functionality. @mmundy42, I'm guessing you find them useful? I would be in favor of a utility function like combine_models, it would then be trivial to implement __add__ and __iadd__ as just wrappers around the utility function.

hredestig commented 7 years ago

I think a simple 'rule' can be that when it is obvious what the effect should be, then overloading operators is an acceptable and good interface. For adding reactions this is the case, but less so for models..

cdiener commented 7 years ago

Even if we have combine_models we would have to define the behavior of the objective in that case. I agree that it's not super obvious what that should be. There is also a potential pitfall with reactions that appear in both models. What are everyone's preferences for the objective?

mmundy42 commented 7 years ago

Sorry for the delayed response, I was out of the office for a few days.

I'm using the "+=" operator to create a community model of multiple organisms. In this use case, I've added a prefix to all of the reactions and metabolites so there are no duplicates. And I want there to be separate objectives for each organism. In mminte, the community models always have two organisms and then the community model is optimized with both objectives, with species A knocked out, and with species B knocked out.

A combine_models() function with parameters to control how the objectives are handled would be fine. I don't need the operators to work.

mmundy42 commented 7 years ago

I haven't read the details, but there is a new paper on building community models using an algorithm called SteadyCom.

http://journals.plos.org/ploscompbiol/article/fileid=10.1371/journal.pcbi.1005539&type=printable

cdiener commented 7 years ago

@mmundy42 That's the paper from Costas Maranas' group! I talked to him recently and was waiting for that one :tada: However, your link does not work for me, but the following does:

http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005539

mmundy42 commented 7 years ago

@cdiener, I know this getting off topic, but I would like to discuss building community models with you. Maybe we can combine some of the things we are working on or get SteadyCom ported to cobrapy.

cdiener commented 7 years ago

@mmundy42 That sounds great. I don't if you have seen it but we can open private chats on gitter. I'll add you to one so we can talk :)

Midnighter commented 7 years ago

I know that Daniel Machado at EMBL is also working on implementing community modeling. His current implementation is, of course, in 'framed' but might still be worth to discuss general approaches with him. Or he might even feel like contributing to a small package that depends on cobrapy, like cobra-com or something.

hredestig commented 7 years ago

..and what is the preferred behavior of combine_models (merge_models)? Suggestion:

Midnighter commented 7 years ago

Signature of pandas.merge for inspiration.

DataFrame.merge(right, how='inner', on=None, left_on=None, right_on=None,
                left_index=False, right_index=False, sort=False,
                suffixes=('_x', '_y'), copy=True, indicator=False)

And pandas.join.

DataFrame.join(other, on=None, how='left', lsuffix='', rsuffix='', sort=False)
pstjohn commented 7 years ago

The other option could be a CommunityModel subclass that inherits from Model

Midnighter commented 7 years ago

The way I see it, there are two use cases:

  1. I want to build a consensus model from two or more pre-existing models. (We actually may have more sophisticated contributions in this direction coming in the medium-term future.)
  2. I want to build a community model.

I would argue that these use cases are best supported by:

  1. Some kind of merge function/method.
  2. A new class CommunityModel as you suggest @pstjohn. I guess for now the SBML will be a bit hacky since there's no official support for community models but that's life.

I would also argue that 1. should live inside of cobrapy, while 2. is a perfect candidate for a smaller subpackage that depends on cobrapy.

mmundy42 commented 7 years ago

@cdiener has the micom package (https://github.com/cdiener/micom) so there already is a package for community modeling built on top of cobrapy. And there is some basic support for community models in the mminte package (https://github.com/mmundy42/MMinte).

mminte implements "joint FBA" which "integrates metabolic reconstructions of individual microbial species into a multi-compartment model with a community compartment allowing for the exchange of metabolites between species. The optimization objective function is the sum of the biomass reactions of individual species". [from the SteadyCom paper referenced above].

I like @hredestig suggestions as that would allow a joint FBA to be easily implemented.

I also like the idea of a CommunityModel class but it seems that it will get complicated enough to merit its own package.

cdiener commented 7 years ago

I think the expected behavior would be just to add reactions and not add (or error) for ones that are already in the model. The compartmentalized version could be added and triggered with an argument, however that might be something that is better implemented in external packages. For instance in micom I also add attributes to the added reactions to track the the species model they came from etc.

Regarding the objective I think one small issue is that we broke backwards compatibility here and did not announce it. So maybe for now we should go back to the 0.5.x behavior and add all objective expressions and we can change that in a coming version with the appropriate release note...