opencobra / cobrapy

COBRApy is a package for constraint-based modeling of metabolic networks.
http://opencobra.github.io/cobrapy/
GNU General Public License v2.0
460 stars 216 forks source link

Help with gapfill #701

Closed caonetto closed 6 years ago

caonetto commented 6 years ago

Hi, I just started using cobrapy to test different bacterial strains growth, however I very unexperienced in this area. There is a published metabolic model for my bacteria specie, and I eliminated the genes from the reference model that are not present in each strain. I saved each model independently with pickle. Many of the genes I deleted make the models inviable, so I want to use gapfilling to make them able to grow again and also to know which reactions are making them inviable and be able to save the gapfilled models. At the moment I am using the following gapfilling script without success. I am not very familiar with this method, but in the documentation site it says that gapfilling uses a "Universal" set of reactions to make the model viable again. Is there a way to do gapfilling but using the reactions of the original reference model of my bacteria in order to know specifically what reactions are making my specific model inviable? With this method I keep on getting the same results, "EX_biomass_c" which is not a reactions from my model, but from Smiley? I would really appreciate any help or guidance. Thanks.

models = glob.glob('*.pickle')
for m in models:
    model = pickle.load(open(m, 'rb'))
    solution = cobra.flux_analysis.gapfilling.gapfill(model, demand_reactions=False, exchange_reactions=1, iterations=1)
    for reaction in solution[0]:
        print(reaction.id)

EX_biomass_C <----- this is the output for everyone of my models.

Midnighter commented 6 years ago

One big question is the medium that you use. Are the real strains growing in a glucose minimal medium or in a complex medium such as yeast extract? Have you accounted for this when modelling the strains?

Many of the genes I deleted make the models inviable, so I want to use gapfilling to make them able to grow again and also to know which reactions are making them inviable and be able to save the gapfilled models.

I'm not sure I follow your logic here. You knock-out the genes in the reference model that are not present in your strains. Some models end up being non-viable. So far so good. Then you want to gap fill those reactions back in? That seems to defeat the purpose of creating the strain-specific models in the first place. Let's assume that you only want to do it in order to identify the problematic reactions.

Many reactions will have an associated GPR rule (cobra.Reaction.gene_reaction_rule) which determines whether a gene knock-out also prevents the reaction from occurring. You can see which reactions a gene is involved in (cobra.Gene.reactions).

There are then two ways to construct a 'universal' model (another term for reaction database). Either you pass in your reference model (universal=reference). The method will only consider those reactions that are not already in your strain-specific model. Or you create a new model and add only those reactions to it that were deleted from the reference model to create your strain-specific model. The result should be the same.

Unless you have experimental evidence that your strains are able to take up compounds from the medium that are not in the reference model, I would refrain from allowing the addition of exchange reactions (exchange_reactions=False). This would also circumvent the problem that you see: I think the method was only developed for models where the biomass is a reaction of many components. It seems that your model has a biomass compound (biomass_C) and then growth is possibly modelled as the maximization of its demand or so. Anyway, the result you're seeing is that the easiest way to enable growth is to add an exchange reaction for the biomass compound... not very useful unfortunately.

I hope these points help you to solve your problem. Otherwise please share your reference model and I can try to help you further.

caonetto commented 6 years ago

First of all thank you very much for your detailed response. I think I am going to try and be more clear on what my objectives are. I am trying to replicate some parts of the protocol described in this article: Monk, J. and Bosi, E., 2018. Integration of Comparative Genomics with Genome-Scale Metabolic Modeling to Investigate Strain-Specific Phenotypical Differences. In Metabolic Network Reconstruction and Modeling (pp. 151-175). Humana Press, New York, NY.

As a first step I would like to determine if my strains are auxotrophic to certain substrates from the model media. As a start I eliminated the genes not present in my strains from the model, unfortunately this made all the models inviable. To be able to make the models viable again I looked at each of the genes to be deleted using the single_gene_deletion (I am not sure if this is the correct approach) and kept the genes that would make the model viable, and used these as my gene deletion list to create each model. After this I, extracted the list of extracellular substrates from my model and performed single reaction deletions using this list to observe auxotrophies. The second thing I would like to test is the minimal carbon sources to support growth, however I am not sure as to how to approach this using cobrapy, and the protocol in the paper is not very straightforward. Any ideas or comments? Any help would be greatly appreciated.

Midnighter commented 6 years ago

So if I understand you correctly the first part seems more or less under control. I don't know how many genes you are looking at deleting for your strain-specific models. I think the single gene deletion is a good first estimate. You may, of course, find that combinations of genes still lead to dead cells.

For your second part, there is an open pull request https://github.com/opencobra/cobrapy/pull/589 which introduces a method to find a minimal medium supporting growth. I just did not manage yet to merge it and make a new release.

Midnighter commented 6 years ago

Since #589 is now part of the COBRApy release, I will close this issue. Please re-open if there is anything still unclear.