m-jahn / genome-scale-models

Genome scale metabolic models in SBML format
GNU General Public License v3.0
bacteria computational-biology genome-scale-models microbiology python3 sbml-models

genome-scale-models

Example of the metabolic map from the Ralstonia eutropha genome scale model.

Models

Ralstonia eutropha (Cupriavidus necator)

Original publications

  1. The model was previously published in: Park, J. M., Kim, T. Y., & Lee, S. Y. (2011). Genome-scale reconstruction and in silico analysis of the Ralstonia eutropha H16 for polyhydroxyalkanoate synthesis, lithoautotrophic growth, and 2-methyl citric acid production. BMC Systems Biology, 5(1), 101.

  2. The original model (PDF) was parsed and converted to the SBML standard by: Peyraud, R., Cottret, L., Marmiesse, L., Gouzy, J., & Genin, S. (2016). A Resource Allocation Trade-Off between Virulence and Proliferation Drives Metabolic Versatility in the Plant Pathogen Ralstonia solanacearum. PLoS Pathogens, 12(10).

Model contents

The Ralstonia_eutropha folder contains the following folders:

The Ralstonia_eutropha folder contains the following scripts:

Changes from original model

The following changes correct errors, remove unnecessary reactions, or add new reactions. For example, the original model showed flux through artificial energy-generating cycles (Fritzemeier et al., PLOS Comp Bio, 2017). After the following issues were identified and removed, FBA no longer showed any activity of such cycles.

Other changes regarding annotation:

Memote score

The Memote web service was used to test the capabilities of the model and identify problems. The original published model (Memote_RehMBEL1391_sbml_L2V1) was compared with the upgraded version containing the changes listed above (Memote_RehMBEL1391_sbml_L3V1). The Memote score improved from 28% to 76%, owing to the addition of metadata and the correction of many errors. The reports for the original and improved model can be found here and here.

Getting started

The genome scale metabolic models in this repository are encoded according to the SBML standard and saved as human-readable *.xml or *.json files.

The models can have well over 1000 reactions, so it is recommended to work with them using a framework such as COBRApy. To install COBRApy, follow the instructions on its GitHub page. Installation of additional Python 3 dependencies such as libsbml, numpy, scipy, or pandas might be necessary for full functionality.

# on Linux, run the following lines in a terminal
sudo apt install python3-pip
pip install cobra

To work with a model using COBRApy, we can import it in a Python session and look at the number of reactions, metabolites, and genes associated with the model's reactions.

# load libraries
import numpy as np
import pandas as pd
import tabulate
import cobra
import os
from os.path import join

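# set the path to the model's directory;
# "~" is not expanded automatically, so resolve it to the home directory first
data_dir = os.path.expanduser('~/genome-scale-models/Ralstonia_eutropha/')
model = cobra.io.read_sbml_model(join(data_dir, "sbml/RehMBEL1391_sbml_L3V1.xml"))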

# summary of the imported model
print('%i reactions' % len(model.reactions))
print('%i metabolites' % len(model.metabolites))
print('%i genes' % len(model.genes))

Structure of SBML models

The *.xml files containing the model definition have three (four) major slots:
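
As a rough illustration of how the top-level sections of an SBML file can be inspected, the sketch below parses a hand-written, deliberately incomplete SBML fragment with Python's standard library and counts the entries in each section. The fragment, its IDs (c, M_a_c, M_b_c, R_a_to_b), and the helper count_sections are assumptions made for this example. They are not taken from the R. eutropha model, and real model files carry more sections and attributes.

```python
import xml.etree.ElementTree as ET

# Namespace of SBML Level 3 Version 1 core elements
SBML_NS = "http://www.sbml.org/sbml/level3/version1/core"

# Hypothetical, stripped-down fragment (not a complete, valid SBML model)
toy_sbml = """<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="toy">
    <listOfCompartments>
      <compartment id="c" constant="true"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="M_a_c" compartment="c"/>
      <species id="M_b_c" compartment="c"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="R_a_to_b" reversible="false"/>
    </listOfReactions>
  </model>
</sbml>"""


def count_sections(sbml_text):
    """Return {section name: number of child entries} for the <model> element."""
    root = ET.fromstring(sbml_text)
    model = root.find(f"{{{SBML_NS}}}model")
    counts = {}
    for section in model:
        # Strip the "{namespace}" prefix that ElementTree adds to tag names
        name = section.tag.split("}")[-1]
        counts[name] = len(section)
    return counts


print(count_sections(toy_sbml))
```

The same function can be pointed at the text of a real model file to get a quick overview before loading it with COBRApy.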

Parameters and constraints

Genome scale models are constrained primarily by two things:

In COBRApy, we can look at the bounds of e.g. all exchange reactions (the ones supplying metabolites from the environment), and set them to a different value.

# inspect exchange reactions
model.exchanges.list_attr("bounds")

# we can also change bounds for reactions
for reaction in model.exchanges:
    reaction.lower_bound = 0.0
    reaction.upper_bound = 1000.0
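
To make the role of bounds more concrete, here is a minimal sketch in plain Python, independent of COBRApy. It checks whether a flux distribution respects the bounds of each reaction and leaves every metabolite balanced (steady state). The three-reaction network, its IDs, and the helper is_feasible are hypothetical. For readability, uptake is written here as a forward reaction, whereas COBRApy exchange reactions represent uptake as negative flux.

```python
# Toy network (hypothetical, not taken from the R. eutropha model):
#   EX_A:   -> A     uptake of A from the environment
#   R1:   A -> B     conversion
#   EX_B: B ->       secretion of B
stoichiometry = {
    "A": {"EX_A": 1.0, "R1": -1.0},
    "B": {"R1": 1.0, "EX_B": -1.0},
}

# (lower bound, upper bound) per reaction, in mmol per gDCW per h
bounds = {
    "EX_A": (0.0, 5.0),
    "R1": (0.0, 1000.0),
    "EX_B": (0.0, 1000.0),
}


def is_feasible(fluxes, tol=1e-9):
    """True if all fluxes lie within bounds and every metabolite is balanced."""
    within_bounds = all(
        lb - tol <= fluxes[rid] <= ub + tol for rid, (lb, ub) in bounds.items()
    )
    balanced = all(
        abs(sum(coef * fluxes[rid] for rid, coef in row.items())) <= tol
        for row in stoichiometry.values()
    )
    return within_bounds and balanced


print(is_feasible({"EX_A": 5.0, "R1": 5.0, "EX_B": 5.0}))  # within bounds, balanced
print(is_feasible({"EX_A": 6.0, "R1": 6.0, "EX_B": 6.0}))  # uptake exceeds its bound
print(is_feasible({"EX_A": 5.0, "R1": 4.0, "EX_B": 4.0}))  # metabolite A accumulates
```

A solver searches among all flux distributions that pass both checks, which is why tightening or loosening a single exchange bound can change the outcome of a simulation.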

In contrast to other types of models such as simple resource allocation models, genome scale models usually don't include cellular processes for production of macromolecules. In other words, transcription, translation, and DNA replication are not explicitly included in the model but only appear as abstract, lumped reactions.

Objective function

The objective function of the model is a variable like any other variable optimized by the solver. However, when solving the model, the primary target of the algorithm is to maximize this variable, which is often the flux through the Biomass reaction. By default, the units of all reaction fluxes are mmol per gDCW per h. Because the biomass reaction is by definition formulated such that 1 mmol biomass equals 1 g biomass, its flux also represents the specific growth rate μ (g biomass per gDCW per h; the biomass terms cancel, leaving units of 1/h).
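
Since the biomass flux can be read as a specific growth rate in 1/h, it can be converted into a doubling time using the standard relation for exponential growth, t_d = ln(2) / μ. The short sketch below shows the arithmetic. The function name doubling_time_h and the growth rate of 0.25 per h are illustrative assumptions, not results from the model.

```python
import math


def doubling_time_h(mu_per_h):
    """Doubling time in hours for a specific growth rate mu given in 1/h."""
    if mu_per_h <= 0:
        raise ValueError("growth rate must be positive")
    return math.log(2) / mu_per_h


# Hypothetical growth rate, chosen only to illustrate the conversion
mu = 0.25
print("doubling time: %.2f h" % doubling_time_h(mu))
```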

In COBRApy we can tell the solver to optimize (usually: maximize) the flux through any reaction of choice. To maximize the growth rate, that reaction is the Biomass reaction.

# set objective function
model.objective = {model.reactions.Biomass: 1}

Solving a model

Before we can pass the model to the solver and find the optimal flux distribution towards our goal, we have to define a growth medium (a set of exchange fluxes that represent the nutrients available in the outer environment of the cell).

model.medium = {
    'EX_mg2_e': 10.0,
    'EX_pi_e': 100.0,
    'EX_cobalt2_e': 10.0,
    'EX_cl_e': 10.0,
    'EX_k_e': 10.0,
    'EX_fe3_e': 10.0,
    'EX_so4_e': 10.0,
    'EX_fru_e': 5.0,
    'EX_nh4_e': 10.0,
    'EX_na_e': 10.0,
    'EX_o2_e': 18.5,
    'EX_mobd_e': 10.0,
    'EX_h2o_e': 1000.0,
    'EX_h_e': 100.0
    }

The solver will then analyze the network and find the optimal steady state flux from our input metabolites to biomass.

# run FBA analysis
solution = model.optimize()

# print solution summary, the status from the linear programming solver
print([solution, "status: ", solution.status])

# print the 10 most negative (backward) and the 10 most positive (forward) fluxes
fluxes = solution.fluxes.sort_values()
print(fluxes.head(10))
print(fluxes.tail(10))

# quick summary of FBA analysis
print(model.summary())

# summary of energy balance
print(model.metabolites.atp_c.summary())

# summary of redox balance
print(model.metabolites.nadh_c.summary())

Visualization of results using Escher

To visualize simulation results such as those from FBA, it is extremely informative to overlay reaction (flux) data on top of a familiar metabolic map. The open source tool Escher can be used for this purpose, either in a Python session or as an online tool. Three files are required that can be obtained from a standard SBML model.
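
As one possible way to prepare flux data for such an overlay, the sketch below writes fluxes as a JSON object that maps reaction IDs to values, a format that Escher can load as reaction data to the best of our knowledge. The helper fluxes_to_json, the zero-flux threshold, and the example values are assumptions for illustration and are not results from the model. With a real FBA result, solution.fluxes can be passed in place of the example dictionary.

```python
import json


def fluxes_to_json(fluxes, threshold=1e-9):
    """Return a JSON string mapping reaction IDs to their non-zero fluxes."""
    nonzero = {
        str(rid): float(value)
        for rid, value in fluxes.items()
        if abs(value) > threshold
    }
    return json.dumps(nonzero, indent=2, sort_keys=True)


# Hypothetical flux values, only to show the output format
example_fluxes = {"Biomass": 0.25, "EX_fru_e": -5.0, "R_unused": 0.0}
print(fluxes_to_json(example_fluxes))
```

Dropping zero fluxes keeps the file small and leaves inactive reactions unstyled on the map.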

Escher workflow

Example