SysBioChalmers / yeast-GEM

The consensus GEM for Saccharomyces cerevisiae
http://sysbiochalmers.github.io/yeast-GEM/
Creative Commons Attribution 4.0 International

References for anaerobic benchmarking #325

Closed mperisin-lallemand closed 2 months ago

mperisin-lallemand commented 2 years ago

Description of the issue:

I am looking to benchmark the model under anaerobic carbon limited conditions. I noticed that the main README now features a plot of in silico vs. experimental growth rates for combinations of C and N limitation in aerobic and anaerobic conditions (https://github.com/SysBioChalmers/yeast-GEM/blob/main/growth.png). Are there available reference datasets corresponding to these conditions (both flux tables for the simulations, and growth/metabolite measurements for the experiments)? Are there other references you can recommend for benchmarking the model under anaerobic conditions?


edkerk commented 2 years ago

The graph has been shown in previous publications (e.g. yeast8 Supplementary Figure 4C), and is generated by this function. It uses this data, which was gathered in Österlund et al. 2013. Under "Methods" - "In silico growth simulations", that paper cites the source of those values, although its Supplementary file 5 does not explicitly name the precise paper.

There might be more recent (post-2013) articles with relevant chemostat data, although anaerobic conditions are not as commonly studied, as you probably have also noticed. We'd be happy to enrich our benchmarking dataset with any additional datapoints though!

mperisin-lallemand commented 2 years ago

Thanks for the quick reply. It appears that the plot on the README cuts off the C-limited anaerobic point with an experimental growth rate of 0.369 h^-1, likely due to line 47 in https://github.com/SysBioChalmers/yeast-GEM/blob/main/code/modelTests/growth.m (lim = max(exp_max,mod_max)+0.05;). Do you have any ideas why the model consistently overestimates growth rates for C-limited anaerobic conditions?

edkerk commented 2 years ago

You're right that one point is no longer on the graph; the function is indeed sloppy in how it defines the axes. It worked for yeast8 Supplementary Figure 4C:
[image: yeast8 Supplementary Figure 4C]
but the function hasn't been changed since then, and it clearly is not suitable anymore.

We'll change https://github.com/SysBioChalmers/yeast-GEM/blob/3be6d1cb4e65e7245ed58acca904b8222a3abae1/code/modelTests/growth.m#L45-L46

to the following in the next yeast-GEM release:

exp_max = max(exp_data(:,4));
mod_max = max([mod_data1(:,4);mod_data2(:,4);mod_data3(:,4);mod_data4(:,4)]);
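As an illustration of the axis fix above, here is a minimal Python sketch (the repository code itself is MATLAB). It takes the axis limit from the maximum over every experimental and simulated series, so that no point can fall outside the plot. The function name `axis_limit` and all growth rates except 0.369 h^-1 are hypothetical placeholders, not values from the benchmarking dataset.

```python
def axis_limit(exp_rates, model_series, pad=0.05):
    """Upper axis limit that covers every experimental and simulated growth rate."""
    exp_max = max(exp_rates)
    mod_max = max(max(series) for series in model_series)
    return max(exp_max, mod_max) + pad


# Hypothetical growth rates (h^-1); only 0.369 is a value mentioned in the thread.
exp_rates = [0.10, 0.20, 0.30, 0.369]
# One list per simulated condition (four conditions, as in the MATLAB snippet).
model_series = [[0.11, 0.19], [0.21, 0.32], [0.28], [0.41]]

lim = axis_limit(exp_rates, model_series)
all_points = exp_rates + [v for series in model_series for v in series]
print(lim, all(v < lim for v in all_points))
```

Because the limit is derived from all series at once, adding a new condition only requires appending another list to `model_series`.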

But looking back at the yeast8 paper (figure above), it seems that the C-limited anaerobic predictions have worsened since then (growth was already overpredicted, but not as drastically). I'm not certain which model version was used to make the figure in the paper; it would be interesting to trace back where it worsened. More generally, a worse fit to anaerobic data suggests that the growth-associated energy requirement (GAM) differs between aerobic and anaerobic conditions. Perhaps some biosynthetic pathway or macromolecule polymerization uses a non-optimal variant when no oxygen is available? Or it's a modelling artefact; hard to know a priori.

mperisin-lallemand commented 2 years ago

I am attempting to recreate the in silico vs experimental growth rate plot and I am confused by the rescaling of biomass under N-limited conditions in the growth.m script: https://github.com/SysBioChalmers/yeast-GEM/blob/3be6d1cb4e65e7245ed58acca904b8222a3abae1/code/modelTests/growth.m#L74-L78

After these modifications there is a function call on line 80 for anaerobic conditions, which will re-scale the biomass further and undo the protein modification: https://github.com/SysBioChalmers/yeast-GEM/blob/3be6d1cb4e65e7245ed58acca904b8222a3abae1/code/modelTests/growth.m#L79-L81

Is there a reference for these changes?

mperisin-lallemand commented 2 years ago

I see that the anaerobicModel.m script references Nissen et al. 1997. https://github.com/SysBioChalmers/yeast-GEM/blob/3be6d1cb4e65e7245ed58acca904b8222a3abae1/code/otherChanges/anaerobicModel.m#L11-L14 However, in the methods for that paper, the authors claim: "The composition of protein, DNA, RNA and lipids is assumed to be constant under all growth conditions. This assumption is supported by measurements of the amino acid composition of the protein under various growth conditions. Whereas the amino acid composition of the protein was measured (data not shown), we use the composition of nucleotides in RNA and DNA obtained by de Robichon-Szulmajster & Surdin-Kerjan (1971) and the lipid composition described by Rattray (1988). The unsaturated fatty acids and the sterols are assumed to be supplied by respectively the Tween 80 and the ergosterol content of the medium."

Further in the results: "In this study, the cellular composition was therefore determined at four different dilution rates (see Table. 4). The most important variation in the cellular composition is that the amount of active machinery, i.e. protein and RNA, increases linearly with increasing dilution rate at the expense of carbohydrates. The cellular content of other components is virtually independent of the dilution rate."

edkerk commented 2 years ago

There are two different changes on biomass:

  1. Scaling the ratio of macromolecules, as N-limited conditions have somewhat different levels (protein is the most significant: 29% (ref) instead of 46% (ref)).
  2. Removing some metabolites (heme, NAD(P)(H), ergosterol) from the biomass equation, as they cannot be produced in anaerobic conditions (NAD(P)(H) is of course not realistic, the model should really be able to produce it, but we haven't identified where the problem lies). But, in this change, the rest of biomass is not scaled, so it does not overwrite the changes made in 1.

These biomass composition changes are all introduced in the yeast8 paper.

edkerk commented 2 years ago

Regarding your second comment: biomass composition is indeed not scaled as an effect of anaerobiosis, only as an effect of N- vs. C-limitation.

And indeed, biomass composition varies somewhat anyway, depending on growth rate, media, cultivation conditions, phase of the moon. But the most significant changes would be the reduction in protein content during N-limitation (which should be compensated for by something, and we assume this is carbohydrates), and the complete inability to synthesize certain metabolites (e.g. ergosterol).
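The protein-for-carbohydrate compensation described above can be sketched in Python as follows (not the repository's MATLAB implementation). Only the two protein fractions, 0.46 and 0.29, come from the thread; the helper name and every other fraction are hypothetical placeholders chosen so the total sums to 1.

```python
def shift_protein_to_carbohydrate(fractions, new_protein):
    """Set the protein fraction and let carbohydrate absorb the difference."""
    adjusted = dict(fractions)
    delta = fractions["protein"] - new_protein
    adjusted["protein"] = new_protein
    adjusted["carbohydrate"] = fractions["carbohydrate"] + delta
    return adjusted


# Biomass fractions: protein 0.46 is from the thread, the rest are hypothetical.
c_limited = {
    "protein": 0.46,
    "carbohydrate": 0.40,
    "RNA": 0.06,
    "lipid": 0.05,
    "other": 0.03,
}

# N-limited composition: protein drops to 0.29, carbohydrate rises to compensate.
n_limited = shift_protein_to_carbohydrate(c_limited, 0.29)
print(n_limited, sum(n_limited.values()))
```

Keeping the total constant is the point of the compensation: lowering protein without raising another component would change the total amount of biomass that the equation represents.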

mperisin-lallemand commented 2 years ago

Thanks again for the lightning-fast responses! I am still confused by line 17 in anaerobicModel.m. https://github.com/SysBioChalmers/yeast-GEM/blob/3be6d1cb4e65e7245ed58acca904b8222a3abae1/code/otherChanges/anaerobicModel.m#L17 Doesn't this function change the biomass composition for anaerobic conditions?

edkerk commented 2 years ago

Mea culpa, I completely missed that line. You're right, it indeed reverts the changes that were earlier introduced for N-limitation. Quick fix is to just swap these biomass-modifying functions around in growth.m: first address anaerobiosis, then N-limitation.
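To show why the order of the two biomass modifications matters, here is a toy Python sketch. Both functions are hypothetical stand-ins, not the actual growth.m or anaerobicModel.m code: the anaerobic step is assumed to reset the protein fraction to its default, mimicking the revert described above, and the cofactor values are placeholders.

```python
DEFAULT_PROTEIN = 0.46    # C-limited protein fraction quoted in the thread
N_LIMITED_PROTEIN = 0.29  # N-limited protein fraction quoted in the thread


def apply_n_limitation(biomass):
    """Hypothetical stand-in: lower the protein fraction for N-limited growth."""
    updated = dict(biomass)
    updated["protein"] = N_LIMITED_PROTEIN
    return updated


def apply_anaerobic(biomass):
    """Hypothetical stand-in: drop heme and ergosterol, and (assumed here)
    reset the protein fraction to its default value."""
    updated = {k: v for k, v in biomass.items() if k not in ("heme", "ergosterol")}
    updated["protein"] = DEFAULT_PROTEIN
    return updated


# Placeholder cofactor amounts; only the protein value is from the thread.
biomass = {"protein": DEFAULT_PROTEIN, "heme": 0.001, "ergosterol": 0.001}

# Original order: the anaerobic step silently undoes the N-limitation change.
original_order = apply_anaerobic(apply_n_limitation(biomass))
# Swapped order: the N-limitation change survives.
swapped_order = apply_n_limitation(apply_anaerobic(biomass))

print(original_order["protein"], swapped_order["protein"])
```

Under these assumptions, the N-limited protein fraction only survives when the anaerobic step runs first.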

But this also brings me to what is likely the cause of, or at least a contributor to, the problem: https://github.com/SysBioChalmers/yeast-GEM/blob/3be6d1cb4e65e7245ed58acca904b8222a3abae1/code/otherChanges/anaerobicModel.m#L12-L16

The GAM is changed! In the yeast-GEM.xml model, the GAM has been fitted at 55.4 since #159 (release 8.3.1), using the Van Hoek 1998 chemostat data. This GAM is used for the "normal" (= aerobic) simulations; for the anaerobic simulations, the GAM is changed to 30.49. A lower GAM means a lower energetic cost of growth for the model, and hence overprediction of growth.
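The effect of the GAM on predicted growth can be illustrated with a deliberately simplified Python calculation. It assumes growth is limited purely by ATP supply, so that the growth rate is (ATP supply - NGAM) / GAM. This is a toy relation, not a flux balance simulation of yeast-GEM, and the ATP supply value is a hypothetical placeholder. Only the two GAM values come from the thread.

```python
GAM_AEROBIC = 55.4     # fitted GAM in yeast-GEM.xml (from the thread)
GAM_ANAEROBIC = 30.49  # GAM set for anaerobic simulations (from the thread)


def atp_limited_growth(atp_supply, gam, ngam=0.0):
    """Toy relation: growth rate when all ATP left after maintenance goes to growth."""
    return (atp_supply - ngam) / gam


atp_supply = 10.0  # hypothetical ATP production rate, identical in both cases

mu_high_gam = atp_limited_growth(atp_supply, GAM_AEROBIC)
mu_low_gam = atp_limited_growth(atp_supply, GAM_ANAEROBIC)
ratio = mu_low_gam / mu_high_gam

print(f"growth with GAM=30.49 is {ratio:.2f}x the growth with GAM=55.4")
```

In this toy setting, with no maintenance term, the ratio is simply 55.4 / 30.49, roughly 1.8, whatever the ATP supply. In the full model the effect will differ, since growth is also constrained by substrate uptake and the rest of the network.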

Meanwhile, after a quick look at the article, I cannot find the 30.49 number in Nissen et al. 1997; its Table 3 seems to suggest 54.32 instead. I'm not sure, though, because it also suggests widely different values for different carbon sources, which is counterintuitive to me.

mperisin-lallemand commented 2 years ago

Yes! The GAM modification makes a huge difference. I re-created the experimental vs in silico growth rate plot with yeast 8.6.0 and removed all GAM and biomass composition modifications. Here are plots before and after these changes.
[image: yeast8 6_exp_vs_sim_growth_rates]
[image: yeast8 6-nogammod_exp_vs_sim_growth_rates]
The N-limited aerobic in silico growth rates are different, so perhaps the biomass composition adjustment is necessary for this condition.

edkerk commented 2 months ago

Work on an updated anaerobic model is discussed in #352