EPFL-LCSB / pytfa

A Python 3 implementation of Thermodynamics-based Flux Analysis
https://lcsb.epfl.ch/
Apache License 2.0

Integration with eQuilibrator #36

Open carrascomj opened 4 years ago

carrascomj commented 4 years ago

As a user, I would like to use the eQuilibrator-API as the thermodynamic database. eQuilibrator would parse the reactions and calculate ΔGr for each one, and that data would then be used to prepare and convert a cobra.Model into a pyTFA.ThermoModel. Is there anything that would prevent this integration?

I would be happy to help with the implementation, maybe as an extra dependency. I have used the tutorial in tutorials/figure_paper.py to try to reproduce the results just by adding reaction data, but the results differ (at the end of this notebook).

EDIT: after adding the pH and ionic strength information about each metabolite (without running the built-in .prepare() method), the results are exactly reproduced.

psalvy commented 4 years ago

Good day @carrascomj !

I am happy to see you are interested in pyTFA. This is definitely a functionality we have considered, and it would considerably expand the usability of pyTFA. However, we don't really have anyone that could work on it at the moment, so your effort is very welcome !

For the implementation, if you decide to write a standalone module we will gladly link to it, and if instead you would like to contribute directly to the code we can work with pull requests :)

Very cool notebook also, thank you for sharing it. And indeed, yes, pH and Ionic strength are extremely important for the models to be accurate. I am happy to see you could reproduce our results !

It's been a while since I have used eQuilibrator, so maybe things changed, but also it is important to make sure we recover the ΔGr'°, and not the ΔGr'm.
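
To make the distinction concrete: ΔGr'm is the reaction energy with every reactant at 1 mM instead of the 1 M standard state, so it differs from ΔGr'° by RT·ln(10⁻³) times the net stoichiometry. Below is a minimal sketch of the conversion, assuming kJ/mol, 298.15 K, and water left out of the stoichiometry; the function name is made up for illustration and is not part of pyTFA or eQuilibrator.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol*K)


def dg_prime_m_to_standard(dg_prime_m, stoichiometry,
                           temperature=298.15, conc_molar=1e-3):
    """Recover ΔGr'° (1 M reference) from ΔGr'm (all reactants at conc_molar).

    stoichiometry maps metabolite id -> coefficient (negative for substrates).
    Water must be left out, since its activity is taken as 1.
    """
    net = sum(stoichiometry.values())
    # ΔGr'm = ΔGr'° + R*T*net*ln(conc), so subtract that term to go back.
    correction = R_KJ_PER_MOL_K * temperature * net * math.log(conc_molar)
    return dg_prime_m - correction


# Hypothetical hydrolysis-like reaction A -> B + C (water omitted), net = +1.
hydrolysis = {"A": -1, "B": 1, "C": 1}
# Balanced reaction A + B -> C + D, net = 0: both quantities coincide.
balanced = {"A": -1, "B": -1, "C": 1, "D": 1}
```

For a reaction with a net stoichiometry of +1 the two values differ by roughly 17 kJ/mol at 298.15 K, which is large enough to change feasibility predictions if the wrong one is used.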

carrascomj commented 4 years ago

Nice! I will fork the project and will try to implement it in the package.

Just a quick question: since the eQuilibrator-API also computes confidence intervals, would it be meaningful to use them as the error for ΔGr'?

psalvy commented 4 years ago

Yes, we use the errors as bounds for the ΔGr'°, so we can directly integrate them!
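
As a rough illustration of that idea, an estimate and its uncertainty can be turned into a lower and upper bound on ΔGr'°. This is a minimal sketch in plain Python: the function name and the optional `z` scaling factor (for example 1.96 to widen a standard deviation into a 95% interval) are assumptions for illustration, not pyTFA or eQuilibrator API.

```python
def dg_bounds(dg_standard, error, z=1.0):
    """Return (lower, upper) bounds for a ΔGr'° estimate.

    error is the uncertainty of the estimate, in the same unit as dg_standard;
    z scales it (z=1.0 uses the error as-is).
    """
    if error < 0:
        raise ValueError("error must be non-negative")
    half_width = z * error
    return dg_standard - half_width, dg_standard + half_width


lower, upper = dg_bounds(-30.0, 2.0)
```

Whatever interval is chosen, the unit has to match the rest of the thermodynamic data before the bounds are applied.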

carrascomj commented 4 years ago

Hello! After this PR is merged to equilibrator_api, I should be able to open mine for pytfa.

Meanwhile, I have stumbled upon some decisions to take:

  1. The logic is implemented as a new method .prepare_equilibrator(). I had to rearrange .__init__() so that the user can initialize the ThermoModel without specifying a thermo_data argument. Another possibility would be to implement it as an external function instead of a method. The problem there is that the API exposes the thermodynamic information of the reactions, not the formation energies, so preparing the model requires some tweaking of the ThermoModel in the process.
  2. Lazy loading of eQuilibrator. The eQuilibrator package always takes time to load, so I thought it would be interesting to just load it when the user tries to prepare the model with eQuilibrator.
  3. equilibrator_api as extra dependency? Right now, I have implemented it so it's only installed with
   pip install pytfa[equilibrator]
  4. Minimally refactored the assessment of reaction compartment to utils.py, since both "preparations" of reactions call it.
  5. Added some tests to the related methods.
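
The lazy loading and the optional dependency can be sketched together: the import is deferred until the feature is first used, and a missing package produces a hint pointing at the extra. This is a minimal sketch, not the code from the PR; `lazy_import` is a hypothetical helper name, and the hint reuses the pip command above.

```python
import importlib


def lazy_import(module_name, extra=None):
    """Import module_name only when called, with an install hint on failure.

    Python keeps imported modules in sys.modules, so repeated calls are cheap
    and the slow first import is paid only by users of the feature.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        hint = f" Install it with: pip install pytfa[{extra}]" if extra else ""
        raise ImportError(
            f"{module_name} is required for this feature.{hint}"
        ) from exc


# Demonstration with a standard-library module standing in for a heavy one.
json_module = lazy_import("json")
```

A method such as .prepare_equilibrator() could call `lazy_import("equilibrator_api", extra="equilibrator")` as its first step, so that importing pytfa itself stays fast for users who never touch eQuilibrator.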
psalvy commented 4 years ago

Good day Jorge,

Wow, that was fast! OK, regarding your points:

  1. I initially thought of having it as an external function, but I actually like your method. I was also wondering: maybe we can simply generate a thermo_data object and feed it in as a standard thermo_data dict?
  2. Yes I think this is a must
  3. If you mean as an optional dependency, yes, I think it makes the most sense. Then I can simply add this info to the README and docs.
  4. I'll need more details here but yes I understand why this can be necessary
  5. Thanks! This is also a must, and yet so many people forget about this haha!

Thank you for being so on top of this!

Cheers, Pierre

carrascomj commented 4 years ago

Hi! After talking about it on the equilibrator_api side, I will try to implement it by reconstructing the thermo_data structure with formation energies from eQuilibrator, which might provide a cleaner integration, as an external function.

However, using raw formation energies from eQuilibrator may yield very high uncertainties. If they turn out to be too high to be informative, I will need to go back to the proposed alternative, the .prepare_equilibrator() method, which relies on reaction data.

psalvy commented 4 years ago

Good day Jorge, I see your point. In the current TFA we usually have formation energies decomposed by groups from the group contribution method. This allows us to reduce the error when calculating the Gibbs energy of reaction, by only accounting for the elements that change during the transformation. If we can get the components from the CCM from eQuilibrator for a given metabolite, maybe we can do the same?
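
A toy calculation illustrates the error reduction: when each metabolite is described by its group counts, groups that appear unchanged on both sides of a reaction cancel out and contribute no uncertainty. This is a minimal sketch with made-up group names and error values; combining the remaining group errors in quadrature is an assumption for illustration, not necessarily the exact formula pyTFA uses.

```python
import math
from collections import defaultdict


def reaction_group_change(stoichiometry, groups):
    """Net change in group counts across a reaction (unchanged groups drop out)."""
    change = defaultdict(int)
    for met, coeff in stoichiometry.items():
        for group, count in groups[met].items():
            change[group] += coeff * count
    return {g: n for g, n in change.items() if n != 0}


def reaction_error(stoichiometry, groups, group_errors):
    """Quadrature sum of the errors of the groups that actually change."""
    change = reaction_group_change(stoichiometry, groups)
    return math.sqrt(sum((n * group_errors[g]) ** 2 for g, n in change.items()))


# Hypothetical data: A -> B swaps a hydroxyl for a carbonyl, the ring is untouched.
stoich = {"A": -1, "B": 1}
groups = {"A": {"ring": 1, "OH": 1}, "B": {"ring": 1, "C=O": 1}}
group_errors = {"ring": 3.0, "OH": 0.3, "C=O": 0.4}

changed = reaction_group_change(stoich, groups)
error = reaction_error(stoich, groups, group_errors)
```

In this example the large uncertainty attached to the ring never enters the reaction error, because the ring is present once on each side.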

carrascomj commented 4 years ago

Hello Pierre,

As explained in the PR, eQuilibrator uses the Component Contribution method instead of group contributions. Thus, major changes to the package would be needed to accommodate this information from eQuilibrator. The solution from the eQuilibrator-prepared tmodel differs from that of the cobra model, but it is closer to it than the solution obtained with the .thermodb file.

I have used the total summation of fluxes to quickly compare them:

cobrapy -> 143.13836864239119
thermo_data from equilibrator -> 188.8536839871894
thermo_data from .thermodb -> 1190.6526819344149