MetabolicAtlas / standard-GEM

The standard for open-source GEMs on GitHub
https://www.biorxiv.org/content/10.1101/2023.03.21.512712
Creative Commons Attribution 4.0 International

Make the model used by both cobrapy and cobra toolbox? #16

Closed hongzhonglu closed 3 years ago

hongzhonglu commented 4 years ago

It is known that some GEMs run only in the MATLAB-based COBRA Toolbox and cannot be read or used by another popular toolbox, cobrapy. We should therefore make sure that GEMs can be used with both the COBRA Toolbox and cobrapy. Otherwise, the test results should state which toolbox can read and run each GEM.

edkerk commented 4 years ago

As long as the model is valid SBML, should it then not work in both COBRA and cobrapy? What would be an example of when this fails?

hongzhonglu commented 4 years ago

I remember a GEM that could be read by the COBRA Toolbox but not by cobrapy. I first had to read it with the COBRA Toolbox and export a new XML file, which cobrapy could then read. I will test some more examples and report back.

draeger commented 4 years ago

The advantage of using SBML as a standard file format for encoding GEMs is that valid models are agnostic of any particular software solution. We should rather encourage modelers to ensure the validity of their SBML files, so that they can conduct a large variety of analyses with the same model in several modeling tools, without being limited to any particular software.

Tightly coupling a model to a specific software tool can face two severe problems:

  1. Even very popular software has faced situations where its development slowed down or it could not be maintained for some time. The software can then become outdated, and possibly the model with it.
  2. The software could also evolve quickly, and an older version that was able to run the model could be replaced by a newer one. If a model requires a particular version of a software tool, it might become difficult over time to run it on newer hardware or later operating systems.

For these reasons, storing a model in a valid SBML file is preferable. There are tools to check the validity of a model, such as the SBML validator or MEMOTE. These can even be used as part of a continuous integration setup for model development and testing.

Please note that the SBML Test Suite now includes constraint-based models, allowing software developers to assess if their software is compliant. Consequently, valid SBML models should run in any standard-compliant software.
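As a rough illustration of the kind of check that could run early in a continuous integration job, here is a minimal sketch using only the Python standard library. It is not a replacement for the SBML validator or MEMOTE: it only confirms that a file is well-formed XML with an `sbml` root, `level` and `version` attributes, and a `model` child. The function name `sbml_precheck` and the toy document are made up for this example.

```python
import xml.etree.ElementTree as ET


def sbml_precheck(xml_text):
    """Return a list of problems found; an empty list means the pre-check passed.

    Hypothetical helper for illustration only. It does NOT validate SBML
    semantics; use the SBML validator or MEMOTE for that.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems = []
    # ElementTree renders namespaced tags as '{namespace}localname'.
    local_name = root.tag.rsplit("}", 1)[-1]
    if local_name != "sbml":
        problems.append(f"root element is <{local_name}>, expected <sbml>")
    for attr in ("level", "version"):
        if attr not in root.attrib:
            problems.append(f"root element is missing the '{attr}' attribute")
    if not any(child.tag.rsplit("}", 1)[-1] == "model" for child in root):
        problems.append("no <model> element found under the root")
    return problems


# Toy document, assumed for this example; a real GEM would be read from a file.
EXAMPLE = (
    '<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" '
    'level="3" version="1"><model id="toy"/></sbml>'
)

if __name__ == "__main__":
    print(sbml_precheck(EXAMPLE))
```

A file that passes this pre-check can still be invalid SBML, so a step like this would only serve to fail fast before the slower, thorough validators run.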

mihai-sysbio commented 4 years ago

There is an automated-validation branch that relies on GitHub Actions to run different tests on the latest releases of models that follow standard-GEM. I've just made a note there about SBML testing. Basic YAML testing is in place, but it will take a while until the validation setup is nice and thorough. We should continue to add ideas for validations there.

sulheim commented 4 years ago

I am not sure how closely this relates to the issue, or even if it is still an issue with the latest versions of cobrapy and cobratoolbox, but I often encounter models with metabolite names that are read differently by these two tools, usually because of the appended '_c' or [c] that denotes the compartment. Is there a best practice that is bulletproof for both tools?

Midnighter commented 4 years ago

@sulheim both programs still convert identifiers upon reading models, which is not ideal but also somewhat necessary because folks have taken to using identifiers that are not SBML-compliant. In cobrapy, we have been discussing not changing the identifiers read from SBML at all but additionally providing a 'nice' identifier that users can work with most of the time. However, this is a long way down the development roadmap.
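To make the two notations from this thread concrete, here is a minimal sketch of how a model-specific check might split a `_c` or `[c]` suffix from a metabolite identifier. This is not the conversion that cobrapy or the COBRA Toolbox performs; the function `split_compartment`, the regular expressions, and the example identifiers are assumptions made for illustration.

```python
import re

# Two common ways of tagging a metabolite identifier with its compartment:
#   "atp_c" (underscore suffix) and "atp[c]" (bracket suffix).
_BRACKET = re.compile(r"^(?P<base>.+)\[(?P<comp>[A-Za-z0-9]+)\]$")
_UNDERSCORE = re.compile(r"^(?P<base>.+)_(?P<comp>[A-Za-z0-9]+)$")


def split_compartment(met_id, known_compartments):
    """Split an identifier into (base, compartment).

    The compartment is None when no known suffix is present. Hypothetical
    helper for illustration; real tools apply their own rules.
    """
    for pattern in (_BRACKET, _UNDERSCORE):
        match = pattern.match(met_id)
        # Only accept the suffix if the model actually declares that compartment,
        # so an identifier such as "glc__D" is not mistaken for compartment "D".
        if match and match.group("comp") in known_compartments:
            return match.group("base"), match.group("comp")
    return met_id, None


if __name__ == "__main__":
    compartments = {"c", "m", "e"}
    for identifier in ("atp_c", "atp[c]", "glc__D_c", "glc__D"):
        print(identifier, split_compartment(identifier, compartments))
```

Passing the set of declared compartments explicitly is a deliberate choice in this sketch: a suffix alone is ambiguous, which is part of why overloading identifiers with semantic information is fragile.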

mihai-sysbio commented 4 years ago

@sulheim thanks for bringing up this issue regarding compartment abbreviations in metabolite names. Do you think there is a test case that can be added to the automatic validation that is part of standard-GEM #9?

draeger commented 3 years ago

If there is no issue on identifier overloading with semantic information yet, please let us create one and continue discussing compartment abbreviations as metabolite identifier suffixes in this different thread.

mihai-sysbio commented 3 years ago

Good idea, @draeger. There is now a note in #9 about the metabolite identifier suffixes. I am now closing this issue; please reopen if needed.