opencobra / memote

memote – the genome-scale metabolic model test suite
https://memote.readthedocs.io/
Apache License 2.0

Error in Non-Growth Associated Maintenance Reaction #652

Open matthiaskoenig opened 5 years ago

matthiaskoenig commented 5 years ago

Problem description

When creating the memote report for the attached model (tiny_example.zip), I get an error in the Non-Growth Associated Maintenance Reaction test.

Code Sample

memote report snapshot --filename tiny_example_11_memote.html tiny_example_11.xml

Context

System Information
==================
OS                  Linux
OS-release          4.15.0-46-generic
Python              3.5.2

Package Versions
================
Jinja2              2.10.1
click               6.7
click-configfile    0.2.3
click-log           0.3.2
cobra               0.15.1
cookiecutter        1.6.0
depinfo             1.5.1
equilibrator_api    0.1.26
future              0.17.1
gitpython           2.1.11
goodtables          1.0.0
importlib_resources 1.0.2
lxml                4.3.3
memote              0.9.8
numpydoc            0.8.0
pandas              0.23.4
pip                 19.0.3
pylru               1.2.0
pytest              4.4.0
python-libsbml      5.17.0
requests            2.21.0
ruamel.yaml         0.15.91
setuptools          41.0.0
six                 1.12.0
sqlalchemy          1.3.2
sympy               1.3
travis-encrypt      1.1.2
wheel               0.33.1

ChristianLieven commented 5 years ago

Identifying compartments seems to be the issue:

____________________________________________________________________________ test_ngam_presence ____________________________________________________________________________
../../Dev/memote/memote/suite/tests/test_basic.py:236: in test_ngam_presence
    ann["data"] = get_ids(basic.find_ngam(model))
../../Dev/memote/memote/support/basic.py:130: in find_ngam
    helpers.find_met_in_model(model, "MNXM9", id_of_main_compartment)[0]
../../Dev/memote/memote/support/helpers.py:747: in find_met_in_model
    "namespace.".format(compartment_id, mnx_id))
E   RuntimeError: It was not possible to identify any metabolite in compartment c corresponding to the following MetaNetX identifier: MNXM9.Make sure that a cross-reference to this ID in the MetaNetX Database exists for your identifier namespace.

Are you using any compartment identifiers that cannot be mapped through MetaNetX?

matthiaskoenig commented 5 years ago

> Are you using any compartment identifiers that cannot be mapped through MetaNetX?

I don't understand what this sentence means. I am not mapping anything to MetaNetX, i.e., no annotations to MetaNetX. Basically one should be able to use whatever compartment identifiers without tests failing.

ChristianLieven commented 5 years ago

I'm sorry, I just realized I made a mistake there. The problem isn't with the compartments but with memote failing to identify the metabolite MNXM9 in the model (https://www.metanetx.org/chem_info/MNXM9). To identify a specific metabolite, memote searches all related IDs in a predefined mapping table based on a small MetaNetX dump: https://github.com/opencobra/memote/blob/develop/memote/support/data/met_id_shortlist.json

> Basically one should be able to use whatever compartment identifiers without tests failing.

Unfortunately, it is not that simple. For most of the tests memote doesn't care what identifiers people are using, but for some tests (e.g. the NGAM test) we have to be able to identify specific metabolites (ATP, ADP, Pi, H2O, H+) in a specific compartment (the cytosol). In your specific case, we're assuming that whatever is the largest compartment is the cytosol, and we then search for the NGAM reaction there.
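The "largest compartment is the cytosol" assumption described above could be sketched roughly as follows. This is only an illustration and not memote's actual code: the function name `largest_compartment` is hypothetical, and "largest" is assumed here to mean "contains the most metabolites".

```python
from collections import Counter


def largest_compartment(metabolite_compartments):
    """Return the ID of the compartment that holds the most metabolites.

    `metabolite_compartments` maps metabolite IDs to compartment IDs.
    Hypothetical helper; "largest" is assumed to mean metabolite count.
    """
    counts = Counter(metabolite_compartments.values())
    if not counts:
        raise ValueError("The model has no metabolites.")
    compartment, _ = counts.most_common(1)[0]
    return compartment


# Toy data loosely modelled on the thread, plus one made-up extracellular species.
example = {"atp": "c", "adp": "c", "phos": "c", "glc_e": "e"}
print(largest_compartment(example))  # c
```

With a heuristic like this, any compartment identifier is acceptable for locating the cytosol; the metabolites inside it still have to be recognised separately.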

Just FYI, for compartments we have an internal dictionary to help with the mapping (https://github.com/opencobra/memote/blob/955b30870e8f594108e7400e326221f7b36bc141/memote/support/helpers.py#L55), but we will also switch to a MetaNetX-based system in the future (e.g. https://www.metanetx.org/comp_info/MNXC3).

matthiaskoenig commented 5 years ago

So something is wrong with the functionality for finding MNXM9. The predefined mapping lists the following ChEBI IDs for MNXM9:

"MNXM9":{
    "bigg":["pi"],
    "chebi":["18367","14791","45024","7793","26078","35780","39745","29137","39739","43474","29139","43470"],...

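And when printing the metabolites of the model via

import cobra

# Load the model from the report command above.
model = cobra.io.read_sbml_model("tiny_example_11.xml")

print("")
print("Metabolites")
print("-----------")
for x in model.metabolites:
    print('%9s (%s) : %s, %s, %s' % (x.id, x.compartment, x.formula, x.charge, x.annotation))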

I clearly have a metabolite phos in the model with the correct ChEBI ID for phosphate, CHEBI:43474:

Metabolites
-----------
      glc (c) : C6H12O6, 0, {'sbo': ['SBO:0000247', 'SBO:0000247'], 'kegg.compound': 'C00031', 'chebi': 'CHEBI:4167'}
      g6p (c) : C6H11O9P, -2, {'sbo': ['SBO:0000247', 'SBO:0000247'], 'kegg.compound': 'C00668', 'chebi': 'CHEBI:58225'}
      atp (c) : C10H12N5O13P3, -4, {'sbo': ['SBO:0000247', 'SBO:0000247'], 'kegg.compound': 'C00002', 'chebi': 'CHEBI:30616'}
      adp (c) : C10H12N5O10P2, -3, {'sbo': ['SBO:0000247', 'SBO:0000247'], 'kegg.compound': 'C00008', 'chebi': 'CHEBI:456216'}
     phos (c) : HO4P, -2, {'sbo': ['SBO:0000247', 'SBO:0000247'], 'kegg.compound': 'C00009', 'chebi': 'CHEBI:43474'}
   hydron (c) : H, 1, {'sbo': ['SBO:0000247', 'SBO:0000247'], 'kegg.compound': 'C00080', 'chebi': 'CHEBI:15378'}
      h2o (c) : H2O, 0, {'sbo': ['SBO:0000247', 'SBO:0000247'], 'kegg.compound': 'C00001', 'chebi': 'CHEBI:15377'}

It looks like the MetaNetX dump has incorrect ChEBI identifiers that do not match the official pattern, so string matching won't work. See https://www.ebi.ac.uk/miriam/main/collections/MIR:00000002. The identifier pattern for ChEBI is ^CHEBI:\d+$.
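As a quick check of the pattern quoted above, a minimal sketch using only the standard library (the helper name `is_valid_chebi` is made up for illustration):

```python
import re

# Identifier pattern for ChEBI as quoted in this thread.
CHEBI_PATTERN = re.compile(r"^CHEBI:\d+$")


def is_valid_chebi(identifier):
    """Return True if the identifier matches the pattern ^CHEBI:\\d+$."""
    return CHEBI_PATTERN.match(identifier) is not None


print(is_valid_chebi("CHEBI:43474"))  # True: the form used in the model annotation
print(is_valid_chebi("43474"))        # False: the bare form used in the shortlist
```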

That is, the MetaNetX table must be updated to

"MNXM9":{
    "bigg":["pi"],
    "chebi":["CHEBI:18367","CHEBI:14791","CHEBI:45024","CHEBI:7793","CHEBI:26078","CHEBI:35780","CHEBI:39745","CHEBI:29137","CHEBI:39739","CHEBI:43474","CHEBI:29139","CHEBI:43470"],...

In other words, all ChEBI identifiers in the table are incorrect, and correctly annotated metabolites will not be found.
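To illustrate the mismatch, here is a rough sketch of why an exact string comparison fails and how normalising both sides to the prefixed form would make it succeed. This is not how memote performs the lookup; `normalize_chebi` is a hypothetical helper and the shortlist is a two-entry excerpt of the IDs shown above.

```python
def normalize_chebi(identifier):
    """Return the identifier in the prefixed form 'CHEBI:<digits>'.

    Hypothetical helper: accepts both '43474' and 'CHEBI:43474'.
    """
    identifier = identifier.strip()
    if identifier.upper().startswith("CHEBI:"):
        identifier = identifier[len("CHEBI:"):]
    return "CHEBI:" + identifier


# Excerpt of the bare IDs listed for MNXM9 in the shortlist.
shortlist_chebi = ["18367", "43474"]
# Annotation of the metabolite phos in the model.
model_annotation = "CHEBI:43474"

# Exact string matching fails because the prefix is missing in the table.
exact_match = model_annotation in shortlist_chebi

# Normalising both sides first makes the lookup succeed.
normalized_shortlist = {normalize_chebi(i) for i in shortlist_chebi}
normalized_match = normalize_chebi(model_annotation) in normalized_shortlist

print(exact_match)       # False
print(normalized_match)  # True
```

Either fixing the table as proposed or normalising at lookup time would resolve the mismatch; the former keeps the data consistent with the official pattern.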