opencobra / memote

memote – the genome-scale metabolic model test suite
https://memote.readthedocs.io/
Apache License 2.0

Memote warns about the existence of two biomass reactions, while there is only one. #675

Closed ChristianLieven closed 4 years ago

ChristianLieven commented 4 years ago

Opening on behalf of @anushchp

Problem description

It seems that memote identifies biomass reactions by looking for any reaction whose name, or one of whose metabolites, contains “biomass”. It might be a good idea to add further criteria, for example requiring that the reaction includes at least a certain number of biomass precursors, or that the “biomass” metabolite appears only as a product of the reaction.
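The two criteria suggested above could be sketched roughly as follows. This is only an illustration: `looks_like_biomass`, `min_precursors` and the example IDs are hypothetical and not part of memote, and counting consumed metabolites is just a crude proxy for "biomass precursors".

```python
def looks_like_biomass(stoichiometry, biomass_met_id, min_precursors=2):
    """Hypothetical filter, not part of memote.

    stoichiometry maps metabolite IDs to coefficients (negative = consumed).
    """
    # The "biomass" metabolite must be produced, never consumed.
    if stoichiometry.get(biomass_met_id, 0) <= 0:
        return False
    # Count consumed metabolites as a rough proxy for biomass precursors.
    precursors = [m for m, coeff in stoichiometry.items() if coeff < 0]
    return len(precursors) >= min_precursors


# Made-up example: a biomass reaction and the exchange of its product.
candidates = {
    "R_biomass": {"atp_c": -1.0, "ala_c": -0.5, "biomass_c": 1.0},
    "EX_biomass": {"biomass_c": -1.0},
}
kept = [rid for rid, s in candidates.items()
        if looks_like_biomass(s, "biomass_c")]
```

With these made-up numbers only `R_biomass` passes, because the exchange reaction consumes the "biomass" metabolite rather than producing it.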

draeger commented 4 years ago

Is the biomass reaction identified based on the ID or the name of the reaction? I'd suggest enforcing usage of the correct SBO term 629. In my opinion, Memote could even complain about the lack of a biomass reaction if that SBO term isn't used.

Midnighter commented 4 years ago

Our strategy for identifying biomass reactions is:

This function identifies possible biomass reactions using two steps:

  1. Return reactions that include the SBO annotation "SBO:0000629" for biomass. If no reactions can be identified this way:
  2. Look for the buzzwords "biomass", "growth" and "bof" in reaction IDs.
  3. Look for metabolite IDs or names that contain the buzzword "biomass" and obtain the set of reactions they are involved in.
  4. Remove boundary reactions from this set.
  5. Return the union of reactions that match the buzzwords and of the reactions that metabolites are involved in that match the buzzword.
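The steps above can be sketched in plain Python as follows. This is a minimal illustration and not memote's actual implementation: the `Metabolite` and `Reaction` classes are hypothetical stand-ins rather than cobrapy objects, and treating any single-metabolite reaction as a boundary reaction is a simplifying assumption.

```python
from dataclasses import dataclass

BIOMASS_SBO = "SBO:0000629"
BUZZWORDS = ("biomass", "growth", "bof")


@dataclass
class Metabolite:
    """Hypothetical stand-in for a model metabolite."""
    id: str
    name: str = ""


@dataclass
class Reaction:
    """Hypothetical stand-in for a model reaction."""
    id: str
    stoichiometry: dict  # metabolite ID -> coefficient (negative = consumed)
    sbo: str = ""

    @property
    def is_boundary(self):
        # Assumption: a boundary reaction involves exactly one metabolite.
        return len(self.stoichiometry) == 1


def find_biomass_reactions(reactions, metabolites):
    # Step 1: trust the SBO annotation whenever it is present.
    by_sbo = [r.id for r in reactions if r.sbo == BIOMASS_SBO]
    if by_sbo:
        return sorted(by_sbo)
    # Step 2: buzzwords in reaction IDs.
    by_id = {r.id for r in reactions
             if any(word in r.id.lower() for word in BUZZWORDS)}
    # Step 3: reactions involving a metabolite whose ID or name has "biomass".
    biomass_mets = {m.id for m in metabolites
                    if "biomass" in m.id.lower() or "biomass" in m.name.lower()}
    by_met = {r.id for r in reactions if biomass_mets & set(r.stoichiometry)}
    # Step 4: remove boundary reactions from the metabolite-derived set.
    by_met -= {r.id for r in reactions if r.is_boundary}
    # Step 5: union of both candidate sets.
    return sorted(by_id | by_met)


# Made-up model: one biomass reaction plus the exchange of its product.
mets = [Metabolite("atp_c"), Metabolite("bm_c", name="Biomass")]
rxns = [
    Reaction("R_obj", {"atp_c": -1.0, "bm_c": 1.0}),
    Reaction("EX_bm", {"bm_c": -1.0}),
]
found = find_biomass_reactions(rxns, mets)
```

In this made-up model `found` contains only `R_obj`, because step 4 drops the exchange reaction. Note that in this sketch an exchange reaction whose own ID contains a buzzword would still be matched by step 2, since step 4 is applied only to the metabolite-derived set.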

Memote does complain in the respective SBO annotation section if the SBO term is missing, but we still consider the biomass reactions identified in this way. I think relying only on the SBO term is not productive at this point, since only models coming from KBase and the majority of BiGG models use SBO terms at all.

anushchp commented 4 years ago

Thank you all very much for your answers! I should say first that I converted the model from .mat to .xml, and some of the metabolite and reaction annotations got lost in the process. I know I should fix this. However, based on the two reactions that were identified in my case, there might be a bug in the pipeline that Midnighter and Christian very nicely described to me. The two reactions are: (1) the real biomass reaction, which produces a metabolite named "Biomass", and (2) the exchange reaction for the "Biomass" metabolite. Step 4 of the identification process is not being applied here, right?

Midnighter commented 4 years ago

Indeed, there is a problem with the logic of the function that identifies biomass reactions. In some cases it returns too early before all of the above steps have been performed.

ChristianLieven commented 4 years ago

@anushchp Could you check out #677 and test it with your model to see if it fixes the bug?