Porthmeus / MeMoMe

Metabolic Model Merging - a semiautomated way to merge genome scale metabolic models
Apache License 2.0

Remove duplicate metabolites #53

Closed Porthmeus closed 2 months ago

Porthmeus commented 1 year ago

Some models contain the same metabolite under different IDs. This can cause one-to-many relationships when comparing two models, so the metabolite identifiers need to be unified. A small function that compares the metabolites within a single model is needed to identify these duplicates.
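A minimal sketch of what such a comparison could look like, assuming metabolites are available as plain dicts with `id`, `compartment` and `inchikey` fields. The function name, the dict layout and the choice of the InChIKey as matching criterion are all assumptions for illustration, not the MeMoMe implementation; the keys in the example are placeholders.

```python
from collections import defaultdict


def find_duplicate_metabolites(metabolites):
    """Group metabolite IDs that share a compartment and a structure key.

    `metabolites` is a list of dicts with the (hypothetical) keys
    "id", "compartment" and "inchikey". Entries without a key are
    skipped, because nothing can be said about them.
    """
    groups = defaultdict(list)
    for met in metabolites:
        key = met.get("inchikey")
        if not key:
            continue
        groups[(met["compartment"], key)].append(met["id"])
    # Only groups with more than one ID are duplicate candidates.
    return sorted(sorted(ids) for ids in groups.values() if len(ids) > 1)


# Toy data with placeholder structure keys.
mets = [
    {"id": "glc__D_c", "compartment": "c", "inchikey": "KEY-A"},
    {"id": "cpd00027_c", "compartment": "c", "inchikey": "KEY-A"},
    {"id": "glc__D_e", "compartment": "e", "inchikey": "KEY-A"},
    {"id": "atp_c", "compartment": "c", "inchikey": None},
]
candidates = find_duplicate_metabolites(mets)
```

Grouping per compartment keeps the cytosolic and extracellular copies of a metabolite apart, since those are legitimately separate entries.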

bramnap commented 11 months ago

Question: Do we also want to flag reactions that differ only by 1 or 2 protons? Such a difference could indicate different protonation states between entries for the same metabolite.

I would not delete or replace them outright unless we are 100% certain that the two mets are identical.
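A rough sketch of how such a flag could work, assuming each reaction is given as a dict mapping metabolite ID to stoichiometric coefficient and that the other metabolite IDs have already been unified. The function names, the `proton_ids` set and the `max_protons` threshold are hypothetical. The check only flags candidates and does not delete or replace anything.

```python
def proton_difference(rxn_a, rxn_b, proton_ids):
    """Return the coefficient differences on protons, or None if the
    reactions also differ in any non-proton metabolite."""
    diff = {}
    for met in set(rxn_a) | set(rxn_b):
        delta = rxn_a.get(met, 0) - rxn_b.get(met, 0)
        if delta == 0:
            continue
        if met not in proton_ids:
            return None
        diff[met] = delta
    return diff


def is_protonation_candidate(rxn_a, rxn_b, proton_ids, max_protons=2):
    """True if the reactions differ only in protons, by at most
    `max_protons` per proton species."""
    diff = proton_difference(rxn_a, rxn_b, proton_ids)
    if not diff:
        # None: other metabolites differ; empty: reactions are identical.
        return False
    return all(abs(delta) <= max_protons for delta in diff.values())


# Toy example: same reaction written with one and with two protons.
protons = {"h_c", "h_e"}
rxn_1 = {"atp_c": -1, "h2o_c": -1, "adp_c": 1, "pi_c": 1, "h_c": 1}
rxn_2 = {"atp_c": -1, "h2o_c": -1, "adp_c": 1, "pi_c": 1, "h_c": 2}
flagged = is_protonation_candidate(rxn_1, rxn_2, protons)
```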

Porthmeus commented 10 months ago

Generally: duplicate-metabolite detection should be designed as a preprocessing step that can write SBML files, so that the user ends up with a clean model.

Porthmeus commented 4 months ago

Previous comments from other channels: Bram brought up a number of problems with the preprocessing:

  1. If duplicate reactions exist that use the same metabolites under different metabolite IDs, three cases can occur:
    1. The reactions are identical → simply delete one.
    2. The reactions are identical, but annotation, GPR and/or bounds differ. We decided on the following → annotations are merged, GPRs are combined with an OR, and bounds are set to the most liberal ones.
    3. The reactions are not identical because protonation states differ → take the protonation state that is most prevalent in the model's reactions. If there is a tie, take the lexicographically first one.
  2. We decided to export the corrected model as SBML together with a log file that describes the changes that were made.
  3. The whole preprocessing requires the cobrapy representation of the model in the cobramodel slot of the MeMoModel object. This needs to be implemented for loading the model from Path and SBML.
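
The decisions for cases 1.2 and 1.3 could be sketched roughly as follows. This uses plain dicts instead of cobrapy objects to stay self-contained. The function names and the dict layout are hypothetical, the annotation values are made up, and "most liberal bounds" is interpreted here as the widest range (smallest lower bound, largest upper bound), which is an assumption.

```python
def merge_duplicate_reactions(rxn_a, rxn_b):
    """Merge two duplicate reactions given as dicts with the keys
    "gpr" (str), "bounds" ((lower, upper)) and "annotation"
    (dict of str -> list of str)."""
    # GPR: combine with OR; keep a single rule if one is empty or both match.
    gprs = [g for g in (rxn_a["gpr"], rxn_b["gpr"]) if g]
    if len(gprs) == 2 and gprs[0] != gprs[1]:
        gpr = f"({gprs[0]}) or ({gprs[1]})"
    else:
        gpr = gprs[0] if gprs else ""

    # Bounds: widest range, read here as "most liberal".
    lower = min(rxn_a["bounds"][0], rxn_b["bounds"][0])
    upper = max(rxn_a["bounds"][1], rxn_b["bounds"][1])

    # Annotations: union of the values per annotation key.
    ann_a, ann_b = rxn_a["annotation"], rxn_b["annotation"]
    annotation = {
        key: sorted(set(ann_a.get(key, [])) | set(ann_b.get(key, [])))
        for key in sorted(set(ann_a) | set(ann_b))
    }
    return {"gpr": gpr, "bounds": (lower, upper), "annotation": annotation}


def pick_protonation_state(usage_counts):
    """Pick the metabolite ID used by the most reactions; on a tie,
    take the lexicographically first ID."""
    return min(usage_counts, key=lambda met_id: (-usage_counts[met_id], met_id))


# Toy data; gene and annotation identifiers are placeholders.
rxn_a = {"gpr": "g1 and g2", "bounds": (0.0, 1000.0),
         "annotation": {"db_x": ["id_1"]}}
rxn_b = {"gpr": "g3", "bounds": (-1000.0, 500.0),
         "annotation": {"db_x": ["id_1"], "db_y": ["id_2"]}}
merged = merge_duplicate_reactions(rxn_a, rxn_b)
chosen = pick_protonation_state({"nh4_c": 3, "nh3_c": 3})
```

Every merge and every protonation choice made this way would also be a natural entry for the log file mentioned in point 2.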