opencobra / memote

memote – the genome-scale metabolic model test suite
https://memote.readthedocs.io/
Apache License 2.0

Codebase/tutorial for addressing output from each test #732

Open zoey-rw opened 2 years ago

zoey-rw commented 2 years ago


Question

Is there a codebase with examples of how one might address each test in Memote, especially using the Memote functions corresponding to the tests? The documentation gives useful background info and references literature, but I imagine the actual code to manipulate models exists only in personal repos (?). I'm hoping to avoid re-writing scripts if this has already been done and I just haven't found it. For example, to address the "Duplicate reactions" test:

import memote.support.basic as basic  # Memote helper functions behind the tests

# model_in is an already loaded cobra.Model
# the first returned element is the list of duplicate reaction ID pairs
duplicate_pairs = basic.find_duplicate_reactions(model_in)[0]
# keep the first reaction of each pair; collect each second reaction once
to_remove = sorted({pair[1] for pair in duplicate_pairs})
model_in.remove_reactions(to_remove)  # remove the redundant reactions

For context, I am also running an automated pipeline with CarveMe and facing problems similar to those noted in #726. Thanks for any guidance.

Midnighter commented 2 years ago

Hi @zoey-rw,

That's an excellent question! I'm afraid that I'm not aware of such a resource. The scope for the MEMOTE project was already pretty big so we decided to limit ourselves to detecting problems and not how to solve them. I agree that it would be an extremely helpful resource and it'd be lovely to be able to link to solutions in the test descriptions.

If you're up for it, you can start a repository for collecting such recipes. I don't have the time or energy for that since I work in a different context now. I will try to answer your questions here, though. Such a cookbook of MEMOTE recipes could turn into a useful community resource that others contribute to as well. In the end, it should be publishable in my opinion but I'm not a journal editor, of course. If you're serious about it, we can even host it under the opencobra organization.

zoey-rw commented 2 years ago

Thank you for the feedback! I've started a repository here: zoey-rw/metabolic_model_curation

Right now, it just has placeholders for tasks, and links to relevant scripts by @ChristianLieven and @SysBioChalmers. I'll be working with a few students this summer to flesh out this repository, so I welcome any help with identifying priority tasks and pointing to example code!

Currently, I am working on balancing reactions, but Memote identifies mass-unbalanced reactions that are already listed as "balanced" on MetaNetX, such as this reaction. I'm unsure whether MetaNetX evaluates both mass and charge balance, and I would also appreciate suggestions on how to address reactions that can't be neatly fixed by adding protons.
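To make "fixable by adding protons" concrete, here is a minimal sketch in plain Python. It is not Memote or MetaNetX code: the helper names (`parse_formula`, `imbalance`, `protons_to_add`) are hypothetical, and the formulas and charges are BiGG-style values used only for illustration. In COBRApy, `Reaction.check_mass_balance()` reports a similar dictionary of imbalances for a real model.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Parse a simple formula such as 'C6H12O6' into element counts (no brackets)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def imbalance(stoichiometry, metabolites):
    """Sum elements and charge over a reaction; an empty dict means balanced.

    stoichiometry: {metabolite_id: coefficient}, negative for substrates.
    metabolites: {metabolite_id: (formula, charge)}.
    """
    totals = Counter()
    for met_id, coefficient in stoichiometry.items():
        formula, charge = metabolites[met_id]
        for element, count in parse_formula(formula).items():
            totals[element] += coefficient * count
        totals["charge"] += coefficient * charge
    return {key: value for key, value in totals.items() if value != 0}


def protons_to_add(imbalance_by_key):
    """Number of H+ to add to the products (negative: add to the substrates).

    Returns None when protons alone cannot balance the reaction.
    """
    if not imbalance_by_key:
        return 0
    if set(imbalance_by_key) != {"H", "charge"}:
        return None
    if imbalance_by_key["H"] != imbalance_by_key["charge"]:
        return None
    return -imbalance_by_key["H"]


# Illustrative formulas and charges (assumed values, not taken from this thread).
metabolites = {
    "atp": ("C10H12N5O13P3", -4),
    "adp": ("C10H12N5O10P2", -3),
    "pi": ("HO4P", -2),
    "h2o": ("H2O", 0),
}

# ATP hydrolysis written without the proton: only H and charge are off by one.
missing_proton = imbalance({"atp": -1, "h2o": -1, "adp": 1, "pi": 1}, metabolites)
# The same reaction without water: oxygen is off too, so protons cannot fix it.
missing_water = imbalance({"atp": -1, "adp": 1, "pi": 1}, metabolites)

print(missing_proton, protons_to_add(missing_proton))
print(missing_water, protons_to_add(missing_water))
```

Reactions for which `protons_to_add` returns `None` are the ones that need a manual look at the formulas or the stoichiometry rather than an automated proton fix.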

famosab commented 2 years ago

Hi @zoey-rw :)

I also got into automated charge and mass correction. I found that the most tedious part is comparing entries across different databases and then deciding which entry is the most correct. Sometimes I found several formulae for a single metabolite, and thus several possible charges. As far as I know, there is no community guideline that tells you which version of a metabolite to use in your model. That leads to inconsistencies, for example in the BiGG database.

I guess the best solution would be to give stricter rules / guidelines concerning the use of formulae for metabolites. At the same time we would need some kind of place to collect ambiguous metabolites.

Midnighter commented 2 years ago

The exact protonation of reactants and products can be quite hard to decide. I don't know your background, so excuse me if I'm telling you obvious things. In the same compartment at a given pH, the same metabolite will likely exist in different protonation states with certain probabilities. A tool like equilibrator can give you estimates for those, but we do not necessarily know for every reaction which form is the preferred one.
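As a rough illustration of those probabilities, the sketch below applies the textbook acid dissociation equilibria to a polyprotic acid. This is not how equilibrator computes its estimates; the function name `microspecies_fractions` is hypothetical, the pKa values for phosphate are approximate textbook numbers, and effects such as ionic strength and temperature are ignored.

```python
def microspecies_fractions(ph, pkas):
    """Fractions of the species that have lost 0..n protons at the given pH.

    Each deprotonation step scales the relative abundance by 10 ** (pH - pKa).
    """
    weights = [1.0]  # fully protonated species as the reference
    for pka in sorted(pkas):
        weights.append(weights[-1] * 10 ** (ph - pka))
    total = sum(weights)
    return [weight / total for weight in weights]


# Phosphoric acid with approximate pKa values (assumed for illustration).
# Order of the result: H3PO4, H2PO4-, HPO4 2-, PO4 3-.
fractions = microspecies_fractions(7.2, [2.15, 7.2, 12.35])
for name, fraction in zip(["H3PO4", "H2PO4-", "HPO4 2-", "PO4 3-"], fractions):
    print(f"{name}: {fraction:.4f}")
```

At a pH equal to the second pKa, the two middle species are present in nearly equal amounts, which is why picking a single formula and charge for phosphate in a model is a judgment call.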

This is also why in the latest SBML version there is a move towards allowing non-integer formulae and charges to represent aggregate states approximately.

famosab commented 2 years ago

Thank you for the input. That is certainly the case. I am just wondering how this can be made a bit more transparent when looking at different models. For now, I feel like most models choose one protonation state, but this is rarely reported (often the state is chosen so that stoichiometric consistency is reached, which also makes sense).

I am unsure how allowing non-integer formulae will fix this, it is a good first step though.

Midnighter commented 2 years ago

> I am unsure how allowing non-integer formulae will fix this, it is a good first step though.

I believe the idea is that formulae and charges are weighted averages of all microspecies according to their probability and that both sides of a reaction should be approximately equal. If that actually works in practice, I cannot say.
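A minimal sketch of that weighted-average idea, assuming the microspecies fractions are already known. The function name `average_species` and the 50/50 split for phosphate are hypothetical and only for illustration; this is not an implementation of the SBML specification.

```python
def average_species(microspecies):
    """Weighted-average formula and charge over microspecies.

    microspecies: list of (fraction, element_counts, charge) tuples.
    Fractions are normalized, so they do not need to sum to exactly 1.
    """
    total = sum(fraction for fraction, _, _ in microspecies)
    formula = {}
    charge = 0.0
    for fraction, elements, species_charge in microspecies:
        weight = fraction / total
        for element, count in elements.items():
            formula[element] = formula.get(element, 0.0) + weight * count
        charge += weight * species_charge
    return formula, charge


# Phosphate as an assumed 50/50 mix of H2PO4- and HPO4 2-.
phosphate = [
    (0.5, {"H": 2, "O": 4, "P": 1}, -1),
    (0.5, {"H": 1, "O": 4, "P": 1}, -2),
]
formula, charge = average_species(phosphate)
print(formula, charge)
```

With averaged values like these on both sides of a reaction, a balance check would compare element and charge totals within a small tolerance rather than requiring exact equality.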

Midnighter commented 9 months ago

Hi @zoey-rw,

How is it going with your project by the way? I haven't looked at MEMOTE in a long time myself.