Implement minimal MEMOTE scoring

mihai-sysbio commented 3 years ago

The pipeline should test the model with MEMOTE.

famosab commented 1 year ago

I started working on this based on the branch feat/cobra-load. I wrote a small function, get_consistency, which uses the consistency module from MEMOTE. It is just an example of how to access smaller parts of the MEMOTE suite, which might be a better approach for what we are trying to do. I am unsure how important the annotation status of a model is; the consistency, or the presence of energy-generating cycles, might be more interesting.

mihai-sysbio commented 1 year ago

Nice job @famosab!

I am unsure how important the annotation status of a model is

The annotation becomes highly relevant as soon as one wants to fetch data from a database based on the annotation, or when integrating omics datasets.

Going forward, I think using the Docker image of opencobra/memote would be a better way to handle the MEMOTE dependency. It also covers everything else there is in requirements.txt.

I'm now following this suggestion to use the MEMOTE test results in JSON format, to see if that is compatible with the current setup.

famosab commented 1 year ago

Using the Docker image seems to be useful. As far as I know, it is possible to integrate the build within the GitHub Action, as described here.

It still seems to me that using the JSON output might help in sorting out which results we want to look at (which is nice), but MEMOTE would still be required to run all tests, especially if we use it with Docker. That means it will take some time per model, but that should be fine since we plan to run it as a cron job anyway.

Regarding the annotations: we could either check for all of them (which is quite costly) or we define the subset of databases which seem to be the most up-to-date. For example the BiGG database has not been updated since 2019 while the ChEBI database was updated on Feb 2 this year.
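
A minimal sketch of the "subset of databases" idea, assuming each model entity's annotation is available as a plain mapping from a database namespace to an identifier. The function name annotation_coverage, the PREFERRED_DATABASES tuple and the toy data are hypothetical and only illustrate the approach; which databases belong in the subset is still an open choice.

```python
from typing import Iterable, Mapping

# Hypothetical subset of annotation namespaces to check (an assumption,
# not a decision taken in this thread).
PREFERRED_DATABASES = ("chebi", "kegg.compound")


def annotation_coverage(
    annotations: Iterable[Mapping[str, object]],
    databases: Iterable[str] = PREFERRED_DATABASES,
) -> dict:
    """Return, per database, the fraction of entities annotated with it."""
    annotations = list(annotations)
    databases = list(databases)
    if not annotations:
        # No entities at all: report zero coverage instead of dividing by zero.
        return {db: 0.0 for db in databases}
    coverage = {}
    for db in databases:
        annotated = sum(1 for annotation in annotations if annotation.get(db))
        coverage[db] = annotated / len(annotations)
    return coverage


# Toy annotation dictionaries, one per metabolite (illustrative only).
metabolite_annotations = [
    {"chebi": "CHEBI:17234", "kegg.compound": "C00031"},
    {"chebi": "CHEBI:15377"},
    {"bigg.metabolite": "atp"},
    {},
]
coverage = annotation_coverage(metabolite_annotations)
print(coverage)
```

Checking only a fixed subset keeps the cost proportional to the number of chosen databases rather than to every namespace that might appear in a model.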

mihai-sysbio commented 1 year ago

Using the Docker image seems to be useful. As far as I know, it is possible to integrate the build within the GitHub Action, as described here.

This has been achieved through #9.

It still seems to me that using the JSON output might help in sorting out which results we want to look at (which is nice), but MEMOTE would still be required to run all tests, especially if we use it with Docker. That means it will take some time per model, but that should be fine since we plan to run it as a cron job anyway.

Regarding the annotations: we could either check for all of them (which is quite costly) or we define the subset of databases which seem to be the most up-to-date. For example the BiGG database has not been updated since 2019 while the ChEBI database was updated on Feb 2 this year.

The current implementation combines the consistency and annotation results from MEMOTE into a single score:

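import json
import memote.suite.api
# `model` is an already loaded cobra model; only the basic, annotation and consistency tests are run.
_, results = memote.suite.api.test_model(model, None, True, None, {"basic", "annotation", "consistency"})
# The third argument (False) requests the report as JSON rather than HTML.
processed_results = memote.suite.api.snapshot_report(results, None, False)
results_json = json.loads(processed_results)
memote_score = results_json['score']['total_score']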
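
Since the validation is meant to run unattended as a cron job, it may help to read the score defensively. The following is a minimal sketch that only assumes the JSON layout used above (a top-level "score" object with a "total_score" field). The helper name extract_total_score is hypothetical, and the example string is toy data, not real MEMOTE output.

```python
import json
from typing import Optional


def extract_total_score(report_json: str) -> Optional[float]:
    """Return score.total_score from a JSON report string, or None if absent."""
    try:
        report = json.loads(report_json)
    except json.JSONDecodeError:
        # Not valid JSON at all, e.g. an empty or truncated report.
        return None
    score = report.get("score") if isinstance(report, dict) else None
    if not isinstance(score, dict):
        return None
    total = score.get("total_score")
    # Accept plain numbers only; booleans are excluded explicitly.
    if isinstance(total, (int, float)) and not isinstance(total, bool):
        return float(total)
    return None


# Toy report string for illustration only.
example_report = json.dumps({"score": {"total_score": 0.42}})
print(extract_total_score(example_report))
```

Returning None instead of raising lets a batch run record a missing score for one model and continue with the next.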

mihai-sysbio commented 1 year ago

The MEMOTE tests used in validation have changed, and their revision is tracked in the new issue #12, so that the current issue can be completed.

MetabolicAtlas / standard-GEM-validation

Implement minimal MEMOTE scoring #3