Why does MEMOTE give different numbers of energy generating cycle reactions?

hgscott commented 1 year ago

The first time I generated a MEMOTE report with the erroneous energy generating cycle tests not all being skipped (but my catch for if a metabolite was present or not wasn't working, so nothing skipped, just errored), I got that there were >150 reactions involved in ATP generating cycles.

Then, when I fixed that if statement bug (so metabolites skipped rather than erroring if they aren't present), I got that there were 84 reactions involved in ATP generating cycles: And I continued to get that when I ran my GitHub action several times.

Then I merged by ms2-model branch into main, and got >170 reactions:

And when I try to run MEMOTE locally (using the pip installable MEMOTE- not my branch, so I'm just using running the function once for ATP, I get a lot fewer reactions):

hgscott commented 1 year ago

Just as a sanity check, I made sure the model files were the same on the main branch and on the cycles branch, which was giving two different numbers in the MEMOTE report.

When I checked the diff, there was no difference in the model file:

hgscott commented 1 year ago

To be able to test this more easily (and not have to run GitHub actions for everything) I made a new conda environment called my-memote that runs MEMOTE (specifically and editable install from my local copy).

hgscott commented 1 year ago

I tried just using the consistency.detect_energy_generating_cycles function directly with this code:

import cobra
import memote.support.consistency as consistency

# Load the model
model = cobra.io.read_sbml_model("model.xml")

met = "MNXM3"  # ATP

print("Finding ATP generating cycles")
print("----------------------------")
# For a specfied number of times
for i in range(0, 10):
    # Make a copy of the model
    model_copy = model.copy()
    # Run just the function for finding ATP generating cycles
    rxns = consistency.detect_energy_generating_cycles(model_copy, met)
    # Print the results
    print(rxns)

But each run gave the same results:

hgscott commented 1 year ago

Then I ran the whole snapshot generation command 10 times with this code:

import subprocess

# Loop the number of times you want to run memote
for i in range(0, 10):
    # Define the output file name
    output_file = "report_" + str(i) + ".html"

    # Define the command as a list of strings
    command = [
        "memote",
        "report",
        "snapshot",
        "--filename",
        output_file,
        "model.xml",
    ]

    # Run the command using subprocess
    process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

    # Wait for the process to finish
    stdout, stderr = process.communicate()

And got a whole range of values (all much higher than 3):

152
198
160
160
164
160
160
84
164
144

hgscott commented 1 year ago

Daniel say to:

Do some debugging/print statements, to figure out how exactly the HTML report is being generated. Is the Erroneous energy generating cycles section really using the function I think it is?

hgscott commented 1 year ago

For easier debugging I had to add a few things to various__init__.py files to be able to actually run the snapshot function from python:

Import the cli module as a part of MEMOTE:

And import reporting as a a part of the cli module (has to go after the CONTEXT_SETTINGS) definition to avoid circular imports):

Then I could generate the snapshot report HTML with:

import memote

memote.suite.cli.reports.snapshot(["../GEM-repos/mit1002-model/model.xml"])

When I put debug breakpoints I verified that the snapshot function was doing what I thought it was, but that the majority of reactions found in the detect_energy_generating_cycles function had a tiny flux (10^-48 tiny).

When I increased the flux > 0 to flux > 0.001, I got the same three reactions I got when I ran just the detect_energy_generating_cycles function locally.

Not sure how that gave different results when running the snapshot function versus running just that one function, but I think that may just be down in the actual CS weeds of how computers handle large numbers.

hgscott commented 1 year ago

I changed the code to use the TOLERANCE_THRESHOLD variable which was already set and being used in other consistency checks:

C-CoMP-STC / GEM-mit1002

Why does MEMOTE give different numbers of energy generating cycle reactions? #41