Issue: No Rendered Output When Following MICOM Documentation for Constraint-Based Modeling

wtscott31 commented 4 months ago

Problem description

Hi

I have been following the published documentation on the MICOM webpage to create a proper workflow and model manifest for performing analysis and obtaining results. However, I am getting stuck with no rendered output. Below is the script I am using to perform constraint-based modeling with one sample at first (test case). If successful, I plan to extend this to three samples for a microbial community of twelve species, each represented by a genome-scale model and its respective relative abundance. I have provided placeholders for the taxonomy for now. I also included a medium file. However, I do not obtain any results as MICOM simply stalls.

Code Sample

import os
import pandas as pd
from micom.workflows import grow
from micom.workflows.build import build
from micom.media import minimal_medium
from micom.util import join_models, load_pickle, _read_model
from micom import Community
from cobra.io import read_sbml_model, write_sbml_model

# Define the base directory for model files and output
base_dir = "~/Gap-filled_MM2014_GEMs"

# Define the model paths relative to the base directory
model_paths = [
    "Dehalobacter_restrictus_new_3.xml",
    "Dehalobacter_sp004343605_new_3.xml",
    "Acetobacterium_sp003260995.xml",
    "Cloacimonadaceae_g__Cloacimonas.xml",
    "g__Methanothrix_s.xml",
    "Methanobacterium_C_congolense.xml",
    "o__Bacteroidales_f__TTA-H9_g__TTA-H9_s__TTA-H9_sp002418865.xml",
    "o__Bacteroidales_f__VadinHA17_g__LD21_s__.xml",
    "o__Bacteroidales_f__VadinHA17_g__SR-FBR_E99_s__.xml",
    "Rectinema_sp002441395.xml",
    "Rectinema_sp015657395.xml",
    "Rectinema_subterraneum.xml"
]

# Define the abundances
abundances = {
    "Dehalobacter_restrictus_new_3": 0.3673,
    "Dehalobacter_sp004343605_new_3": 0.0034,
    "Acetobacterium_sp003260995": 0.0266,
    "Cloacimonadaceae_g__Cloacimonas": 0.0258,
    "g__Methanothrix_s": 0.0244,
    "Methanobacterium_C_congolense": 0.0389,
    "o__Bacteroidales_f__TTA_H9_g__TTA_H9_s__TTA_H9_sp002418865": 0.0227,
    "o__Bacteroidales_f__VadinHA17_g__LD21_s__": 0.0021,
    "o__Bacteroidales_f__VadinHA17_g__SR-FBR_E99_s__": 0.0008,
    "Rectinema_sp002441395": 0.0063,
    "Rectinema_sp015657395": 0.0744,
    "Rectinema_subterraneum": 0.081
}

# Create placeholder taxonomic data
taxonomic_data = {
    "kingdom": ["Bacteria"] * len(model_paths),
    "phylum": ["Placeholder_Phylum"] * len(model_paths),
    "class": ["Placeholder_Class"] * len(model_paths),
    "order": ["Placeholder_Order"] * len(model_paths),
    "family": ["Placeholder_Family"] * len(model_paths),
    "genus": ["Placeholder_Genus"] * len(model_paths),
    "species": ["Placeholder_Species"] * len(model_paths),
}

# Create the full paths for the models
full_model_paths = [os.path.join(base_dir, path) for path in model_paths]

# Create a DataFrame with model IDs, file paths, abundances, and taxonomic information
taxonomy_df = pd.DataFrame({
    "id": [path.split('/')[-1].replace('.xml', '') for path in model_paths],
    "file": full_model_paths,
    "abundance": [abundances[path.split('/')[-1].replace('.xml', '')] for path in model_paths],
    "sample_id": [f"sample_{i}" for i in range(len(model_paths))],
    **taxonomic_data
})

# Ensure the output directory exists
output_directory = os.path.join(base_dir, "MICOM")
os.makedirs(output_directory, exist_ok=True)

# Debugging: Print the current working directory and full paths to verify correctness
print("Current working directory:", os.getcwd())
print("Full model paths:", full_model_paths)
print("Output directory:", output_directory)

# Check if all model files exist
for path in full_model_paths:
    if not os.path.isfile(path):
        print(f"File not found: {path}")
    else:
        print(f"File exists: {path}")

# Build the community models using the build function
manifest = build(taxonomy_df, model_db=None, out_folder=output_directory, cutoff=0.0001, threads=1, solver=None)

# Read the medium file into a pandas DataFrame
medium_file_path = os.path.join(output_directory, "edlab_mineral_medium_BiGG_IDs.csv")
medium_df = pd.read_csv(medium_file_path)

# Ensure the medium DataFrame has the correct columns
if 'reaction' not in medium_df.columns or 'flux' not in medium_df.columns:
    raise ValueError("Medium file must contain 'reaction' and 'flux' columns")

# Add sample_id column to medium DataFrame
medium_df['sample_id'] = taxonomy_df['sample_id'][0]

# Use the grow function with the medium DataFrame
res = grow(manifest, model_folder=output_directory, medium=medium_df, tradeoff=0.5, threads=1)

# Print results
print(res)

Expected Behavior

The grow function should complete and return results, but it currently stalls with no rendered output.

Context

Additional Context

Operating System: macOS Python Version: 3.9 MICOM Version: 0.35.0

Any guidance or suggestions to resolve this issue would be greatly appreciated.

Thank you!

cdiener commented 3 months ago

Hmm, definitely odd. A bit hard to diagnose without having all the files at hand. Can you specify what you mean by "stall"? How long is it running for? The solvers all have iteration limits so they have to finish at some point. Does that also happen with the included example data from the docs?

wtscott31 commented 3 months ago

Hi @cdiener

By "stall," I mean that the analysis runs for an extended period without reaching a solution, eventually terminating due to solver iteration limits or not finding a solution.

This issue does not occur when using the example data provided in the MICOM documentation.

I have sent you an email to reply to your question more thoroughly with the files. Hopefully, you can help me out.

Thanks,

wtscott31

cdiener commented 3 months ago

Sorry for the delay, I was on vacation. Just letting you now I got your files and am on it. Since your models seem to be fresh reconstructions, do those grow by default? Meaning if you just simulate the loaded models with open bounds do you get a decent growth rate?

wtscott31 commented 3 months ago

Hi @cdiener, no problem. Yes all models were generated using Carveme and are able to grow and when constrained under the medium conditions, the growth rates are roughly between 0.05 and 0.9.

cdiener commented 3 months ago

I sent you an E-mail with an updated notebook. The main issue were the IDs in your community medium which needed to end in _m. Fixing this everything grows when I tested it with the hybrid solver and cplex.

micom-dev / micom