COBREXA / COBREXA.jl

COnstraint Based Reconstruction and EXascale Analysis (in Julia)
https://cobrexa.github.io/COBREXA.jl/stable/
Apache License 2.0
9 stars 0 forks source link

Gap filling produces nonempty result when filling a model that is already feasible #78

Open oxinabox opened 1 week ago

oxinabox commented 1 week ago

I was showing someone that if you give gapfilling a model that is feasible already, even if you change the objective to be something that can be improved by knocking in some genes, it still won't say to insert them. To do this i found 10 genes i could KO that would decrease CO2 production, and then set objective to maximise CO2 production (which is kinda nonsense to do but anyway) and ran gap-filling to my surprise it actually decided to fill in something. But it wasn't even something I had knocked out (though it was something that had fixed bounds, so knocking in an extra copy accelerates it)

In contrast, when I run this exact same investigation in Cobrapy (I can provide code for that if you want) this does not occur and as expected it return an empty set.

There may be some interaction with having basically identical model and universal model, as I did not see this when i used metanetx 's largest universal model (which probably has this reaction).

Minimal code example to reproduce the problem

using my fav yeast-9: https://raw.githubusercontent.com/SysBioChalmers/yeast-GEM/refs/heads/main/model/yeast-GEM.xml

using COBREXA
using SBMLFBCModels
using HiGHS
using AbstractFBCModels:AbstractFBCModels, reaction_gene_products_available, reactions

function ko_model!(model, knockout_genes)
    for reaction_id in reactions(model)
        can_run = reaction_gene_products_available(model, reaction_id, !in(knockout_genes))
        #@show can_run
        if !something(can_run, true)
            reaction = model.reactions[reaction_id]
            @info "KOing" reaction.name
            reaction.lower_bound = reaction.upper_bound = 0.0
        end
    end
    return model
end

function set_product_objective!(model)
    for reaction in values(model.reactions)
        uris = get(reaction.annotations, "RESOURCE_URI", String[]) 
        reaction.objective_coefficient = "https://identifiers.org/bigg.reaction/EX_co2_e" ∈ uris
    end
    return model
end

data_dir = joinpath(dirname(@__DIR__), "data")
yeast_file = joinpath(data_dir, "yeast-GEM.xml")
const original_model = load_model(yeast_file)
const model = convert(AbstractFBCModels.CanonicalModel.Model, original_model)  # so we can mutate it directly

ko_genes = ["YLR028C", "YBR111C", "YOR125C", "YNL038W", "YJL134W", "YGR202C", "YGR055W", "YER152C", "YJR040W", "YGR243W"]
ko_model!(model, ko_genes)
set_product_objective!(model)

x = gap_filling_analysis(
    model,
    original_model,
    0.05,
    optimizer = HiGHS.Optimizer,
)

Expected result

It should not try and fill anything, because the model is already feasible

ConstraintTrees.Tree{Float64} with 6 elements:
  :fill_flags            => ConstraintTrees.Tree{Float64}(#= 4130 elements =#)
  :n_filled              => 0.0
  :....

Actual behavior

It tries to fill 1 reaction

ConstraintTrees.Tree{Float64} with 6 elements:
  :fill_flags            => ConstraintTrees.Tree{Float64}(#= 4130 elements =#)
  :n_filled              => 1.0
  :....

you can see which with:

@show any(!=(0), x.fill_flags)  # this should be false
cuni = convert(AbstractFBCModels.CanonicalModel.Model, original_model)
r = only([cuni.reactions[string(k)] for (k,v) in x.fill_flags if v!=0])

What was the unexpected thing that happened instead?

The extra surprising thing is it didn't even put in one i deleted. It put in one that was already there, and which doesn't have a gene associated

AbstractFBCModels.CanonicalModel.Reaction(
  name = "non-growth associated maintenance reaction",
  lower_bound = 0.7,
  upper_bound = 0.7,
  stoichiometry = Dict("M_s_0803" => -1.0, "M_s_0434" => -1.0, "M_s_0394" => 1.0, "M_s_0794" => 1.0, "M_s_1322" => 1.0),
  objective_coefficient = 0.0,
  gene_association_dnf = nothing,
  annotations = Dict("sbo" => ["SBO:0000630"], "RESOURCE_URI" => ["https://identifiers.org/bigg.reaction/ATPM"]),
  notes = Dict("" => ["<notes>\n  <body xmlns=\"http://www.w3.org/1999/xhtml\">\n    <p>Confidence Level: 0</p>\n  </body>\n</notes>"]),
)
exaexa commented 1 week ago

Hi! Thanks for the detailed reproducer, this indeed looks weird.

If I get it right, the reaction that gets gapfilled is the ATP maintenance or something similar? If that is so, it gets gapfilled literally because it is constrainted to strictly nonzero flux, and if you wouldn't include it in the fill, it would have a zero value in the model, making the model infeasible. This is the outcome of how the problem is encoded into a MILP, and of the fact that original bounds in the universal reactions are retained.

If the above is the case: maintenance pseudoreactions should not be included in universal reactions (they are more like extra constraints encoded as reactions), and removing them will solve the issue.

I recall we have this removal of ATPM in at least one of the examples. If it is not in the gapfilling one, we should certainly add it there. Maybe together with a mild warning about this possible issue.

If the reason is different then please do elaborate, because that would be very weird.

exaexa commented 1 week ago

PS re why cobrapy works: they might either auto-ignore these reactions, or use a different encoding into MILP that also selectively ignores the constraints, or fix/remove the bounds. Will have a look.

oxinabox commented 1 week ago

This is indeed ATPM. It is mentioned removing it in example but no details as to why. Maybe we need to add those details to the docs or have a function or a fag to remove this kind automatically?

exaexa commented 3 days ago

automated cleaning is kinda desirable so I'll continue with this within a greater scope of #81