Open cdiener opened 2 weeks ago
Thanks again for posting the issue.
I tried to find the underlying problem but am not entirely there yet. What I found so far:
Unrelated to that, I noticed that we have an option "--min.obj.val" in the gapseq fill module to set a minimum growth rate that gap-filling should achieve; yet, it was not enforced, and models with a growth rate less than the specified value are commonly returned. I fixed this with commit fb7fbcb. The example with MGYG000000522 now works, but the issue from the previous point remains.
Do you mean the initial MIP is feasible but then the model with non-zero indicators is infeasible? We had similar issues with the gapfilling in cobrapy, and what helped was lowering the integrality tolerance, because the default will otherwise sometimes allow flux through reactions whose indicator is zero. That happened to us with GLPK and CPLEX as well. Gurobi and HiGHS seem to be a bit less sensitive there.
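To illustrate why the integrality tolerance matters here, a minimal sketch assuming a big-M indicator formulation `|v| <= M * y` (the function name, `M = 1000`, and the tolerance values are illustrative assumptions, not settings taken from cobrapy or any solver): if the solver accepts `y = tol` as "zero", the constraint still permits a flux of up to `M * tol`.

```python
def max_leak_flux(big_m: float, int_tol: float) -> float:
    """Largest |v| permitted by |v| <= big_m * y when y counts as zero within int_tol."""
    return big_m * int_tol


# Hypothetical values: a flux bound of 1000 and two integrality tolerances.
for tol in (1e-5, 1e-9):
    leak = max_leak_flux(1000.0, tol)
    print(f"integrality tolerance {tol:g}: flux up to {leak:g} through an 'off' reaction")
```

With the looser tolerance, the leak can be large enough to carry a small growth flux, which would be consistent with the behaviour described above.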
Do you mean the initial MIP is feasible but then the model with non-zero indicators is infeasible?
Yes, just that in gapseq the initial problem is formulated as LP not as MIP (Eq. 1 in the gapseq publication).
We had similar issues with the gapfilling in cobrapy, and what helped was lowering the integrality tolerance, because the default will otherwise sometimes allow flux through reactions whose indicator is zero. That happened to us with GLPK and CPLEX as well. Gurobi and HiGHS seem to be a bit less sensitive there.
That is good to know; thank you! I will try to lower tolerance in glpk and cplex.
Oh sorry, never mind then. Integrality tolerance has no effect on LPs, obviously.
But it was a good hint – there is a simplex parameter in GLPK that controls the tolerance for bound violations. I guess CPLEX has this, too. A first quick test with GLPK indicates that this value might need to be reduced for the initial LP. I'll do some further tests.
The gapfilling algorithm was updated on the cobrar branch. The key was indeed to reduce the bound violation tolerance in cases where an optimal solution returned by the initial pFBA was not feasible and then re-run the optimization. The CPLEX documentation also recommends this procedure:
You can also lower this tolerance after finding an optimal solution if there is any doubt that the solution is truly optimal.
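The re-run procedure described above could be sketched roughly as follows. This is not the gapseq implementation: `solve` is a hypothetical callable standing in for the LP solver, feasibility is assumed here to mean "no flux leaves its bounds by more than `accept`", and the tolerance schedule is illustrative.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

Bounds = Dict[str, Tuple[float, float]]
Fluxes = Dict[str, float]


def max_bound_violation(fluxes: Fluxes, bounds: Bounds) -> float:
    """Return the largest amount by which any flux leaves its [lb, ub] interval."""
    worst = 0.0
    for rxn, v in fluxes.items():
        lb, ub = bounds[rxn]
        worst = max(worst, lb - v, v - ub)
    return worst


def solve_with_tightening(
    solve: Callable[[float], Fluxes],
    bounds: Bounds,
    tolerances: Sequence[float] = (1e-7, 1e-8, 1e-9),
    accept: float = 1e-9,
) -> Optional[Tuple[float, Fluxes]]:
    """Re-run the solver with ever tighter bound tolerances until the result is acceptable."""
    for tol in tolerances:
        fluxes = solve(tol)
        if max_bound_violation(fluxes, bounds) <= accept:
            return tol, fluxes
    return None  # no tolerance in the schedule gave an acceptable solution


def toy_solver(tol: float) -> Fluxes:
    """Stand-in for a real LP solver: undershoots the lower bound of R1 by exactly tol."""
    return {"R1": -tol, "R2": 1.0}


bounds = {"R1": (0.0, 10.0), "R2": (0.0, 10.0)}
result = solve_with_tightening(toy_solver, bounds)
print(result)
```

In a real setting, `solve` would set the solver's bound (feasibility) tolerance before optimizing, and the acceptance check would be whatever feasibility test the gap-filling step relies on.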
"cobrar" needs to be updated to the current development version to work with the latest version of gapseq from the cobrar branch.
I will run a larger set of reconstructions with the new updates to see if something unexpected happens.
Okay, that makes sense. When we had similar problems in the CORDA port (a somewhat similar idea with a linearized cost optimization), another gotcha was the threshold for the absolute flux value (`|v| > eps` -> include). But it looks like you are already checking that by removing the reaction and ensuring the objective is maintained.
Just as an FYI: for me, the provided example still fails gapfilling silently (final growth rate of 0), even when updating to the latest master branch.
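A minimal sketch of that threshold gotcha, with made-up flux values and a hypothetical helper name: which reactions count as "used" depends on `eps`, so a reaction carrying a tiny but necessary flux can be dropped by a threshold that is too coarse.

```python
from typing import Dict, List


def active_reactions(fluxes: Dict[str, float], eps: float = 1e-6) -> List[str]:
    """Return the IDs of reactions whose absolute flux exceeds eps, sorted by ID."""
    return sorted(rxn for rxn, v in fluxes.items() if abs(v) > eps)


# Illustrative flux distribution; R2 carries a very small flux.
fluxes = {"R1": 0.5, "R2": -2e-7, "R3": -1.2, "R4": 0.0}

for eps in (1e-6, 1e-9):
    print(f"eps={eps:g}: {active_reactions(fluxes, eps)}")
```

Re-checking the objective after removing each candidate reaction, as mentioned above, guards against this kind of threshold-dependent omission.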
Yes, the master branch still relies on the sybil packages. Since the fix involved quite some changes in the gapfilling algorithm, I just made the changes to the "cobrar" branch of gapseq, which will be merged into the master branch hopefully soon.
Sorry, I meant in relation to:
Unrelated to that, I noticed that we have an option "--min.obj.val" in the gapseq fill module to set a minimum growth rate that gap-filling should achieve; yet, it was not enforced, and models with a growth rate less than the specified value are commonly returned. I fixed this with commit fb7fbcb. The example with MGYG000000522 now works, but the issue from the previous point remains.
Even with commit fb7f, MGYG000000522 did not work for me.
I will rebuild the image with the cobrar branch and test that one a bit more. Thanks for all the work on this!
Related to #228, opening a new bug since I can't reopen the old one and this is a new example.
We are still seeing failures in gapfilling for Archaea with version 1.3.1. I attached an example from MGnify here.
Environment
From here.
Commands
We use the gut medium bundled with gapseq. The script we run:
Which will result in:
MRP
See the input files here: uhgg_example.zip