Closed jonstrutz11 closed 5 years ago
Can COBRA load SBML files? If so it would be great to get a translation layer developed.
See the discussion at https://github.com/opencobra/COBRA.jl/issues/13 . Specifically, https://stochasticreveller.wordpress.com/2016/08/02/sbml-and-the-julia-programming-language/ would make a good package. Then it would be up to us to support using it. That would make for a good GSoC project, but I am not sure if I can attract a student to it.
Getting SBML support would be great, but it poses two big challenges:
That seems like a task for a full-time programmer. I imagine programs like Bionetgen and Copasi that fully support SBML are funded by large NIH grants that support hiring such programmers.
@jonstrutz11 Is your stoichiometric matrix just for fundamental reactions (i.e. no Hill expressions or such)? If so it might be possible to write a converter to import it into a reaction network.
One thing we should explore is the structure of the ODE and SDE functions we currently generate. Right now I think it is just a single monolithic function, but I wonder if instead calling a mass action reaction evaluation function as we do for jumps might offer better performance for large reaction networks. @ChrisRackauckas any thoughts?
Building a single function is always going to be the most performant. The mass action reactions win over the other normal jumps because they get rid of a function call that cannot easily be inlined.
But it wouldn't be difficult to have a non-macro interface for stoichometry matrices that generates the functions for the given problems just via some linear algebra. The only difficult part would be to generate a system of mass action jumps from it, but really that's not too bad.
What about if one has several thousand species and tens of thousands of reactions? Could the ODE rhs function size lead to cache issues? You're probably right that it wouldn't make a difference, but having never coded up a single function that long I don't really know what effect it could have.
Even if one puts all the rate evaluations directly in the generated ODE rhs function, could there be an advantage to not evaluating rate terms multiple times in different ODEs, as currently done, but instead looping through the reactions, evaluating their rate, and adding that contribution into the corresponding du
entries?
Yes, so I would be loading an SBML model and converting each reaction into a series of elementary reactions (using my own code). At that point, I could use a tool like cobrapy to create a stoichiometric matrix describing all of those elementary reactions, which I need to convert to an ODEProblem (assuming mass action kinetics).
It seems like COBRA.jl cannot take in SBML files, but I know that cobrapy can be used to load SBML files and can also generate a stoichiometric matrix.
We might need more than just rate constants and stoichiometric matrix; the latter is not sufficient to determine the rate law, right? For example if we have A+B --> A + C then the my understanding is the corresponding column of the "stoichiometry matrix" would be [0,-1,1]
, but the rate law needs A*B. So we probably need as input separate matrices with the reactant stoichiometry [1,1,0]
, the product stoichiometry [1,0,1]
, and the rate constants. Given this info we could formulate everything.
Alternatively, you can just generate a string from your reactions that encodes the DSL. I've used this to take text files with tens of thousands of reactions in a minimal format, translate them to a string that is the corresponding DiffEqBiological DSL model, and then evaluate the string to create a reaction_network
. The parsing and creation of the base reaction_network
is pretty fast (at most a few seconds). What is rate limiting is assembling the ODE/SDE/jump functions (hence why we created the new min_reaction_network
macro that lets one assemble those incrementally as they are needed).
Yeah that is right - I didn't think about reactions with the same metabolite on each side. So actually, using an S-matrix wouldn't work as well as I thought. Thanks for catching that!
I originally thought to do the alternative option as you suggest, but thought it might be easier to go from the S-matrix directly, if possible. However, after this discussion, it now seems that just generating a string that encodes the DSL might be the easier way forward.
If you look at my ParseRxNetwork package you can see how to generate a reaction_network
from a simple text file of the form:
S1 -> S3 , 1.82e-4
S3 -> S2 , 9.1e-5
S2 -> S3 , 1.5e-3
S3 -> S1 , 7.5e-4
S3 + S6 -> S7 , 1.53e-8
S2 + S6 -> S8 , 3.06e-8
S4 + S6 -> S9 , 5e-5
...
Great! Thank you!
I'm closing this as ReactionNetworkImporters.jl now supports constructing a reaction_network
from substrate and product stoichiometric matrcies (dense or sparse).
DiffEqBiological.jl makes it easy to go from a reaction network (defined using the DSL) to an ODEProblem.
However, large biological models (100s of reactions +) are often encoded in a more convenient format such as SBML. That said, it would be great if we could input a stoichiometric matrix directly instead of listing the reactions in the DSL, as an alternative input. We would also need to input a vector of rates as well.
This would allow those with large biological models to go from a stoichiometric matrix (which can be generated using COBRA tools) directly to an ODEProblem.