SciML / Catalyst.jl

Chemical reaction network and systems biology interface for scientific machine learning (SciML). High performance, GPU-parallelized, and O(1) solvers in open source software.
https://docs.sciml.ai/Catalyst/stable/
Other
456 stars 75 forks source link

Supporting decimal stoichiometric coefficients #161

Closed jimmielin closed 2 years ago

jimmielin commented 4 years ago

Hi, I'm using DiffEqBiological.jl to generate ODEs for solving atmospheric chemistry reactions and some of the reactions in our chemical mechanisms are formulated with fractional stoichiometric coefficients, e.g.

{168} MVK + O3 = 0.064HO2 + 0.05RO2_R + 
       0.164OH + 0.05RCO_O2 + 0.475CO+
       0.1HCHO + 0.95MGLY + 0.351HCOOH :        ARR(7.51e-16,1520.0,0.0);

(This is in KPP format which I am writing a parser for; ARR specifies the rate constant using the Arrhenius equation)

Specifying these kinds of coefficients directly into addreaction! result in a type mismatch. Minimal example:

using DiffEqBiological
testrn = @empty_reaction_network
addspecies!(testrn, :O2)
addspecies!(testrn, :O)
addreaction!(testrn, 2.643e-10, :(0.5O2 --> O))
MethodError: no method matching add_reactants!(::Symbol, ::Float64, ::Array{DiffEqBiological.ReactantStruct,1})
Closest candidates are:
  add_reactants!(::Union{Float64, Int64, Expr, Symbol}, !Matched::Int64, ::Array{DiffEqBiological.ReactantStruct,1}) at /.../DiffEqBiological/myMnx/src/reaction_network.jl:341

I have tried looking into the implementation of add_reactants!, but it seems like the code intrinsically relies on coefficients being Int: apart from the ReactantStruct specifying Int64, computing the reaction rates for the ODE using mass action law requires factorial(coef) (but I may be misunderstanding the code)

It would be good to have support for fractional coefficients as much of our data is in this format, however it would be possible to get rid of this in the pre-processing stage (multiplying eq. *1000 and scaling the reaction rate constants) although this may introduce significant error to the reaction rate constants if we did k^1000. Hope to hear suggestions as to how this would be best approached. Thank you.

isaacsas commented 4 years ago

Does the reactant stoichiometry always have integer coefficients?

edit: Nevermind, I see that you want to allow non-integer reactant stoichiometric coefficients too.

ChrisRackauckas commented 4 years ago

ODEs makes sense, but I am curious what this would mean in the context of jump problems: just have to require Float64 initial conditions and do discrete changes of specific floating point values? It seems odd but seems to make sense.

isaacsas commented 4 years ago

Yes, ODEs could be done, I guess jumps too -- In both cases I think we'd need to treat such reactions as non-mass action and could probably then make it work. It might be some work to update everything to go through in this case though.

TorkelE commented 4 years ago

I'd imagine that it would be quite easy to make a hacky solution enabling this for ODE/SDE, but which would crash stuff whenever one tries to run a jump problem. Making it work here would probably require quite some work, and thinking.

It might also be possible to allow floats as coefficients, but have some way to detect it. And if it is detected that Jump things are never created, and an error returned (explaining what is going on) whenever someone tries to do something Jump-y with such a network.

isaacsas commented 4 years ago

What's the rate law definition for non-integer coefficients though? For ODEs, do we replace factorials with Gamma functions? Or does this model assume that the stoichiometric coefficients only define the exponents and the rest is handled through the rate constant?

ChrisRackauckas commented 4 years ago

For ODEs, do we replace factorials with Gamma functions?

Good point, that's kind of a wild difference.

jimmielin commented 4 years ago

Thanks everyone for the discussion. I agree that if you have non-integer coefficients in the reactants it would wildly change the way rate laws are calculated; the reactions defined that way are very few and it is usually just one reactant or two, so it can be dealt with in the pre-processing stage (also definitions can indeed vary wildly with decimal coefficients; it is a decision best left to the user who knows the chemical mechanism).

Are there any barriers for implementing decimal coefficients in the products? From what I see in the code I haven't seen anything absolutely relying on prod.stoichiometry to be Int64, but I don't know the code well enough to be sure.

Supporting fractional products goes a long way in terms of my specific problem, and we can add some error-checking to disallow decimal coefficients passed to reactants.

isaacsas commented 4 years ago

I'm pretty sure we can get that supported. What needs to be done a bit carefully is making sure it doesn't mess up the mass action reactions we generate for jump models -- they really do need integers throughout.

Right now the easiest way to handle this is probably to flag reactions with decimal (product) coefficients as non-mass action, and route them through that code path in generating jumps, Jacobians and such. I'm happy to take a look at this, but I can't promise anything immediate -- the next few weeks are full of deadlines for me.

jimmielin commented 4 years ago

I've taken a shot at implementing this in https://github.com/jimmielin/DiffEqBiological.jl/commit/46a6b9ef57e9ebfeb2dc3c17f7634b0590cdae13

While it does the heavy lifting (see quick test below), there are a few things about the implementation that might need some further polish now that ReactantStruct has Float64 in it - I'd love to hear your input:

Additionally I've added checks in the constructor for ReactionStruct to check two things:

Quick testing code which shows products being cut by the specified decimal ratio change:

test_int = @empty_reaction_network
addspecies!(test_int, :O2)
addspecies!(test_int, :O)
addreaction!(test_int, 2.643e-10, :(O2 --> 2O))
addodes!(test_int)
prob_int = ODEProblem(test_int, [1.697e+16, 0], (0.0, 10.0))
sol_int = solve(prob_int, alg_hints=[:stiff])
plot(sol_int, layout=(2,1))

test_frac = @empty_reaction_network
addspecies!(test_frac, :O2)
addspecies!(test_frac, :O)
addreaction!(test_frac, 2.643e-10, :(O2 --> 0.5O))
addodes!(test_frac)
prob_frac = ODEProblem(test_frac, [1.697e+16, 0], (0.0, 10.0))
sol_frac = solve(prob_frac, alg_hints=[:stiff])
plot(sol_frac, layout=(2,1))
jimmielin commented 4 years ago

Updating this to get some thoughts because I am not quite sure, however I've ran into an equation of the form:

addreaction!(jlkpp_mechanism, k, :(COOH + OH --> 0.35HCHO + 0.35OH + 0.65C_O2))

It seems like it is formulated to avoid the ambiguities that happen when we use decimals for reactant coefficients. Right now in my patch it will be caught by the check on netstoich (as netstoich will return -0.65 for the OH coefficient) but it doesn't need to be; I assume that it will be OK to accept this equation as the rate is calculated by k[COOH][OH], in here 0.35OH is just a product and isn't factored into the rate calculations (which need integers). Am I missing something? (I am just using ODEs so I might be breaking something in the other models which I don't have experience with...)

Thanks!

isaacsas commented 4 years ago

Gillespie (jump) models may get messed up if we switch product stoichiometry to floats. We would need to go through all the code very carefully. Pure jump models require the populations to be integer valued, and so all operations we generate that change the population need to remain integer valued. While the jump models might still generate and actually run with floating point coefficients, there are issues for correctness and speed that may arise with floats.

My preference would be to make this change in a way that keeps everything integer valued as much as possible, perhaps by making ReactantStruct parametric on the stoichiometry or by having it use a union of floats and ints for the stoichiometry. Even with that though I think there is a fair amount of down-stream updates that may be needed to ensure ODEs, SDEs and jumps still all work correctly.

isaacsas commented 4 years ago

If you want to open a PR with your modifications we can take a look at what you have changed and see if there are clear places that still need modification. It would also run the tests for you so we can see if they are working ok with your changes.

korsbo commented 4 years ago

Just a thought. Stoichiometric constants should be rational numbers. If we only allowed Rationals and not Floats, would not the problem be much easier to solve?

korsbo commented 4 years ago

Then one could use lcm to easily find out how to scale everything to integer values.

isaacsas commented 4 years ago

That does make a lot of sense and would be a better way to handle this than using floats.

jimmielin commented 4 years ago

It makes sense, however I am worried that scaling something like the example equation above MVK + O3 = 0.064HO2 + 0.05RO2_R + 0.164OH + 0.05RCO_O2 + 0.475CO+ 0.1HCHO + 0.95MGLY + 0.351HCOOH would require too large of a factor to work for all the products (e.g. *1000 due to the 0.351 HCOOH term) introducing significant numerical error into the reaction rate, which then propagates into the equations.

I'm opening a PR with the preliminary Float implementation. It could be implemented in a better way, but I'm putting it up for discussion & run the CI tests. Thanks.

ChrisRackauckas commented 4 years ago

It's not uncommon for reaction rates to differ by 10^10 in biochemical models, and those are handled just fine by integrators since what matters is the relative difference between terms. This is seen for example in the Rober benchmark problem:


Applied Mathematics Instructor, Massachusetts Institute of Technology Senior Research Analyst, University of Maryland, Baltimore, School of Pharmacy, Center for Translational Medicine On 10/22/2019 2:12:45 PM, Haipeng Lin notifications@github.com wrote: It makes sense, however I am worried that scaling something like the example equation above MVK + O3 = 0.064HO2 + 0.05RO2_R + 0.164OH + 0.05RCO_O2 + 0.475CO+ 0.1HCHO + 0.95MGLY + 0.351HCOOH would require too large of a factor to work for all the products (e.g. *1000 due to the 0.351 HCOOH term) introducing significant numerical error into the reaction rate, which then propagates into the equations. I'm opening a PR with the preliminary Float implementation. It could be implemented in a better way, but I'm putting it up for discussion & run the CI tests. Thanks. — You are receiving this because you commented. Reply to this email directly, view it on GitHub [https://github.com/JuliaDiffEq/DiffEqBiological.jl/issues/161?email_source=notifications&email_token=AAN25HRVPI2H3LF4CCF5FELQP47B3A5CNFSM4JBTG2YKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEB6V5RI#issuecomment-545087173], or unsubscribe [https://github.com/notifications/unsubscribe-auth/AAN25HTEZTIYYHB7UJ3AMIDQP47B3ANCNFSM4JBTG2YA].

isaacsas commented 2 years ago

We need to update the DSL, but this works now via the Symbolic API:

using Catalyst
@parameters k
@variables t,A(t),B(t),C(t)
rxs = [Reaction(k,[A,B],[C],[.5,2.5],[3.5])]
@named rs = ReactionSystem(rxs,t)
osys = convert(ODESystem,rs,combinatoric_ratelaws=false)

which gives

 Differential(t)(A(t)) ~ -0.5k*(A(t)^0.5)*(B(t)^2.5)
 Differential(t)(B(t)) ~ -2.5k*(A(t)^0.5)*(B(t)^2.5)
 Differential(t)(C(t)) ~ 3.5k*(A(t)^0.5)*(B(t)^2.5)

This isn't likely to work for jumps though, but perhaps we just add a checking for non-integer stoichiometry coefficents in jump conversion and give an error if found.

isaacsas commented 2 years ago

This should now work in the DSL too by https://github.com/SciML/Catalyst.jl/pull/482.