SciML / Catalyst.jl

Chemical reaction network and systems biology interface for scientific machine learning (SciML). High performance, GPU-parallelized, and O(1) solvers in open source software.
https://docs.sciml.ai/Catalyst/stable/
Other
456 stars 75 forks source link

support general mass actions? #329

Closed JianghuiDu closed 2 years ago

JianghuiDu commented 3 years ago

If I'm correct only rate law of the form https://catalyst.sciml.ai/stable/tutorials/models/#Reaction-rate-laws-used-in-simulations is supported, right? That seems two restrictive. I my field rate laws are almost never written in this format. Rather they are often completely empirical, derived from "fitting" experimental observations, and thus come in all kinds of forms. For example the mass action of aA+bB=cC may as well be dA/dt = -k*[A][B] or k[A]/([A]+K)*[B]rather than dA/dt = -k*[A]^a[B]^b. Is it possible to allow user supplied rate formats in the future?

isaacsas commented 3 years ago

You can specify general functions of state as rate laws, see https://catalyst.sciml.ai/dev/tutorials/advanced/#User-defined-functions-in-reaction-rates

JianghuiDu commented 3 years ago

I see your point. But then you have to write 3 reactions instead of 1? That balloons up very quickly. But thanks for the reminder.

isaacsas commented 3 years ago

Why do you have to write three?

JianghuiDu commented 3 years ago

Don't you need to do this? Or am I missing some tricks?

a*k*[A][B], A⇒∅
b*k*[A][B], B⇒∅
c*k*[A][B], ∅⇒C
isaacsas commented 3 years ago

Sorry, you wrote three different rate laws in your original post. What precise reaction are you trying to write, and what ODE models do you want to recover? You can't currently use parameters as stoichiometric coefficients if that is what you are asking about. You can use an arbitrary rate law for the whole reaction. So you can do

rn = @reaction_network begin
       k, 3*A + 5*B --> 2*C
end k 

or

rn = @reaction_network begin
       k/(A+K), 3*A + 5*B --> 2*C
end k K

for specific values of the coefficients a, b and c.

anandijain commented 3 years ago

i have no dog here, but it seems odd that you can't use parameters? what if you have non-autonomous control over a rxn rate and want to use the new MTK metadata system?

isaacsas commented 3 years ago

We'll get stoichiometry parameters eventually, perhaps even by modifying https://github.com/SciML/ModelingToolkit.jl/pull/942

The issue, and it is an issue with that PR too, is that it is not clear how losing the concrete typing of stoichiometry vectors will affect performance for large systems. It may be a better design to have different reaction types when more generality is needed than to allow normal Reactions to have stoichiometry vectors of type Any. Another concern is that using parameters means that even if your reaction is mass action it will not be realized as mass action for jump problems (i.e. becoming a ConstantRateJump instead of a MassActionJump), which has significant performance implications. We'd need another analysis layer in JumpProblem construction to avoid that.

That PR stalled since I need to play around it with and do some benchmarking, which I'm hoping to get to once the semester ends and my time frees up in a few weeks. Hopefully at that time we can look into more general functions for stoichiometry too.

anandijain commented 3 years ago

You didn't link a PR? Maybe you meant https://github.com/SciML/ModelingToolkit.jl/pull/942?

isaacsas commented 3 years ago

Yeah, that one.

JianghuiDu commented 3 years ago

Sorry, you wrote three different rate laws in your original post. What precise reaction are you trying to write, and what ODE models do you want to recover? You can't currently use parameters as stoichiometric coefficients if that is what you are asking about. You can use an arbitrary rate law for the whole reaction. So you can do

rn = @reaction_network begin
       k, 3*A + 5*B --> 2*C
end k 

or

rn = @reaction_network begin
       k/(A+K), 3*A + 5*B --> 2*C
end k K

for specific values of the coefficients a, b and c.

I might have confused you. I mean general mass actions rather than rate laws.

rn = @reaction_network begin
       k, 3*A + 5*B --> 2*C
end k 

gives image

If I disable the default mass action using

rn = @reaction_network begin
       k, 3*A + 5*B ⇒ 2*C
end k 

I get image

However, what I want is image

To achieve that, I need to break the reaction in to three and disable mass actions at the same time. Basically moving the mass action into the the raw expression.

rn = @reaction_network begin
       3k*A*B, A ⇒ ∅
       5k*A*B, B ⇒ ∅
       2k*A*B, ∅ ⇒ C
end k 

Is there a better way to do this? This approach kind of defeats the purpose of the DSL.

JianghuiDu commented 3 years ago

We'll get stoichiometry parameters eventually, perhaps even by modifying SciML/ModelingToolkit.jl#942

The issue, and it is an issue with that PR too, is that it is not clear how losing the concrete typing of stoichiometry vectors will affect performance for large systems. It may be a better design to have different reaction types when more generality is needed than to allow normal Reactions to have stoichiometry vectors of type Any. Another concern is that using parameters means that even if your reaction is mass action it will not be realized as mass action for jump problems (i.e. becoming a ConstantRateJump instead of a MassActionJump), which has significant performance implications. We'd need another analysis layer in JumpProblem construction to avoid that.

That PR stalled since I need to play around it with and do some benchmarking, which I'm hoping to get to once the semester ends and my time frees up in a few weeks. Hopefully at that time we can look into more general functions for stoichiometry too.

I didn't notice that parametric stoichiometry is not allowed. Another bummer for me since many of my chemical compounds are parametric like C[m]H[n]O[p] where m,n,p are parameters. One purpose of the reaction model is to fit those parameters.

TorkelE commented 3 years ago

If I understand it right, currently a reaction> N1X1+N2X2+... --> M1Y1+M2Y2+... is interpreted dX1/dt = -N1 ([X1]^N1)/N1! ([X2]^N2)/N2! + ... dX2/dt = -N2 ([X1]^N1)/N1! ([X2]^N2)/N2! + ... ... dY1/dt = M1 ([X1]^N1)/N1! ([X2]^N2)/N2! + ... dY2/dt = M2 ([X1]^N1)/N1! ([X2]^N2)/N2! + ...

However, you are working in an application where you require the interpretation: dX1/dt = -N1 [X1][X2]... dX2/dt = -N2 [X1][X2]... ... dY1/dt = M1 [X1][X2]... dY2/dt = M2 [X1][X2]... (possibly everything modified by some constant rate parameter)

Am I correct?

I have to admit, I have never seen you interpretation, had I been working with it myself when we implemented the tool I would probably have added it. Currently the only two ways we have to interpret and reaction is:

It would be possible to add another interpretation (like yours), and designate arrows for it. However, I think most people active on the package right now are currently busy, so I am not sure if you could expect such a feature to appear any time soon.

Out of curiosity, do you have some reference for this type of interpretation? What field exactly are you in which uses it?

JianghuiDu commented 3 years ago

My background in in marine biogeochemistry and to be honest all the reactions I deal with have non-standard mass actions. The reason is that these reactions are macroscopic and observational "relationships" faked into microscopic molecular reactions. So the rate law/mass actions are entirely empirical. And it's a necessity out of simplification that we do this. Essentially, all the reactions are "catalyzed" by often unknown systems of biological communities. That's why it's not possible to use real molecular rate laws/mass reactions. In theory you can write in the format of standard mass reactions, but it's pointless if the everything you get is empirical anyway. For reactions like aA+bB=cC, if you can fit the data well using k[A][B] why would you need k[A]^a[B]^b? The same goes with parametric "chemical compounds`, because you are treating groups of organisms, as one single compound and so the stoichiometry varies a lot (imagine a school of fish as a compound). I think this is very common in Earth and Environmental biogeochemistry.

Also when I say "general" mass action I don't mean specifically the ones in my examples. I mean it could be anything. So for me only rate laws exist, mass actions don't: I just merge mass actions into rate laws. The stoichiometry of the reaction only applies to rate laws, not mass actions.

I created my reaction code automatically using my own functions that take in user supplied expressions of chemical equations and rate laws, which is not optimized of course. Was just curious if it can be done with Catalyst.jl. If it's hard to implement that's ok.

TorkelE commented 3 years ago

Makes sense.

I think we had some chat some time ago about generalizing reactions, but this I think is one of the first real examples (except for using non integer/stochastic stochiometric coefficients) that I have seen where this would be actually useful. It would be cool if one could define a function of a number of coefficients (for a number of substrates/products), supply this to Catalyst, and then this function would be used for defining a rate law.

Also, a side questions, I presume you (and other marine biochemists) only use deterministic models then? Or would you have some kind of SDE/Jump process for stochastic simulations?

isaacsas commented 3 years ago

Doesn't this do what you want then?

rn = @reaction_network begin
              k*A*B, 3*A + 5*B ⇒ 2*C
       end k

I agree though, we need to add/allow for stoichiometry parameters. Data fitting is a very good reason to need them! Hopefully we'll have this by the end of May.

yewalenikhil65 commented 3 years ago

I agree though, we need to add/allow for stoichiometry parameters.

i hope in this case existing stoichiometry functions based on constant stoichiometric, won't need a drastic revamp ?

isaacsas commented 3 years ago

That’s the goal.

JianghuiDu commented 3 years ago

I thought the stoichiometry is also disabled when you disable the mass action. Cool that this works.

Doesn't this do what you want then?

rn = @reaction_network begin
              k*A*B, 3*A + 5*B ⇒ 2*C
       end k

I agree though, we need to add/allow for stoichiometry parameters. Data fitting is a very good reason to need them! Hopefully we'll have this by the end of May.

That would be great!

isaacsas commented 2 years ago

https://github.com/SciML/Catalyst.jl/pull/488 adds parametric stoichiometry via directly building Reaction(...)s. So you can now use that once it merges (i.e. use the DSL for non-parametric reactions and just extend/add on the extra parametric reactions to the already built system -- making sure to also add on any new species or parameters).

The one caveat is that ModelingToolkit upconverts all parameters from integer to floating point if you pass a tuple of mixed types for the initial condition mapping. This then causes an error with factorial functions in the default rate law form. You can either use osys = convert(ODESystem, rn; combinatoric_ratelaws=false) or ODEProblem(rn, ...; combinatoric_ratelaws=false) to disable our rescaling of rate laws by these factors (which should avoid the issue for ODE systems). Alternatively, you can pass a tuple with just numbers for the parameters to ODEProblem, which bypasses ModelingToolkit looking at it (i.e. don't do (:A => 1.0, :n => 1) but do (1.0,1) consistent with the ordering reported by parameters(rn)). I'll add some examples in the HISTORY.md file once the PR gets merged.

I'm going to close this as I've got a new issue on getting this to work in the DSL too (which may take a while longer since it will entail a decent amount of changing around macro parsing code I think).