Closed TorkelE closed 6 years ago
For mass action reactions rate_org
would just be the numerical rate constant? In that case, just to think out loud:
Building mass action jumps will use substrates
, products
, rate_org
and is_pure_mass_action
. Along with syms
in the reaction_network
. I think that should be all that is needed.
For any reactions where is_pure_mass_action=false
I just use the ConstantRateJump
from the reaction_network
jumps
field.
For SSAs needing more dependency info I use syms
from the reaction_network
, substrates
, products
and dependents
to build the dependency graph. When creating these SSAs I think I'll need to first reorder reactions by those that are mass action, followed by non-mass action (or vice-versa). This can be done in the DiffEqBiological
JumpProblem
constructor.
Ok, I have an implementation on my computer. What it does right now is:
1) Update the ReactionStruct
to contain:
struct ReactionStruct
substrates::Vector{ReactantStruct}
products::Vector{ReactantStruct}
rate_org::Any
rate_DE::Any
rate_SSA::Any
dependants::Vector{Symbol}
is_pure_mass_action::Bool
2) In the jumps
vector passed into jump problems all ConstantRateJump
s which can be phrased as a MassActionJump
is now replaced with a MassActionJump
.
I have not made the Regular jumps yet.
Right now I am only waiting for the actual MassActionJumps to become a part of Julia DifferentialEquations so that I can check that everything is working as expected.
Great, thanks!
If you Pkg.checkout("DiffEqJump")
that should get you the latest version I think by checking it out from github. I guess there hasn't been an official release with it yet (but it is in the latest master).
Thanks, that works. Trying stuff out now.
Ok, it seems like you can only pass a single MassActionJump
to a JumpSet
and that all reaction which uses MassAction will have to be combined into a single Jump, while the ConstantRateJump
s and VariableRateJump
s have to be one Jump object for each reaction.
Right now it is quite nice to have a length n tuple with all the jumps where the i'th jump corresponds to the i'th reaction. If I merge the MassActionJumps into a single jump, while keeping the others separate it will no longer be possible to tell what jump corresponds to what reaction. It will be possible to create a JumpProblem
from it (which I guess is the main point...), but it feels like were are losing something.
I could rewrite the JumpProblem
dispatch we have for reaction_network
s, making it merge the MassActionJump
s (is there a method for this), create a JumpSet
and then from that the JumpProblem
.
Right, a MassActionJump
, like a RegularJump
, is monolithic and responsible for many jumps. Just one of them is responsible for handling all mass action reactions.
Are you putting the MassActionJump
into the reaction_network
as the jumps
field? Maybe you should keep the jumps in the network structure as all ConstantRateJumps
, but when JumpProblem
is called have the dispatch go through and convert all mass action reactions into one MassActionJump
?
I don't think there are a direct way to convert ConstantRateJump
s into MassActionJump
s, however I could quite easily make the JumpProblem
recalculate the MassActionJump
s. Out of curiosity, why are the RegularJump
s and MassActionJump
s monolithic, while the ConstantRateJump
s and VariableRateJump
s single?
The MassActionJump
is monolithic as there are different ways one can represent a big (mass action) chemical reaction network. One common representation would be as a single sparse matrix -- we're essentially using a simplified version of that representation right now. It seemed reasonable to just work with such a monolithic representation. I have to admit, I am thinking about whether there would be any possible benefit and/or drawback to having a MassActionJump
for each reaction instead. @alanderos91 any thoughts/comments? If there is a compelling reason to change to individual MassActionJumps
I'm willing to make the updates.
The RegularJump
is for methods like tau-leaping, where there may be vectorized operations that can be used to determine the total state change and how many times each reaction fires during the tau-leap timestep. So I there are clear performance reasons there for using a global representation.
EDIT: So I think @alanderos91 has found that for small mass action reactions having a dense matrix representation can offer the best performance. We could easily add such a specialization to support alternative storage of the mass action part of the network, but it would only be possible by using a global MassActionJump
.
I don't think there are a direct way to convert ConstantRateJumps into MassActionJumps
Correct, you would need to build the global MassActionJump
from scratch in the JumpProblem
dispatch.
The MassActionJump is monolithic as there are different ways one can represent a big (mass action) chemical reaction network. One common representation would be as a single sparse matrix -- we're essentially using a simplified version of that representation right now. It seemed reasonable to just work with such a monolithic representation. I have to admit, I am thinking about whether there would be any possible benefit and/or drawback to having a MassActionJump for each reaction instead. @alanderos91 any thoughts/comments? If there is a compelling reason to change to individual MassActionJumps I'm willing to make the updates.
We can just have it merge MassActionJump
s internally in the aggregation phase. The fact that it's one object on the inside doesn't mean that it needs to be one object in the API since two stoichiometries are easy to compose. So this really isn't a performance thing, it's whether we add the convenience, and it would make sense to allow it to be 1-by-1 just like ConstantRateJump IMO.
The RegularJump is for methods like tau-leaping, where there may be vectorized operations that can be used to determine the total state change and how many times each reaction fires during the tau-leap timestep. So I there are clear performance reasons there for using a global representation.
This is because they allow arbitrary rate function, so there's a clear performance gain of one function vs many.
I see. I will leave it to you guys to decide what JumpSet
accept and doesn't, but now you have my input. When I know I will PR the corresponding changes. I guess we won't pull it anyway until the latest changes to Jump is live. Meanwhile @isaacsas know what he have to work with for dependencies.
I will add a single RegularJump
to the structure as well, which only makes sense if there is no time dependent reaction.
I will add a single RegularJump to the structure as well, which only makes sense if there is no time dependent reaction.
RegularJumps can have time-dependent reactions since they are only ever approximate.
They can? I was under the illusion that they couldn't. That is good, would it mean that there aren't any plausible reactions that couldn't be handled by RegularJump
s?
That is good, would it mean that there aren't any plausible reactions that couldn't be handled by RegularJumps?
Indeed. They can even handle VariableRateJumps. All of them are handled at the order of its integrator, which for example with tau-leaping is O(dt).
Good Stuff.
@Gaussia Could you put in the PR that adds the new reaction fields at least for now? I'd like to be able to work with reaction_networks
instead of typing out the stoichiometry by hand.
Could be done, give me a second.
Thanks! I'll try to get to adding support for multiple MassActionJumps
this week, but it might not be till the weekend.
Fixed some final bugs, it is up now.
@Gaussia The latest DiffEqJump
master adds a simpler form of MassActionJump
. You don't need vectors of vectors for a jump that will only represent one reaction (i.e. you can create a vector of many jumps, each representing one reaction). For example, for A + B -> C
you can now use
rate = 1.
reactant_stoich = [1 => 1, 2 => 1]
net_stoich = [1 => -1, 2 => -1, 3 => 1]
jump = MassActionJump(rate, reactant_stoich, net_stoich)
You can then create a JumpSet
with a vector of such jumps and a vector of any ConstantRateJump
s. Everything should work at that point. For a zero order reaction like 0 -> A
you can use
reactant_stoich = [0 => 1]
or
reactant_stoich = Vector{Pair{Int,Int}}()
One suggestion (just from looking at your commented code); if you create an empty stoich vector (or vector of vectors) to push to, make sure you create it with the correct type. i.e. don't use
reactant_stoich = []
use
reactant_stoich = Vector{Pair{Int,Int}}()
And a tag should be merged in just a few hours.
Thing are more or less ready, but I've encountered errors when running the test files. After some looking it seems like I get errors whenever creating a reaction network containing parameters. Could this have something to do with the MassActionJumps? It seems like their reaction rates cannot be parameter dependant, e.g.
model = @reaction_network some_type begin
kB, X+Y-->XY
end kB
and
rates = p[1]
reactant_stoich = [1 => 1, 2 => 1],
net_stoich = [1 => -1, 2 => -1, 3 => 1]
does not correspond.
If this indeed is the issue I could modify it so that MassActionJump
s are only created if the reaction rate is given as a constant.
@Gaussia @ChrisRackauckas Pinging you both to get feedback here.
The current setup assumes that MassActionJumps
are initialized with the numerical rate constant.
You could initially set them to zero and then update them in the JumpProblem
constructor using the parameter tuple/vector. That's going to lead to one additional problem though; how do we know which parameter corresponds to a given MassActionJump
reaction? I don't think there's any way currently for us to figure this out within DiffEqJump
; so either we need a mapping that tells us what reaction id each MassActionJump
reaction corresponds to in the parameter vector, or we need the parameter vector to be sorted as we store rates internally (i.e. all mass action jumps come first, then constant rate jumps). This is actually also an issue that will crop up with dependency graphs; there's no way internally for DiffEqJump
to construct a dependency graph for systems with ConstantRateJumps
. If we let the user pass one in, it would ideally order rates as we do internally (again, mass action jumps then constant rate jumps). Otherwise we'd need to remap it to the ordering we use (which would again require us knowing a mapping from internal reaction id to the original order used by the user).
Any thoughts on how to proceed?
@Gaussia An immediate fix for now would be to set the rate when you construct the MassActionJump
to be 0.0
. Then in the DiffEqBiological
overloaded constructor for JumpProblem
replace the rate with the correct rate from prob.p
.
I think that should work out ok. However I would then probably want to keep the jump set in the reaction network with ConstantRateJump
s and VariableRateJump
s only, it feels dangerous to have that around with fault jumps in it. Then I would create everything from scratch in the constructor. I am also thinking whenever I could move that part of the code into the actual JumpProblem, it wouldn't hurt to have that available when constructing jumps in general, and not only in this specific case
(The JumpProblem constructor already have access to parameters through the discrete problem that is passed into it).
I am not sure if I understand the longer problem that you describe, the parameters as inputed state which they are (p[1]
) and the parameter vector is sorted.
Internally in DiffEqJump
we take all the jumps you pass in and merge the mass action jumps into one big mass action jump struct, while storing the constant rate jumps as a vector or tuple. We don't right now have any way to say that the third mass action jump reaction corresponded to the fifth jump in the original input array/tuple.
More generally, with the number of ways that DiffEqJump
allows JumpSets
to be constructed, I don't think we can guarantee that we can always reconstruct this info. (If someone creates their own vector of ConstantRateJumps
and a single MassActionJump
that holds all mass action reactions, we have no way to know what reactions the individual MassActionJumps
correspond to in the parameter vector.)
I assume p[1]
is a symbol we would need to evaluate? I think it would be better to keep the mass action jumps purely numerical, and figure out a way to initialize and/or reinitialize them with the right value. It may be that we need to add a reaction_index
field to MassActionJump
that we can use to index into the parameter vector.
There shouldn't be a problem evaluating the parameter values at problem creation and doing so as early as possible is probably sound.
However, I am uncertain where the ordering of the jumps becomes a problem. Does this have to do with the dependency graph? If you are thinking of the parameters that shouldn't need to be a problem, e.g. p[1]
does not need to be the reaction rate of the first reaction (could be of the second, alternatively it could be the rate of several different reaction if e.g. this corresponds to the dilution rate. One could also have a reaction with rate 3*p[1]*p[2]/4
).
That does remind me however, for ConstantRateJump
s an similar the rate is given as a function of several variables, including p (since it e.g. can depend on expression of different concentrations) . For MassActionJump
s the rate is a simple numeric and their is no way of really indicating such things, since it right now is only a numeric. This means that it is harder to create a general solution since that requires some serious changing to MassActionJump
s.
However this is only really a problem if you want people to be able to create their of MassActionJump
s with parameters they play around with. With some variant of you earlier suggestion it should be perfectly possible to make reaction_networks
work with this system. That might be the simplest approach.
That's right, the jump ordering becomes an issue with the dependency graph. Imagine one creates a reaction system of the following form: (crj = constant rate jump, maj = mass action jumps) [crj1, maj1, maj2, crj2, maj3, crj3]
Internally we split this into a vector [crj1, crj2, crj3] and a mass action jump storing [maj1, maj2, maj3]. We've therefore lost information on the original jump ordering. This could of course be fixed internally, but there are other input cases where we can't figure out the original ordering. Imagine you yourself just create [crj1, crj2, crj3] and one mass action jump holding [maj1,maj2,maj3], and then pass in a JumpSet
storing them. There is no way we can figure out the original ordering. This causes issues for dependency graphs, because we need to know when a given reaction occurs which other reactions should be updated. If you pass us a dependency graph with indices based on the original ordering that mixes jumps, we have no way to resolve that back to our internal ordering after the jumps have been split up. Just to be clear, in the original ordering above an example dependency graph might be:
dg = [
[1, 2, 4] # these jumps depend on jump 1
[1, 2, 5] # these jumps depend on jump 2
[3] # ...
[1, 4, 6]
[2, 5]
[3,6]
]
If you pass us this dep graph, we need a way to map our internal ordering of the jumps back the ordering you used in creating the dep graph, unless you ordered it as we do (mass action jumps first, then constant rate jumps).
I think maybe the easiest fix for now is for you to do what I suggested in the JumpProblem
constructor. If we decide it's the right approach down the line, I can add an optional field to MassActionJumps
called rate_index
, which we could later use to allow users to specify which parameter index holds the rate constant (i.e. the jump would use p[rate_index]
).
I see. However is there a way for users to provide a dependency graph right now? How/When is it calculated?
I updated my previous comment with an example.
I have a sorting direct method implementation ready to go. It can calculate the dependency graph automatically for systems with just MassActionJumps
, but requires the user to pass in a dependency graph for systems that include constant rate jumps too.
Right now it would only work if the user orders all mass action jumps before all constant rate jumps in making their dependency graph.
I think that maybe the easiest approach would be to have DiffEqBiological
create a dependency graph with the same ordering as we use internally. This could be done in the JumpProblem
constructor for algorithms that need dependency graphs. I'm happy to write this translation code once we've got a version up and running that works for SSAs without dependency graphs.
Ok, I think I understand what is going on then.
Cool, thanks for working on this! I'm very much looking forward to being able to use DiffEqBiological again for inputting networks. Typing the stoichiometry vectors by hand is not fun...
Just small update, I am travelling at the moment and will be back beginning of May (if you start wondering if the update will happen). Will make the appropriate changes some time after I get home.
I went ahead and put in a PR with a first attempt to do the conversion within the reaction_network
based JumpProblem
constructor.
@isaacsas and @ChrisRackauckas Moving some stuff over to a separate conversation.
I plan to update the
ReactionStruct
to something like this:I want to have as few fields as possible while including everything that is needed.
rate_org
is the rate as inputed in the DSL. This could e.g. be3.5
. The rate needs to be saved in its original form sinceMassActionJumps
needs it as input. Trate_DE
the rate as used in ODEs and SDEs, modified after mass action (unless ⟾ is used...). Something like3.5*X^2/2!
.rate_SSA
yet a third form is needed for rates for normal jumps (likeRegularJump
s andConstantRateJump
s). This is different from therate_DE
since it might modify copy number like3.5*X*(X-1)/2!
.dependants
is a list of dependants, as required to make dependency graphs.is_pure_mass_action
is true if a MassActionJump is used.Unless we need something else, or if there is a way to make some of this redundant this is probably what we will go with. A primary purpose of this is so that @isaacsas knows what he will have to work with for making dependency graphs (you start on that end).