Biological-Computation / crn-engine

Tools for programming chemical reaction networks (CRN), DNA strand-displacement circuits (DSD) and genetically engineered circuits (GEC).
MIT License

Problem: ClassicDSD simulation needs more precision #5

Open martinhelfrich opened 2 years ago

martinhelfrich commented 2 years ago

I think the simulation of ClassicDSD has rounding / precision issues.

When does the issue arise? When simulating CRNs in ClassicDSD whose rate constants have very different magnitudes.

How can one produce the problem? Simulate the following CRN

directive simulation { final=20; points=1000}
directive simulator stochastic

| On + A ->{1} On + A + A
| On + A + A + A ->{1E-15}

| 200 A
| 1 On

What happens? The A population grows and the second reaction never occurs. The growth continues well beyond 10^6 A.

Screenshot 2022-03-11 at 18-06-49 Classic DSD Tool

What should happen? Once the A population is large enough, the second reaction should become very likely. This should disable the system. Other tools predict that this happens at around 10^5 A, and essentially always before 10^6 A.
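As a rough plausibility check (not how any of the tools compute this), one can sum the probability that the second reaction fires before each new A is produced, assuming the stochastic rate law and On fixed at 1 copy:

```python
# Rough hazard estimate for the CRN above. Each time a reaction fires,
# the chance it is reaction 2 is a2/(a1+a2); summing this over the
# growth of A estimates the population at which reaction 2 becomes likely.

def a1(n):  # On + A ->{1} On + A + A
    return 1.0 * n

def a2(n):  # On + A + A + A ->{1E-15}, stochastic rate law
    return 1e-15 * n * (n - 1) * (n - 2) / 6.0

cumulative, n = 0.0, 200
while cumulative < 1.0 and n < 10**7:
    cumulative += a2(n) / (a1(n) + a2(n))
    n += 1

print(n)  # the summed hazard passes 1 in the low hundreds of thousands
```

This back-of-envelope sum crosses 1 between 10^5 and 10^6 copies of A, consistent with the predictions of the other tools mentioned above.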

martinhelfrich commented 2 years ago

Also: Check out this system. That does not seem right...

directive simulation {final=2; points=100000}
directive simulator stochastic

| A ->{1} A + A
| A + A + A + A ->{1E-7} A + A + A

| 300 A

Screenshot 2022-03-11 at 18-31-43 Classic DSD Tool

This does not seem right... The second reaction should be more likely than the first. Thus, we expect the level to drop.

Propensities in state 300A:
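For reference, the propensities in state 300 A can be worked out under both rate-law conventions discussed later in this thread:

```python
# Propensities in state 300 A for the CRN above, under the two
# rate-law conventions. k1, k2 are the rate constants from the CRN.
n = 300.0
k1, k2 = 1.0, 1e-7

a1 = k1 * n                                    # A ->{1} A + A (same under both)
a2_det = k2 * n**4                             # deterministic-style: k * [A]^4
a2_stoch = k2 * n*(n-1)*(n-2)*(n-3) / 24.0     # stochastic: k * n(n-1)(n-2)(n-3)/4!

print(a1, a2_det, a2_stoch)  # 300.0 810.0 ~33.08
```

So under the deterministic-style convention the second reaction indeed dominates (810 > 300), but under the stochastic convention it does not (about 33 < 300), which foreshadows the rate-law discrepancy resolved below.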

martinhelfrich commented 2 years ago

Is this maybe an overflow error?

ph1ll1ps commented 2 years ago

Thanks for pointing this out. In the Windows binary version of the tool, both of these examples give rise to the following error:

Simulation: Arithmetic operation resulted in an overflow.

Unfortunately, this error is not caught in the JavaScript web version, which is a bug. Thanks for sharing. By the way, what tool did you use to simulate the CRN with improved precision?

Visual DSD typically only generates binary reactions for simulation by the CRN stochastic simulator, which is perhaps one of the reasons why we haven't spotted this bug before.

martinhelfrich commented 2 years ago

I used PRISM. It is mainly a model checker but there is a simulation functionality.

I wrote a script that produces a PRISM CTMC model for a given CRN (and some bounds for each species). Then I can simulate the CTMC in PRISM.

martinhelfrich commented 2 years ago

Also, I just found another (possibly related?) issue.

This system:

directive simulation {final=100; points=1000}
directive simulator stochastic

| ->{100} X
| X ->{1} 
| X ->{1} Y
| X + X + Y ->{0.0001} X + X + X

| 100 X
| 100 Y

Is supposed to oscillate around the state s1 = (100X + 100Y). See the PRISM simulation. This can also be checked by computing the propensities for s1. However, the ClassicDSD simulation oscillates around (100X + 200Y). See screenshot.

Is this because of the ternary reaction?

brusselator_correct100

Screenshot 2022-03-14 at 19-48-23 Classic DSD Tool
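A quick steady-state balance for Y (a sketch using only the rate laws, with X held at its stationary value of 100) reproduces both centres:

```python
# Y is produced by  X ->{1} Y  at rate X, and consumed by
# X + X + Y ->{0.0001} X + X + X.  Setting production = consumption
# at X = 100 gives the stationary Y level under each convention.
X, k = 100.0, 1e-4

det_Y = X / (k * X**2)                # deterministic: k * X^2 * Y = X
stoch_Y = X / (k * X * (X - 1) / 2)   # stochastic: k * X(X-1)/2 * Y = X

print(det_Y, stoch_Y)  # ~100 vs ~202
```

This is consistent with PRISM (deterministic-style semantics) oscillating around 100 Y and ClassicDSD (stochastic semantics) oscillating around roughly 200 Y.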

ph1ll1ps commented 2 years ago

Could it be related to the kinetic rate law assumption? See page 20 of the Visual DSD manual at https://ph1ll1ps.github.io/project/visualdsd/, pasted below for convenience. Coincidentally, directive simulation {final=100; points=1000; kinetics = Deterministic} appears to give the desired simulation results.

The kinetics setting defines the kinetic rate law to be used for simulation. This is needed due to an important difference between the kinetic rate laws of stochastic and deterministic systems, specifically in the case of homomultimer formation, which is the formation of complexes involving multiple copies of the same molecular species. The stochastic rate law includes a k! divisor, where k is the multiplicity of the species, which is not present in the deterministic rate law. For example, the reaction A + A →k B has a deterministic rate law of k[A]^2 , where [A] represents the concentration of species A, but a stochastic rate law of k[A]([A] − 1)/2, where [A] represents the population of species A. This difference means that the limiting behaviour of the stochastic mean with increasing copy numbers does not converge to the deterministic rate equations, as is commonly thought. Only when the stochastic rate laws are equal to the deterministic rate laws will this be the case. By default, kinetics is set to Contextual, which means that a deterministic rate law is assumed for the deterministic simulator, while a stochastic rate law is assumed for the stochastic, cme, lna and mc simulators. Alternatively, the kinetic law can be fixed to either the Stochastic or Deterministic convention, so that it is the same for all simulators

ndalchau commented 2 years ago

Yes, @ph1ll1ps is right. Depending on the stochastic rate semantics, you can get different results! It seems that your model in PRISM is using the semantics for the deterministic rate equations. You can very easily change this in DSD.

image

ndalchau commented 2 years ago

Regarding the numerical problem identified above, this is the piece of code where the reaction propensities are calculated:

match r.rate with
| Rate.MassAction rate -> 
  (match Mset.to_mlist r.reactants with
    | [] -> rate
    | [{multiplicity=1;element=a}] -> rate * (getpop a)
    | [{multiplicity=2;element=a}] -> 
       (* Special case - two members of the same species *)
       let pop = getpop a 
       rate * pop * (pop - 1.0) * 0.5
    | [{multiplicity=3;element=a}] -> 
       let pop = getpop a 
       rate * pop * (pop - 1.0) * (pop - 2.0) / 6.0
    | [{multiplicity=1;element=a};{multiplicity=1;element=b}] ->
       (* Simple case - two different species *)
       rate * (getpop a) * (getpop b)
    (* Use binomial coefficients for the general case! *)
    | aa -> aa |> Lib.fold_left (fun acc entry -> acc * (float) (Lib.binom ((int32) (getpop entry.element)) (entry.multiplicity))) rate )

As you can see, we specifically handle multiplicity=3, performing rate * pop * (pop - 1.0) * (pop - 2.0) / 6.0. It might be that if you were to add an equivalent handling for multiplicity=4, the problem above would go away.
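For illustration (in Python, since the engine code above is F#), the suggested multiplicity=4 case would compute the falling factorial directly in floating point, avoiding the integer binomial coefficient that presumably overflows for large populations:

```python
# Hypothetical multiplicity=4 propensity (a sketch; the actual fix
# would go in the F# match expression above): falling factorial in
# floating point rather than an integer binomial coefficient.
def propensity_4mer(rate, pop):
    return rate * pop * (pop - 1.0) * (pop - 2.0) * (pop - 3.0) / 24.0

print(propensity_4mer(1e-7, 300.0))  # ~33.08 for the 4A ->{1E-7} 3A example
print(propensity_4mer(1e-15, 1e6))   # stays finite even at 10^6 copies
```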

The immediate workaround is to encode the 4A -> 3A reaction using a functional rate:

image

martinhelfrich commented 2 years ago

Thank you @ndalchau and @ph1ll1ps for your help with the kinetic rate law assumption. Indeed, I (and my PRISM model) were using a different one.

In fact, I wasn't using either the deterministic or the stochastic one as defined above. For A + A →k B the three different rates were:

where [A] represents the number of molecules of species A.

I was following the definition given by Brijder in Section 4.1 of this paper (or in the corresponding preprint).

After thinking about this, the only one that makes intuitive sense to me is the stochastic one. It just assumes you pick agents at random out of a hat. The deterministic one picks two molecules in order while assuming there are many / the numbers are large. Does this mean it treats A + B -> C + D as different from B + A -> C + D? And the one by Brijder is somehow related to both: it picks in order AND assumes that the number of molecules is small. Again, this should mean we need to add A + B -> C + D and B + A -> C + D, right? Maybe Brijder just simplified the stochastic rate law equation, since the only difference is a factor that can be absorbed into the rate constant?

Do you have some literature on the different rate law assumptions? There probably is not "a best" / "default" assumption, is there?

ndalchau commented 2 years ago

Hi @martinhelfrich ,

The one you used is actually what we use for "deterministic" kinetics when doing a stochastic simulation. Basically, you just don't divide by n!

For reactions involving different species, none of this really matters. In both stochastic and deterministic kinetics, the propensity is k * n_A * n_B. There is no need to invoke the binomial distribution in this case because you're not choosing k molecules from a population of size n. You're always choosing 1 molecule from size n. Accordingly, there is no n_A-1 or n_B-1, nor is there n! So, A + B -> C + D has the same propensity as B + A -> C + D in both cases.
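For reference, the three conventions discussed in this thread, written out for A + A ->{k} B with population n (notation mine, for illustration only):

```python
k, n = 2.0, 10.0

deterministic = k * n * n            # k * [A]^2  (mass action on concentrations)
stochastic    = k * n * (n - 1) / 2  # k * C(n, 2)  (pick 2 molecules from a hat)
brijder       = k * n * (n - 1)      # stochastic without the 1/2! divisor

# For two different species all three agree on k * n_A * n_B,
# so A + B -> C + D and B + A -> C + D have the same propensity.
print(deterministic, stochastic, brijder)  # 200.0 90.0 180.0
```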

ph1ll1ps commented 2 years ago

Regarding your question about the default assumption, by default the kinetics option is set to Contextual, which means that a deterministic rate law is assumed for the deterministic simulator, while a stochastic rate law is assumed for the stochastic, chemical master equation, linear noise approximation and moment closure simulators, which all assume some underlying noise due to stochastic effects. This contextual assumption was chosen based on the conventions used by the different communities - deterministic versus stochastic. The Gillespie77 paper provides a useful introduction to stochastic kinetics https://pubs.acs.org/doi/10.1021/j100540a008, or see other sources about the Gillespie algorithm. The deterministic setting allows for more flexibility in modelling assumptions.

martinhelfrich commented 2 years ago

Thanks again for the help. I will check out the references!