List of SSAs to consider implementing

isaacsas commented 6 years ago

A nice review that summarizes many SSA methods is in [1] Ch. 3.

Some SSAs they we might consider implementing (there are others that could be added that they don't examine):

[x] Sorting Direct Method (SDM) (McCollum et al, Comp. Biol. and Chem., 30 (2006))
[x] Next Reaction Method (NRM) (Gibson and Bruck, J. Phys. Chem. A, 104 (9), (2000))
[ ] Table-based NRM (Sanft and Othmer, J. Chem. Phys., 143, 0074108 (2015))
[x] Direct Method with Composition-Rejection (DM-CR) (Slepoy et al, J. Chem. Phys, 128, (2008))
[x] Rejection-based SSA (RSSA) See [1]
[x] RSSA with Composition-Rejection (RSSA-CR) See [1]
[ ] Partial Propensity Direct Method (PDM) (Ramaswamy et al, J. Chem. Phys. 2009)
[ ] PDM combined with Composition-Rejection (PDM-CR) (Ramaswamy et al, J. Chem. Phys. 2010)
[ ] PDM-RSSA, (Thanh, Mathematical Biosciences, 292, 2017)
[ ] FRM-based RSSA? (Maybe something like this which is not accessible). I included NRM not for its performance, which is often not so great compared to those other methods, but as it is very popular and widely known. It would be good to have a reference implementation.

Many of these share common features, like dependency graphs or jump rate data structures, so we should be able to reuse a number of subcomponents in the methods.

[1] reports large network benchmarks for which RSSA/RSSA-CR and the PDM/PDM-CR give the best performance. (The RSSA methods are by the authors of [1].) [2] compares open source simulators and generally finds that the simulator based on PDM/PDM-CR gives the best performance as the number of molecules and network complexity increases, with BioNetgen coming in second (the latter uses SDM -- but it's not clear exactly why they get such good performance).

[1] Marchetti et al. Simulation Algorithms for Computational Systems Biology, see ch 3. [2] Gupta and Mendes, An Overview of Network-Based and -Free Approaches for Stochastic Simulation of Biochemical Systems, Computation, 6, 9, 2018.

isaacsas commented 5 years ago

TODO: Optimizations / Performance:

For NRM type methods the indexed priority queue we're using (from DataStructures.jl heaps) seems slow. It might be worth exploring other implementations (perhaps even cache friendly versions).
For RSSA right now random number generation is rate-limiting and we aren't seeing the reported performance on the problem library gene expression example that they studied. We should look into this. It may also be that our methods to calculate mass action rates could be sped up, say by using static arrays -- they seem to be the other dominant cost.
For DirectCR:
- explore other strategies for min/max bin values. Should we use fewer numbers of bins initially?
- Should we normalize by the max jump rate (which will require rebuilding the table if this rate is exceeded)?
- Should we remove empty bins at the upper end if too many accumulate. If so, resize the vector or just manually keep track of the last bin's index?
- Should we merge the zero and (0,minrate) bins together (eliminating an if statement in the priortogid mapping)?
Benchmark our SSA collection on larger models, compare with BioNetGen, CoPASI, Gillespie.jl, Biosimulator.jl, etc...
For all methods, check about overhead of calling solve multiple times. Should we be more careful about reinitializing data structures instead of just reallocating them as new objects...
For Direct we currently recalculate and cumsum all rates each step. We could optionally allow a dependency graph to avoid recalculating all propensities, and only store the total sum (by adding/subtracting modified propensities). Then we could use a chop down type search that substracts off propensities when sampling, avoiding the full cumsum.

Accuracy:

error tracking in running sums -- including for group sums in CR
thorough accuracy tests for RSSA and DirectCR with established models.

ChrisRackauckas commented 5 years ago

For NRM type methods the indexed priority queue we're using (from DataStructures.jl heaps) seems slow. It might be worth exploring other implementations (perhaps even cache friendly versions).

It is, and we've ran into this somewhat in other spots of DiffEq. If there's a better solution it would be good to get a package out on this that we can use elsewhere.

For RSSA right now random number generation is rate-limiting and we aren't seeing the reported performance on the problem library gene expression example that they studied. We should look into this. It may also be that our methods to calculate mass action rates could be sped up, say by using static arrays -- they seem to be the other dominant cost.

I think we really need to find a way to reliably test against another package for the Direct SSA at least. I know that with the SSAStepper() we do very well without the function wrappers. With function wrappers we take a hit, but how big is that hit in comparison to anyone else that allows general functions to be used? Probably similar but I have no evidence on it haha. The best direct comparison we have right now is to Gillespie.jl where we are quite comparable to them, but I want to do a few performance improvements to them so that way they serve as a good baseline. Simon's implementation is just a loop and a (dense) stoichiometry matrix, so that can be a good baseline of "best possible when the network is small enough". I still want to know where we are in relation to some of the other SSA packages though.

It may also be that our methods to calculate mass action rates could be sped up, say by using static arrays

Yeah, but the limitation is that static arrays need to be constant sized. We could just say that every mass action term has two species, and then special case how single species things are done. Or in v1.0 we small union optimizations come into play here, so mass action terms with static arrays and <=4 different size choices will optimize well.

isaacsas commented 5 years ago

It is, and we've ran into this somewhat in other spots of DiffEq. If there's a better solution it would be good to get a package out on this that we can use elsewhere.

There are lots of implementations that are more cache friendly, and in the literature cache oblivious. Anytime I read papers on the latter they seem really complicated to implement. This link has a collection of more simple optimizations that give speed ups:

http://playfulprogramming.blogspot.com/2015/08/cache-optimizing-priority-queue.html

Generally the performance hit is from having parent/child nodes at locations i and 2i or 2i+1 in the underlying array. Better locality between where parent and child nodes are stored should give improved performance. Of course the tradeoff is that the implementation gets more complicated.

isaacsas commented 5 years ago

I agree we need to benchmark against other solvers. I recently found that the RSSA authors (Marchetti book above) have a package of Java code with many SSAs implemented. We could probably try benchmarking against that. They also have a collection of networks stored in a very simple file format, just

A -> B, rate A+C -> B, rate

which we could easily read in and use to generate DiffEqBiological reaction_networks. So we should be able to redo the timing studies in their papers/book. We could also compare to more established simulators (i.e. Copasi, Bionetgen, etc). They're given as executables though, not API-based, so we'd need to get networks into SBML to load into them, and then figure out how to separate the overhead of loading the network from the actual simulation time.

ChrisRackauckas commented 5 years ago

For the benchmarking, it seems like it would be a great student project so I'll see if I can make it a GSoC or undergraduate research project. For the priority queues, @oxinabox has been putting a lot of work into DataStructures.jl and maybe has thought a bit about this. We should upstream the issue to DataStructures.jl and probably fix it at the source.

oxinabox commented 5 years ago

I've mostly been putting work into DataStructures.jl as a on the admin/maintenance side. But yes, please do upstream that issue, and upstream work to improve it.

paulflang commented 3 years ago

I heard about a parallel implementation of SSA. I think this is the paper about it. Perhaps this paper is also relevant.

SciML / JumpProcesses.jl

List of SSAs to consider implementing #32