Armavica / rebop

Fast stochastic simulator for chemical reaction networks
https://armavica.github.io/rebop/
MIT License
44 stars 2 forks source link

full trajectory #16

Closed dave-doty closed 1 month ago

dave-doty commented 1 month ago

I'm sorry if this isn't the correct place to ask this...

Is there a way to get the full "trajectory", i.e., the entire sequence of reactions that Gillespie samples, together with the inter-reaction times (sampled exponential random variables), rather than sampling from fixed time points, in which several reactions (or none) might have occurred between two consecutive time points?

In particular, I'm hoping to use the Python interface for this, so if that is possible from the Python API, that would be great.

dave-doty commented 1 month ago

I looked into the source code and it doesn't seem to be supported, there's this loop that goes through many reactions without recording the intermediate configurations:

https://github.com/Armavica/rebop/blob/a34e193ee384ceb4687dbe7aa5c2af6df6f9d171/src/gillespie.rs#L238-L264

So this is a feature request to support getting this full trajectory information out of the simulation.

Armavica commented 1 month ago

Thank you for your message! That's exactly the right place, I'll be happy to add this to the next release.

Armavica commented 1 month ago

@dave-doty I added the feature that you requested, that you can activate by passing nb_steps=0 in the Python run function. It will stop at the first reaction after tmax. I am lacking time currently to fine tune the performance or the API of this function, so these might change in a future release: notably, I am wondering if I should always return tmax as a last point (which never corresponds to a reaction time), or the first reaction time after tmax (which corresponds to a reaction time when it is finite, and doesn't when it is infinite). Please let me know if that works for you, or if you think that this could be improved! I will now release version 0.8.0 so you can install it with pip.

dave-doty commented 1 month ago

Thanks! Seems to work great.

One question: is the volume parameter supported directly, or should this be handled by adjusting rate constants (e.g., to implement volume $20$, divide bimolecular rate constants by $20$, and trimolecular rate constants by $20^2$, etc.).

dave-doty commented 1 month ago

BTW I think returning only the actual reaction times (not tmax) makes the most sense, I think code would expect only actual reaction times to appear. I also like your convention of adding one more time at "inf" to represent that the system reached a terminal state before time tmax.

dave-doty commented 1 month ago

Can I also say how awesome it is that you implemented this in one day? We found your package after trying to get Gillespy2 to do the same thing, with the discussion there suggesting that the problem is uncomputable: https://github.com/StochSS/GillesPy2/issues/427

Armavica commented 1 month ago

One question: is the volume parameter supported directly, or should this be handled by adjusting rate constants (e.g., to implement volume 20, divide bimolecular rate constants by 20, and trimolecular rate constants by 202, etc.).

Volume is not currently supported, but that could be the next feature, if there is interest for it! If you want to open a new issue for this, it will remind me to work on it at some point.

Thank you for your feedback on the API! I am glad that it works for you.

Can I also say how awesome it is that you implemented this in one day? We found your package after trying to get Gillespy2 to do the same thing, with the discussion there suggesting that the problem is uncomputable

Thank you :blush: Most of the functionality was already there, I just had to make it return all the events generated. They are right that this might be slower for simulations with a huge number of steps happening (mainly because of the need to allocate a large amount of memory to store all this data), and computationally it's anyway always going to be a bit faster to go through steps in a loop without worrying to return all the results.

Finally, something that I am not quite satisfied about currently is that even with a fixed seed, the returned trajectory will not be the same if you choose different nb_steps values. They are all correct trajectories, but this is a consequence of the current implementation of the simulation loops with nb_steps > 0. I would like to revisit this one day, to make all discretizations return the exact same trajectory.