Modifies rateinterval API to accept computed `urate`.

SciML / JumpProcesses.jl

Build and simulate jump equations like Gillespie simulations and jump diffusions with constant and state-dependent rates and mix with differential equations and scientific machine learning (SciML)

https://docs.sciml.ai/JumpProcesses/stable/

Other

136 stars 35 forks source link

Modifies rateinterval API to accept computed `urate`. #288

Open gzagatti opened 1 year ago

gzagatti commented 1 year ago

A problem with the current implementation of Coevolve is that it might compute urate twice. Once when computing the actual urate and another time if rateinterval calls urate.

This problem is clear in the Hakwes benchmark. Below you can see a profile of the simulation:

This PR modifies the API of rateinterval to rateinterval(u, p, t, urate). With this modification, you can see that much less time is spent in rateinterval.

The downside is a more verbose API. The API will also break for when using VariableRate with Coevolve. However, since the aggregator is relatively new, it might be worth considering this change. I am happy to discuss more elegant alternatives.

coveralls commented 1 year ago

Pull Request Test Coverage Report for Build 4044857921

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

For more information on this, see Tracking coverage changes with pull request builds.
To avoid this issue with future PRs, see these Recommended CI Configurations.
For a quick fix, rebase this PR at GitHub. Your next report should be accurate.

Details

6 of 6 (100.0%) changed or added relevant lines in 1 file are covered.
1 unchanged line in 1 file lost coverage.
Overall coverage increased (+1.3%) to 88.267%

Files with Coverage Reduction	New Missed Lines	%
src/JumpProcesses.jl	1	0.0%
<!--	Total:	1		-->

Totals
Change from base Build 3984319264:	1.3%
Covered Lines:	2099
Relevant Lines:	2378

💛 - Coveralls

isaacsas commented 1 year ago

Let me think about the interface here a bit. This may be the easiest way to go, but I wonder if it would be better to have some kind of dispatch-based resolution where one can either have urate return both, or you provide two independent functions.

gzagatti commented 1 year ago

Take your time. It might also be worth taking into account how to handle the lrate API as well, since it is called after urate and rateinterval.

I tried playing with the multiple dispatch mechanism but could not find an elegant solution.

gaurav-arya commented 11 months ago

Seconding the idea of allowing a single function to return multiple quantities. Also, to throw another spanner into the works: in my use case, it would be most efficient to compute all three of rate, urate, and rateinterval jointly in a single function, as they all share some intermediary work:)

gaurav-arya commented 11 months ago

Perhaps it would be best to just have a separate kwarg? E.g. the user can specifiy rate_and_bounds(u,p,t) returning the triplet of (rate(u,p,t), rateinterval(u,p,t), urate(u,p,t)) (and lrate optionally e.g. as a fourrth value). If only rate_and_bounds were specified and not e.g. rate, then we could populate rate, urate and rateinterval as rate_and_bounds(u,p,t)[1] etc., so that algorithms which don't know about this special form (e.g. those that don't care about the bounds) still work as normal. But we could specialize on it in Coevolve.

isaacsas commented 11 months ago

This requires some thought since I think frequently one won't want to evaluate the rate simultaneous to the rate bounds for performance reasons, so we'd also want to support versions that don't evaluate rate until it is needed in the sampling. We also need to consider that we don't want to get in a situation where we have type instabilities by getting a mix of functions with different return signatures / have to hard code a huge number of possible cases.

I wonder if we might instead have a way of communicating to users that we haven't updated time and/or the system state since the last time one of the rate functions was called (so a user could just cache the values and know they are still valid if it makes sense for them to generate them all at once).