Handle figure in Catalyst introduction which affects build times

SciML / Catalyst.jl

Chemical reaction network and systems biology interface for scientific machine learning (SciML). High performance, GPU-parallelized, and O(1) solvers in open source software.

https://docs.sciml.ai/Catalyst/stable/

Other

437 stars 71 forks source link

Handle figure in Catalyst introduction which affects build times #948

Closed TorkelE closed 3 weeks ago

TorkelE commented 3 weeks ago

Doc build times have increased recently, and I narrowed it down pretty much to this figure in the introduction to catalyst page:

Basically, the rendering of large figures using e.g. plot can really increase doc build times disproportionally. Right now I simply replaced it with a version already generated and saved from file (which admittedly means that the display is not built dynamically). An alternative would be to reduce the sampling (which has already been reduced, probably for this purpose). However, to have a significant effect, I think we would have to reduce it enough that one really see an effect in the plot.

Something should be done though. This small change reduced my local build time from almost 700 seconds to just under 90 seconds (!). The impact on the GitHub build time will probably be a bit less pronounced (but likely higher once the caching fix is incorporated). And quick local build times are also useful when developing docs.

isaacsas commented 3 weeks ago

How did you determine this is the cause? We've had this figure without issues for years now right?

isaacsas commented 3 weeks ago

Are we sure that this isn't actually an issue with the SciML solution object's interpolation? That seems more likely to have changed than anything else.

TorkelE commented 3 weeks ago

I think local build times have always been relatively large, and I only recently looked at it? I think the interpolation issue is only related to ODEs.

Right now I just bisected stuff, since I know that relatively recently I built it in around 100 seconds. Try building them locally with/without this figure yourself and check the times (the first build after you start Julia should be longer, as stuff have to compile).

isaacsas commented 3 weeks ago

OK, but in any case nothing in our core tutorials should be static, even if it improves doc build times.

So we need another solution that doesn't complicate the tutorial, or we need to just accept there is some weird interaction going on here that we'll have to resolve later.

isaacsas commented 3 weeks ago

What if you don't use save_positions but then plot only at times 0:1000?

TorkelE commented 3 weeks ago

This is a version with saveat every 10 time units, and only plotting the first 1000 seconds. It moves us down to 120 seconds, however, the plot is obviously not good enough

TorkelE commented 3 weeks ago

The easiest long-term solution where we still build everything dynamically is probably to use a simpler first model (i.e. SIR or Michaelis-Menten), those the plots should have no impact from. We can then move the repressilator to the end (and maybe even show how the reactions can be automatically generated programmatically, without typing them out) and maybe do an ODE simulation.

However, that is more of a revamp then what we can do now. If we don't plan to do much doc building anytime soon we can leave it, but having a single figure increasing doc build times by a factor 7 does not make sense.

isaacsas commented 3 weeks ago

Right, what I suggested is very different. It saves the exact solution, and then only plots at fixed points by evaluating of the solution at the times we want to plot. That is different than saving only at fixed points and plotting, which may then evaluate the solution object incorrectly at other time points via piecewise constant interpolation (so we probably shouldn't do that anyways).

isaacsas commented 3 weeks ago

Generally if you want to save a jump simulation at fixed times you need to ensure it is only plotting at those times and not interpolating and plotting at other time points (which may be going on under the hood if you don't give explicit times to plot at in the call to plot).

isaacsas commented 3 weeks ago

If you don't want to deal / investigate this I will handle it when I have time and update the docs accordingly before we release V14.

TorkelE commented 3 weeks ago

I just wanted to show the impact of plotting that number of data points on build times. Doing a proper sampling has the same build time (120s).

We do not have to deal with this before v14 at all, but I was struggling with the doc build times yesterday, and when I had managed to narrow it down I figured I'd raise the issue.

isaacsas commented 3 weeks ago

Plotting 1000 points shouldn't destroy build times and take 100s. That is insane and is indicative of a problem in something between Documenter, interpolation, or the plot recipe.

If you just plot(1:1000,1:1000) does that take 120s? That is instantaneous on my computer, as is saving such a plot as either png or svg...

TorkelE commented 3 weeks ago

Ohh, there is definitely something fishy going on (I have reported this issue to the documented people as well).

isaacsas commented 3 weeks ago

What if you explicitly make the plot a png?

jprob = JumpProblem(repressilator, dprob, Direct())
sol = solve(jprob, SSAStepper())
plot(sol, fmt = :png)  
# or plot(sol, plotdensity = 1000, fmt =:png)

This is super fast locally outside of Documenter for me and the resulting file is only 86KB.

TorkelE commented 3 weeks ago

Yeah, I think this is the way, apparently Documenter generates figures in a range of formats and it might be one of those that causes the problem.

I will close this one for now. There were another couple of plots where I used various solutions to fix it (reducing samples, reducing a number of simulations in an ensemble, or printing a loaded one). Later on I can just open a new PR which sorts all of these cases

isaacsas commented 3 weeks ago

Yeah, I would suggest we not modify examples due to this issue. Instead just make the illustrative plot call not run and add a hidden plot call that makes a png.

TorkelE commented 3 weeks ago

Yes, this is brilliant.

plot(sol)
plot(sol, plotdensity = 1000, fmt =:png) # hide

takes us down to 100 second 🥳

isaacsas commented 3 weeks ago

Great, now we just need to remember to do this going forward.