CDCgov / ww-inference-model

An in-development R package and a Bayesian hierarchical model jointly fitting multiple "local" wastewater data streams and "global" case count data to produce nowcasts and forecasts of both observations
https://cdcgov.github.io/ww-inference-model/
Apache License 2.0

Speed up website build in CI? #79

Open dylanhmorris opened 3 weeks ago

dylanhmorris commented 3 weeks ago

The current bottleneck in building the website in CI is executing the vignette, which takes ~10 minutes.

  1. If we add more vignettes, we may wish to pre-execute them before building the website, which would let us parallelize vignette execution.
  2. We may also wish to explore ways to avoid re-running the vignette on every build (but NB potential setup and/or robustness costs).
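
A minimal sketch of one way option 2 could work, assuming a content-hash check: fingerprint the files the vignette depends on, and only re-execute when the fingerprint differs from the one recorded after the last successful run. This is illustrative only and not something the repository does today. The function names and the stamp file are hypothetical, and Python is used purely for illustration.

```python
import hashlib
from pathlib import Path


def fingerprint(paths):
    """Return a SHA-256 hex digest over the names and contents of the given files."""
    digest = hashlib.sha256()
    # Sort so the result does not depend on the order the paths were passed in.
    for path in sorted(Path(p) for p in paths):
        digest.update(path.as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def needs_rerun(paths, stamp_file):
    """True if no fingerprint is recorded yet or the inputs changed since it was recorded."""
    stamp = Path(stamp_file)
    if not stamp.exists():
        return True
    return stamp.read_text().strip() != fingerprint(paths)


def record_run(paths, stamp_file):
    """Store the current fingerprint after a successful vignette execution."""
    Path(stamp_file).write_text(fingerprint(paths))
```

The robustness cost mentioned above shows up in the choice of `paths`: if the list covers only the vignette source and not the package code it calls, changes to the package would not trigger a re-run.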

Tagging @kaitejohnson, @cbernalz, and @seabbs for discussion.

kaitejohnson commented 3 weeks ago

My thoughts right now are that the vignette is currently almost serving as an end-to-end test (which I know isn't great, but a proper end-to-end test would likely take about the same amount of time to run in CI?). In particular, as we have been modifying pre- and post-processing, it has been a nice way to catch errors introduced by, for example, changing the names of arguments or of elements in a list.

I think that after we put out our first release and are somewhat committed to stabilizing the format of the outputs from the main fitting wrapper function wwinference(), we should consider doing something akin to what @seabbs set up in epinowcast, which triggers a rebuild of the vignette if anything in the vignette changes, plus scheduled updates on some cadence.
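
The trigger rule described above (rebuild when the vignette changes, or on a schedule) can be sketched roughly as follows. This is an assumption-laden illustration, not the epinowcast setup itself: the event names, the `vignettes/*` pattern, and the function name are all hypothetical.

```python
from fnmatch import fnmatch

# Hypothetical pattern for files whose changes should trigger a vignette rebuild.
VIGNETTE_PATTERNS = ("vignettes/*",)


def should_rerun_vignette(event, changed_files, patterns=VIGNETTE_PATTERNS):
    """Decide whether to re-execute the vignette for a given CI event.

    event: hypothetical label for what started the CI run, e.g. "push" or "schedule".
    changed_files: repository-relative paths touched by the change.
    """
    # Scheduled runs always rebuild, regardless of what changed.
    if event == "schedule":
        return True
    # Otherwise rebuild only if a changed file matches one of the patterns.
    return any(fnmatch(f, p) for f in changed_files for p in patterns)
```

Under this rule a change that touches only package code would not re-execute the vignette until the next scheduled run, which is why it fits better once the output format has stabilized.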

I think as we are developing pretty actively, a weekly or biweekly cadence seems reasonable. Also worth considering how much speedup we could get by (1) reducing the complexity of the vignette (e.g. fewer sites) and (2) modifying the fitting options to reduce warm-up iterations, acceptance probability, etc. By default I used the MCMC specifications we were using in production, but this is likely overkill for a vignette.