ImperialCollegeLondon / covid19model

Code for modelling estimated deaths and cases for COVID19.
MIT License
944 stars 271 forks source link

General covariates + optimized model for Rstan 2.21 #75

Closed MansMeg closed 4 years ago

MansMeg commented 4 years ago

Hi!

I'm currently working with extending the model and also included the improvements that are discussed in this Stan discourse thread: https://discourse.mc-stan.org/t/convergence-latest-version-of-stan-issues-with-our-covid-19-model/14100/66

In this PR I generalize the current model to handle any covariate instead of only 6 prespecified covariates. The optimizations proposed by @wds15 is included, but only those improvements that can be run with the current Rstan implementation. The speed improvements are quite substantial the model is roughly 3-4x faster.

Note! I have not changed the prior for alpha to handle different lengths of alpha. Although, this would not matter in your implementation with the specified covariates.

I have now supplied four files:

base_general.R: Adding some data munging to the general covariate format. This can, of course, be polished, although I did want to include only necessary changes to keep the review burden to a minimum. stan-models/base_general.stan: The changes in the stan code - essentially enable any number of alphas fora any type of design matrix. stan-models/base_general_speed.stan: Performance changes in the model, this gives roughly a speedup in the order 3-4x. debug/test_stan_models_identical_lp.R: This file contains code to check that the log_prob is identical for the three models for a setup of parameters (the mean from one of my runs). Hence you can run this code to check that the models are give the identical lp.

If you are interested in merging this - I will replace base.stan with base_general_speed.stan and base.R with base_general.R. This will most likely break the test suite (since the results are slightly different due to the compiler). Hence I included the debug/test_stan_models_identical_lp.R for checking the model code.

Thanks for all the hard work!

MansMeg commented 4 years ago

I have now added @wds15 additional improved model that has the possibility to use within-chain parallelism together with rcmndstan to make use of within-chain parallelism.

Although the speed difference between base_general_speed.stan and base_general_speed2.stan is at my machine marginal (just one core) is negligible, so I guess it is a matter of taste. Although both these models are roughly 3-4x faster than the current model.

All models have been checked and return the same lp__ (see the debug R code file).

s-mishra commented 4 years ago

Hi @MansMeg, thanks a lot for this :) In our new version we will have this a default too