mclements / microsimulation

R package for microsimulation
36 stars 10 forks source link
cpp discrete-event-simulation health-economics r

+TITLE: Microsimulation package for R

+OPTIONS: toc:nil

+OPTIONS: num:nil

+OPTIONS: html-postamble:nil

Babel settings

+PROPERTY: session R-microsimulation

+PROPERTY: cache yes

+PROPERTY: results output graphics

+PROPERTY: exports both

+PROPERTY: tangle yes

+PROPERTY: exports both

[[http://www.gnu.org/licenses/gpl-3.0.html][http://img.shields.io/:license-gpl3-blue.svg]]

The package provides several approaches for microsimulation and event-based, discrete event simulation. The package includes an R implementation of discrete event simulation, building on several R5 classes. This implementation is useful from a pedagogical perspective, but it is slow for larger microsimulations. For speed, we also provide C++ classes for discrete event simulation, building on several well established libraries, including the [[http://www.inf.usi.ch/carzaniga/ssim/index.html][SSIM]] C++ discrete event simulation library, the [[http://www.iro.umontreal.ca/~lecuyer/myftp/streams00/][RngStream]] C library for common random numbers, the [[http://www.boost.org/][Boost]] libraries for making many C++11 tools available to C++98, and [[http://www.rcpp.org/][Rcpp]] for providing glue between R, R's C API and C++.

We specifically developed this package for modelling the cost-effectiveness of cancer screening, where there are many (e.g. 10^7) individuals who are followed from birth to death. Notably, we provide a complete prostate cancer screening model, including tools for cost-effectiveness analysis.

For compiling from source, use steps 2--3.

To install the microsimulation R-package run this in your shell:

+BEGIN_SRC shell :exports code :eval never

R CMD INSTALL path_to_microsimulation

+END_SRC

** Simple examples

+name: commentify

+begin_src emacs-lisp :var result="" :exports none

(concat "#> "(mapconcat 'identity (split-string result "\n") "\n#> "))

+end_src

+BEGIN_SRC R :session R-microsimulation :post commentify(this) :results output :exports both :eval never-export

require(microsimulation) sim1 <- callSimplePerson2(n = 1e5) summary(sim1$events)

+END_SRC

+RESULTS:

: #> state event age number : #> Healthy:338 toOtherDeath :142 Min. : 1.00 Min. : 1.0 : #> Cancer : 0 toCancer :100 1st Qu.: 37.00 1st Qu.: 74.0 : #> Death : 0 toCancerDeath: 96 Median : 58.00 Median : 325.0 : #> Mean : 56.98 Mean : 445.1 : #> 3rd Qu.: 79.00 3rd Qu.: 721.8 : #> Max. :100.00 Max. :1722.0 : #>

+BEGIN_SRC R :session R-microsimulation :post commentify(this) :results output :exports both :eval never-export

sim2 <- callIllnessDeath(n = 1e5, cure = 0.2) summary(sim2$prev)

+END_SRC

+RESULTS:

: #> state age number : #> Healthy:200 Min. : 0.00 Min. : 1 : #> Cancer : 0 1st Qu.: 25.75 1st Qu.: 2030 : #> Median : 50.50 Median : 8096 : #> Mean : 50.49 Mean : 34914 : #> 3rd Qu.: 75.25 3rd Qu.: 79029 : #> Max. :100.00 Max. :100000 : #>

** In-line example: Sick-Sicker example from Krijkamp et al (2018)

We have provided a continuous-time re-implementation of the discrete-time Markov model developed by Krijkamp et al (2018).

The example demonstrates:

For the specific example, the transformation from discrete to continuous time showed that the discrete time formulation could have been improved. In particular, the discrete time formulation assumes no transitions between Healthy and Sicker over one year, while the approximate probability of that event is the one-year probability of moving from Healthy to Sick times the probability of moving from Sick to Sicker. We have included that probability in the transition matrix and using matrix logarithms to calculate the transition probabilities.

+BEGIN_SRC R :session R-microsimulation :results output wrap :exports both

+BEGIN_SRC R :session R-microsimulation :results output wrap :exports both :eval never-export

library(expm) # logm library(Rcpp) # sourceCpp library(microsimulation) # Rcpp::depends and include files library(ascii); options(asciiType="org")

set up the parameters

param <- within(list(), {

Transition probabilities (per cycle) and rates

p.HD = 0.005 # probability to die when healthy
p.HS1 = 0.15 # probability to become sick when healthy
p.S1H = 0.5 # probability to become healthy when sick
p.S1S2 = 0.105 # probability to become sicker when sick
rr.S1 = 3 # rate ratio of death when sick vs healthy
rr.S2 = 10 # rate ratio of death when sicker vs healthy
r.HD = -log(1-p.HD) # rate of death when healthy
r.S1D = rr.S1 * r.HD # rate of death when sick
r.S2D = rr.S2 * r.HD # rate of death when sicker
p.S1D = 1-exp(-r.S1D) # probability to die when sick
p.S2D = 1-exp(-r.S2D) # probability to die when sicker
## Cost and utility inputs
c_H = 2000 # cost of remaining one cycle healthy
c_S1 = 4000 # cost of remaining one cycle sick
c_S2 = 15000 # cost of remaining one cycle sicker
c_Trt = 12000 # (additional) cost of treatment (per cycle)
u_H = 1 # utility when healthy
u_S1 = 0.75 # utility when sick
u_S2 = 0.5 # utility when sicker
u_Trt = 0.95 # utility when sick (as per the code) and being treated
## new parameters
discountRate = 0.03 # discount rate
partitionBy = 1.0 # partition used in the report
Trt = FALSE # Treatment?
debug = FALSE

})

For converting from discrete to continuous time: p.HS2 should be non-zero

param = within(param, { p.HS2 = p.HS1p.S1S2 }) Pmat = with(param, matrix(c(1-p.HD-p.HS1-p.HS2,p.HS1,p.HS2,p.HD, p.S1H,1-p.S1H-p.S1S2-p.S1D,p.S1S2,p.S1D, 0,0,1-p.S2D,p.S2D, 0,0,0,1), 4, byrow=TRUE)) stopifnot(all(abs(rowSums(Pmat)-1)<10.Machine$double.eps)) Qmat = expm::logm(Pmat) # matrix logarithm stopifnot(all(abs(rowSums(Qmat))<10*.Machine$double.eps))

update the rates in param

param = within(param, { r_HS1 = Qmat[1,2]; r_HD = Qmat[1,4] r_S1H = Qmat[2,1]; r_S1S2 = Qmat[2,3]; r_S1D = Qmat[2,4] r_S2D = Qmat[3,4] })

sourceCpp(code=" // [[Rcpp::depends(BH)]] // [[Rcpp::depends(RcppArmadillo)]] // [[Rcpp::depends(microsimulation)]]

include

enum state_t {Healthy, Sick, Sicker, Dead}; enum event_t {toS1, toS2, toH, toD, toEOF}; typedef ssim::SummaryReport<short,short> Report; / Utility: Random exponential using rate parameterisation */ template double rexpRate(T rate) { return R::rexp(1.0/as(rate)); } / Utility: Run a set of simulations for a single process / void runSimulations(ssim::cProcess process, int n) { for (int i = 0; i < n; i++) { ssim::Sim::create_process(process); ssim::Sim::run_simulation(); ssim::Sim::clear(); } } / Define a class for the process / class SickSicker : public ssim::cProcess { public: int id; state_t state; Rcpp::List param; Report report; SickSicker(Rcpp::List param, Report report) : id(-1), param(param), report(report) { } void init(); // to be specified void handleMessage(const ssim::cMessage msg); // to be specified void cancelEvents(); // utility function }; /* Initialise a simulation run for an individual / void SickSicker::init() { id++; state = Healthy; scheduleAt(rexpRate(param[\"r_HS1\"]),toS1); scheduleAt(rexpRate(param[\"r_HD\"]),toD); scheduleAt(31.0,toEOF); // end of follow-up } / Utility to cancel some events */ void SickSicker::cancelEvents() { ssim::RemoveKind(toH); ssim::RemoveKind(toS1); ssim::RemoveKind(toS2); ssim::RemoveKind(toD); } / Handle receiving self-messages / void SickSicker::handleMessage(const ssim::cMessage msg) { if (param[\"debug\"]) Rprintf(\"id: %i, state: %i, kind: %i, previous: %f, now: %f\n\", id, state, msg->kind, this->previousEventTime, ssim::now()); report->add(state, msg->kind, this->previousEventTime, ssim::now(), id); switch(msg->kind) { case toH: state = Healthy; report->setUtility(param[\"u_H\"]); report->setCost(param[\"c_H\"]); cancelEvents(); scheduleAt(ssim::now() + rexpRate(param[\"r_HS1\"]), toS1); scheduleAt(ssim::now() + rexpRate(param[\"r_HD\"]), toD); break; case toS1: state = Sick; report->setUtility(param[\"Trt\"] ? param[\"u_Trt\"] : param[\"u_S1\"]); report->setCost(param[\"c_S1\"] + (param[\"Trt\"] ? param[\"c_Trt\"] : 0.0)); cancelEvents(); scheduleAt(ssim::now() + rexpRate(param[\"r_S1H\"]), toH); scheduleAt(ssim::now() + rexpRate(param[\"r_S1S2\"]), toS2); scheduleAt(ssim::now() + rexpRate(param[\"r_S1D\"]), toD); break; case toS2: state = Sicker; report->setUtility(param[\"u_S2\"]); report->setCost(param[\"c_S2\"] + (param[\"Trt\"] ? param[\"c_Trt\"] : 0.0)); cancelEvents(); scheduleAt(ssim::now() + rexpRate(param[\"r_S2D\"]), toD); break; case toD: case toEOF: ssim::Sim::stop_simulation(); break; default: REprintf(\"Invalid kind of event: %i.\n\", msg->kind); break; } if (id % 10000 == 0) Rcpp::checkUserInterrupt(); / be polite / } / Exported function: Set up the report and process, run the simulations and return a report */ //[[Rcpp::export]] Rcpp::List callSim(int n, Rcpp::List param, bool indivp = true) { Report report(n,indivp); report.setPartition(0.0,31.0,param[\"partitionBy\"]); report.setDiscountRate(param[\"discountRate\"]); SickSicker person(param,&report); runSimulations(&person, n); Rcpp::List lst = report.asList(); lst.push_back(param,\"param\"); return lst; }") simulations = function(n, param, simulator=callSim, indivp=TRUE) { object = simulator(n, param, indivp) stateT = c("Healthy","Sick","Sicker","Dead") eventT = c("toS1", "toS2", "toH", "toD", "toEOF") for (name in c("ut","costs","pt","events","prev")) object[[name]] = transform(object[[name]], state=stateT[Key+1], Key=NULL) object$events = transform(object$events, event=eventT[event+1]) class(object) = c("SickSicker","SummaryReport") object }

define a utility function for using system.time with ascii

ascii.proc_time = function(x, include.rownames=FALSE, include.colnames=TRUE, ...) ascii(summary(x), include.rownames, include.colnames, ...)

run the simulations

set.seed(12345) ascii(system.time(sim1 <- simulations(1e3, param=modifyList(param, list(Trt = FALSE)))),header=TRUE) set.seed(12345) ascii(system.time(sim2 <- simulations(1e3, param=modifyList(param, list(Trt = TRUE)))),FALSE,FALSE) cat("\n") ascii(ICER(sim1,sim2), rownames=c("No treatment","Treatment"), caption="Continuous-time Sick-Sicker model for n=10,000 individuals")

+end_src

+RESULTS:

:RESULTS: | user | system | elapsed | |------+--------+---------| | 0.10 | 0.00 | 0.10 | | 0.09 | 0.00 | 0.09 |

+CAPTION: Continuous-time Sick-Sicker model for n=10,000 individuals

| | Total | | | | Incremental | | | | | | | Costs | (se) | QALYs | (se) | Costs | (se) | QALYs | (se) | ICER | |--------------+----------+--------+--------+-------+---------------+--------+-------+-------+----------| | No treatment | 117835.7 | 2263.0 | 13.101 | 0.160 | | | | | | | Treatment | 223871.9 | 4097.8 | 13.658 | 0.167 | 106036.1 | 1881.9 | 0.558 | 0.013 | 190110.5 | :END:

+BEGIN_SRC R :session R-microsimulation :results output wrap :exports both

+BEGIN_SRC R :session R-microsimulation :results output wrap :exports both :eval never-export

library(parallel) set.seed(12345) mcsimulations <- function(n, simulations, ..., mc.cores = getOption("mc.cores", 2L)) { n.seg <- diff(c((0:(mc.cores-1))*floor(n/mc.cores),n)) do.call(rbind, mclapply(n.seg, simulations, ..., mc.cores=mc.cores)) } ascii(system.time(sim1 <- mcsimulations(1e5, simulations, param=modifyList(param, list(Trt = FALSE)))), header=TRUE) set.seed(12345) ascii(system.time(sim2 <- mcsimulations(1e5, simulations, param=modifyList(param, list(Trt = TRUE)))), FALSE,FALSE) cat("\n") ascii(ICER(sim1,sim2),caption="Continuous-time Sick-Sicker model for n=100,000 individuals")

+end_src

+RESULTS:

:RESULTS: | user | system | elapsed | |------+--------+---------| | 8.25 | 0.03 | 4.14 | | 4.18 | 0.02 | 4.20 |

+CAPTION: Continuous-time Sick-Sicker model for n=100,000 individuals

| | Total | | | | Incremental | | | | | | | Costs | (se) | QALYs | (se) | Costs | (se) | QALYs | (se) | ICER | |-----------+----------+-------+--------+-------+---------------+-------+-------+-------+----------| | Reference | 118425.3 | 234.4 | 13.133 | 0.016 | | | | | | | Treatment | 225548.4 | 424.3 | 13.720 | 0.017 | 107123.1 | 485.7 | 0.587 | 0.023 | 182387.6 | :END:

There was an appreciable difference in the estimates from the discrete-time case and the continuous-time case. This issue warrants further investigation, particularly given that the discrete time case ignores any transitions between Healthy and Sicker within a one-year period.

+caption: Results from Krijkamp et al (2018), Table 2 for their discrete-time microsimulation model with n=100,000 individuals

| | Total | | | | Incremental | | | | | | | Costs | (se) | QALYs | (se) | Costs | (se) | QALYs | (se) | ICER | |-----------+---------+------+-------+-------+---------------+------+-------+-------+--------| | | | | | | | | | | | | Reference | 75996 | 183 | 15.82 | 0.016 | | | | | | | Treatment | 141644 | 343 | 16.38 | 0.016 | 65648 | 164 | 0.561 | 0.001 | 117087 |

+BEGIN_SRC R :session R-microsimulation :results output :exports both

set.seed(12345) invisible(callSim(5, modifyList(param, list(debug=TRUE)))) set.seed(12345) invisible(callSim(5, modifyList(param, list(Trt=TRUE, debug=TRUE))))

+end_src

+RESULTS:

+begin_example

id: 0, state: 0, kind: 0, previous: 0.000000, now: 1.554306 id: 0, state: 1, kind: 1, previous: 1.554306, now: 1.660357 id: 0, state: 2, kind: 3, previous: 1.660357, now: 2.137922 id: 1, state: 0, kind: 0, previous: 0.000000, now: 22.523279 id: 1, state: 1, kind: 2, previous: 22.523279, now: 23.538790 id: 1, state: 0, kind: 0, previous: 23.538790, now: 25.110528 id: 1, state: 1, kind: 2, previous: 25.110528, now: 25.529621 id: 1, state: 0, kind: 4, previous: 25.529621, now: 31.000000 id: 2, state: 0, kind: 0, previous: 0.000000, now: 1.279403 id: 2, state: 1, kind: 2, previous: 1.279403, now: 2.482209 id: 2, state: 0, kind: 0, previous: 2.482209, now: 4.467754 id: 2, state: 1, kind: 2, previous: 4.467754, now: 5.240185 id: 2, state: 0, kind: 0, previous: 5.240185, now: 6.123519 id: 2, state: 1, kind: 2, previous: 6.123519, now: 7.951863 id: 2, state: 0, kind: 0, previous: 7.951863, now: 10.967305 id: 2, state: 1, kind: 2, previous: 10.967305, now: 11.127479 id: 2, state: 0, kind: 0, previous: 11.127479, now: 15.295999 id: 2, state: 1, kind: 1, previous: 15.295999, now: 16.663526 id: 2, state: 2, kind: 4, previous: 16.663526, now: 31.000000 id: 3, state: 0, kind: 0, previous: 0.000000, now: 0.021088 id: 3, state: 1, kind: 2, previous: 0.021088, now: 1.648953 id: 3, state: 0, kind: 0, previous: 1.648953, now: 1.665451 id: 3, state: 1, kind: 1, previous: 1.665451, now: 2.825812 id: 3, state: 2, kind: 3, previous: 2.825812, now: 3.378253 id: 4, state: 0, kind: 0, previous: 0.000000, now: 1.548766 id: 4, state: 1, kind: 2, previous: 1.548766, now: 4.303079 id: 4, state: 0, kind: 0, previous: 4.303079, now: 5.128089 id: 4, state: 1, kind: 2, previous: 5.128089, now: 5.418233 id: 4, state: 0, kind: 0, previous: 5.418233, now: 9.734630 id: 4, state: 1, kind: 2, previous: 9.734630, now: 11.013894 id: 4, state: 0, kind: 0, previous: 11.013894, now: 26.388802 id: 4, state: 1, kind: 2, previous: 26.388802, now: 26.514306 id: 4, state: 0, kind: 0, previous: 26.514306, now: 27.803678 id: 4, state: 1, kind: 2, previous: 27.803678, now: 28.437418 id: 4, state: 0, kind: 4, previous: 28.437418, now: 31.000000 id: 0, state: 0, kind: 0, previous: 0.000000, now: 1.554306 id: 0, state: 1, kind: 1, previous: 1.554306, now: 1.660357 id: 0, state: 2, kind: 3, previous: 1.660357, now: 2.137922 id: 1, state: 0, kind: 0, previous: 0.000000, now: 22.523279 id: 1, state: 1, kind: 2, previous: 22.523279, now: 23.538790 id: 1, state: 0, kind: 0, previous: 23.538790, now: 25.110528 id: 1, state: 1, kind: 2, previous: 25.110528, now: 25.529621 id: 1, state: 0, kind: 4, previous: 25.529621, now: 31.000000 id: 2, state: 0, kind: 0, previous: 0.000000, now: 1.279403 id: 2, state: 1, kind: 2, previous: 1.279403, now: 2.482209 id: 2, state: 0, kind: 0, previous: 2.482209, now: 4.467754 id: 2, state: 1, kind: 2, previous: 4.467754, now: 5.240185 id: 2, state: 0, kind: 0, previous: 5.240185, now: 6.123519 id: 2, state: 1, kind: 2, previous: 6.123519, now: 7.951863 id: 2, state: 0, kind: 0, previous: 7.951863, now: 10.967305 id: 2, state: 1, kind: 2, previous: 10.967305, now: 11.127479 id: 2, state: 0, kind: 0, previous: 11.127479, now: 15.295999 id: 2, state: 1, kind: 1, previous: 15.295999, now: 16.663526 id: 2, state: 2, kind: 4, previous: 16.663526, now: 31.000000 id: 3, state: 0, kind: 0, previous: 0.000000, now: 0.021088 id: 3, state: 1, kind: 2, previous: 0.021088, now: 1.648953 id: 3, state: 0, kind: 0, previous: 1.648953, now: 1.665451 id: 3, state: 1, kind: 1, previous: 1.665451, now: 2.825812 id: 3, state: 2, kind: 3, previous: 2.825812, now: 3.378253 id: 4, state: 0, kind: 0, previous: 0.000000, now: 1.548766 id: 4, state: 1, kind: 2, previous: 1.548766, now: 4.303079 id: 4, state: 0, kind: 0, previous: 4.303079, now: 5.128089 id: 4, state: 1, kind: 2, previous: 5.128089, now: 5.418233 id: 4, state: 0, kind: 0, previous: 5.418233, now: 9.734630 id: 4, state: 1, kind: 2, previous: 9.734630, now: 11.013894 id: 4, state: 0, kind: 0, previous: 11.013894, now: 26.388802 id: 4, state: 1, kind: 2, previous: 26.388802, now: 26.514306 id: 4, state: 0, kind: 0, previous: 26.514306, now: 27.803678 id: 4, state: 1, kind: 2, previous: 27.803678, now: 28.437418 id: 4, state: 0, kind: 4, previous: 28.437418, now: 31.000000

+end_example

*** Limitations of the in-line approach

One limitation for the in-line code is that common random numbers, which are manipulated in C++ and use R's random number functions, are not available. Common random numbers can be used in a package, which is used by the [[https://github.com/mclements/prostata][prostata]] package.

** Extensive use case For more advance use of the microsimulation framework, please have a look at our prostate cancer natural history model:

[[https://github.com/mclements/prostata]]