EpiModel / EpiABC

Approximate Bayesian Computation for EpiModel on High-Performance Computing Clusters

EpiABC Posterior Performance #10

Open andsv2 opened 5 years ago

andsv2 commented 5 years ago

When running EpiABC to estimate EpiModel parameters, EpiABC produces sample posteriors whose means accurately hit the target statistics. However, when the estimated parameters are plugged back into EpiModel, the simulations do not replicate this performance in practice; e.g., they fail to reach equilibrium within a prespecified time frame.

andsv2 commented 5 years ago

Solution 1: Estimate parameters attuned to HIV prevalence and STI rates separately; i.e., f_1 and f_2 are two separate ABC processes targeting 10.86 and 0.26, respectively. Result: Convergence issues for HIV prevalence, and low HIV prevalence when using rates estimated from the STI target statistics only.
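The tension between the two targets can be reproduced with a toy rejection-ABC sketch. Everything here is hypothetical (a one-parameter model, not EpiModel, and plain rejection ABC, not the EpiABC API); it only illustrates the mechanism: when the two statistics imply slightly different best-fit parameter values, calibrating against one target alone pulls the posterior away from the other, consistent with the low HIV prevalence seen when using STI-only estimates.

```python
import random

random.seed(1)

# Hypothetical one-parameter toy model (not EpiModel): the same theta
# drives two summary statistics, loosely analogous to HIV prevalence
# and an STI rate. The two targets imply slightly different best-fit
# theta values (about 1.086 vs 1.04), creating the conflict.
def simulate(theta):
    stat1 = theta * 10 + random.gauss(0, 0.5)
    stat2 = theta * 0.25 + random.gauss(0, 0.02)
    return stat1, stat2

TARGET1, TARGET2 = 10.86, 0.26  # the two targets from this issue

def abc_reject(distance, n_draws=20000, keep=200):
    # Plain rejection ABC: draw theta from a uniform prior, keep the
    # draws whose simulated statistics fall closest to the target.
    scored = []
    for _ in range(n_draws):
        theta = random.uniform(0.5, 1.5)
        scored.append((distance(simulate(theta)), theta))
    scored.sort()
    return [theta for _, theta in scored[:keep]]

# Separate calibrations (solution 1): each ABC process sees one target.
post1 = abc_reject(lambda s: abs(s[0] - TARGET1))
post2 = abc_reject(lambda s: abs(s[1] - TARGET2))

# Joint calibration: one scale-normalized distance over both statistics.
post_joint = abc_reject(
    lambda s: abs(s[0] - TARGET1) / TARGET1 + abs(s[1] - TARGET2) / TARGET2)

def mean(xs):
    return sum(xs) / len(xs)

print(mean(post1), mean(post2), mean(post_joint))
```

In this toy, the two single-target posteriors concentrate on different theta values, so a parameter set drawn from the STI-only posterior systematically undershoots the HIV-prevalence target.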

andsv2 commented 5 years ago

Solution 2: Change the estimation algorithm from Lenormand to Drovandi. The Lenormand method filters the posterior through waves, settling on a final sample posterior when the proportion of newly accepted candidate particles falls below a pre-set threshold. The Drovandi method instead advances through steps toward a pre-specified tolerance level (the allowed difference between simulated and observed data). Result: The Drovandi method produces results similar to those of Lenormand.
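The two stopping rules can be contrasted in a minimal ABC-SMC sketch. This is a deliberately simplified toy (one-parameter model, best-half selection, fixed perturbation kernel), not EpiABC's actual Lenormand or Drovandi implementations; only the termination criteria are the point: acceptance-rate threshold versus pre-specified tolerance.

```python
import random

random.seed(2)

TARGET = 10.86  # target statistic from this issue

# Hypothetical noisy toy simulator standing in for one EpiModel run.
def simulate(theta):
    return theta * 10 + random.gauss(0, 0.05)

def distance(theta):
    return abs(simulate(theta) - TARGET)

def init_pop(n):
    pop = []
    for _ in range(n):
        t = random.uniform(0.5, 1.5)
        pop.append((distance(t), t))
    return pop

def move(t, sd=0.01):
    return t + random.gauss(0, sd)

def smc_lenormand(n=400, p_acc_min=0.05, max_waves=50):
    # Lenormand-style stopping: keep the best half each wave, perturb
    # it, and stop once the proportion of new particles that beat the
    # current tolerance falls below p_acc_min.
    pop = init_pop(n)
    for _ in range(max_waves):
        pop.sort()
        kept = pop[: n // 2]
        eps = kept[-1][0]                 # current tolerance
        new = []
        for _, t in kept:
            t2 = move(t)
            new.append((distance(t2), t2))
        p_acc = sum(d < eps for d, _ in new) / len(new)
        pop = kept + new
        if p_acc < p_acc_min:
            break
    return [t for _, t in sorted(pop)[: n // 2]]

def smc_drovandi(n=400, eps_target=0.4, max_steps=50):
    # Drovandi-style stopping: replace the worst half with perturbed
    # copies of survivors each step, and stop once every particle is
    # within the pre-specified tolerance eps_target.
    pop = init_pop(n)
    for _ in range(max_steps):
        pop.sort()
        if pop[-1][0] <= eps_target:
            break
        survivors = pop[: n // 2]
        refreshed = []
        for _ in range(n - n // 2):
            t2 = move(random.choice(survivors)[1])
            refreshed.append((distance(t2), t2))
        pop = survivors + refreshed
    return [t for _, t in pop]

def mean(xs):
    return sum(xs) / len(xs)

print(mean(smc_lenormand()), mean(smc_drovandi()))
```

Both loops converge to essentially the same posterior in this toy, mirroring the "similar results" observed when switching algorithms: the stopping rule changes when sampling ends, not where the posterior concentrates.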

andsv2 commented 5 years ago

Solution 3: Diagnose each of the epidemic curves associated with particles accepted under the Lenormand algorithm. Note: proper posterior sampling is crucial. We cannot simply select the mean of each marginal posterior when re-simulating, because the minimum tolerance (i.e., the maximum of the objective function) is not necessarily achieved at the vector of marginal posterior means. For example, given theta = (x1, x2, x3), the minimum distance to the target statistics may not be achieved at the componentwise means of x1, x2, and x3.

Question: Particle filtering is based on one run of EpiModel (N = 1). Is this done in practice? No; multiple simulations are run and a smoothed curve is produced (target statistics as the mean, with variability). Should we be filtering particles based on a single run, or on their ability, over N simulations, to produce epidemic statistics close to the target statistics? Why would the latter be better, and what is the difference between the two?

Result: Pending (4/16). Looking at multiple runs for each particle accepted by the ABC method, with N = 100. Will examine (S - S_obs), where S is the simulated statistic and S_obs is the observed statistic; essentially recovering the tolerance for individual particles, but with N = 100 rather than N = 1.
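The pending N = 100 diagnostic can be sketched as follows. The simulator and the accepted particle values are hypothetical stand-ins (not actual EpiModel fits); the point is that averaging S - S_obs over many runs recovers a per-particle tolerance that a single lucky N = 1 run can hide.

```python
import random

random.seed(3)

TARGET = 10.86  # observed target statistic, S_obs

# Hypothetical noisy toy simulator standing in for one EpiModel run.
def simulate(theta):
    return theta * 10 + random.gauss(0, 0.5)

def rescore(particles, n_sims=100):
    # Re-evaluate each accepted particle over n_sims runs and report
    # mean(S - S_obs): a per-particle tolerance far less noisy than
    # the single-run (N = 1) distance used during filtering.
    scores = []
    for theta in particles:
        diffs = [simulate(theta) - TARGET for _ in range(n_sims)]
        scores.append((theta, sum(diffs) / n_sims))
    return scores

# Hypothetical accepted particles: one well-calibrated, two biased.
# Under N = 1, a lucky noise draw can make a biased theta look good;
# under N = 100 the systematic bias of 1.05 and 1.12 stands out.
for theta, bias in rescore([1.086, 1.05, 1.12]):
    print(f"theta={theta:.3f}  mean(S - S_obs)={bias:+.3f}")
```

Filtering on the N = 100 average therefore selects particles whose expected statistics are near the targets, rather than particles that happened to draw favorable stochastic noise on one run.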

andsv2 commented 5 years ago

Update (5/17/2019): Transitioning to the new ARTnet workflow, focused only on HIV prevalence. This will make calibration easier (one target statistic versus two); however, we should still pursue a solution that allows for multivariate target-statistic calibration.