grantbrown / ABSEIR

A Flexible and Computationally Efficient SEIRS Modeling Framework

Question regarding summary statistics and distance metric #18

Closed Arminn-Potgieter closed 3 years ago

Arminn-Potgieter commented 3 years ago

Hello

I was wondering whether the use of summary statistics in the acceptance step of the ABC algorithms has been implemented in the package and if not, would there be any way to accomplish this?

I am aiming to show the effects of utilizing neural networks as proxies for sufficient summary statistics within an ABC algorithm (as shown in a paper titled "Learning Summary Statistics for Approximate Bayesian Computation via Deep Neural Network" by Jiang et al. 2017) when fitting a spatial SEIR model to COVID-19 data.

Furthermore, I'm assuming that the distance metric used to determine whether to accept proposed values is the Euclidean distance (less important, just want to confirm)?

Any assistance with this matter would be greatly appreciated.

grantbrown commented 3 years ago

Hi there - currently, the Euclidean distance is the default, and it can be configured to a limited extent by specifying different powers. Unfortunately, we're not currently set up for arbitrary user-specified summary statistics.
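
For illustration only, here is a minimal sketch of what a distance with a configurable power could look like. It assumes the power acts as the exponent of a Minkowski-style distance, which is one plausible reading and not a statement of how ABSEIR computes it internally; the function name `power_distance` is hypothetical and not part of the package.

```python
def power_distance(observed, simulated, power=2.0):
    """Minkowski-style distance between two equal-length series.

    Assumption: `power` is the exponent p in (sum |o - s|^p)^(1/p).
    power=2 gives the Euclidean distance, power=1 the absolute (L1) distance.
    """
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")
    if power <= 0:
        raise ValueError("power must be positive")
    total = sum(abs(o - s) ** power for o, s in zip(observed, simulated))
    return total ** (1.0 / power)


# Example: distance between an observed and a simulated case-count series.
euclidean = power_distance([0, 0], [3, 4])           # power = 2
absolute = power_distance([0, 0], [3, 4], power=1)   # power = 1
```

Larger powers put more weight on the time points where the simulated epidemic deviates most from the data.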

In the past, I've thought a bit about how to tackle this issue, as the distances are currently calculated within the main simulation loop in C++. That's obviously not super extensible, but it wouldn't prevent a user from using ABSEIR just for the simulation part for speed and then doing the resampling/acceptance manually in R. The code for that might be a little ugly with the way the package is currently configured, but not infeasible. If some example code would be helpful I should be able to get to that next week.
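
The manual acceptance step described above can be sketched roughly as follows. This is a generic ABC rejection step, written in Python rather than R purely for illustration, and it assumes the simulated epidemics have already been generated and exported as (parameters, case-count series) pairs. The names `summary` and `abc_accept` are hypothetical and not part of ABSEIR, and the hand-picked summary statistics are placeholders.

```python
import math


def summary(curve):
    """Hypothetical summary statistics: (total cases, peak size, peak time)."""
    peak = max(curve)
    return (float(sum(curve)), float(peak), float(curve.index(peak)))


def abc_accept(observed, simulations, summarize, epsilon):
    """ABC rejection step on pre-simulated epidemics.

    `simulations` is an iterable of (params, curve) pairs. A parameter set
    is kept when the Euclidean distance between its summary statistics and
    those of the observed data is at most `epsilon`.
    """
    s_obs = summarize(observed)
    accepted = []
    for params, curve in simulations:
        dist = math.dist(s_obs, summarize(curve))
        if dist <= epsilon:
            accepted.append((params, dist))
    return accepted


# Toy data standing in for simulator output (values are made up).
observed = [1, 3, 7, 4, 2]
simulations = [
    ({"beta": 0.2}, [1, 2, 3, 2, 1]),
    ({"beta": 0.4}, [1, 3, 6, 5, 2]),
    ({"beta": 0.8}, [2, 9, 15, 6, 1]),
]
accepted = abc_accept(observed, simulations, summary, epsilon=2.0)
```

Because `summarize` is passed in as an argument, a learned summary statistic (for example, the forward pass of a trained neural network) could be substituted for `summary` without changing the acceptance logic.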

The paper that you've cited also has some of the flavor of emulator methods - if you haven't checked it out, that sort of work has also been done with compartmental models.

Arminn-Potgieter commented 3 years ago

Thank you very much for your thorough and speedy reply. Some example code would be greatly appreciated, thank you very much! I will also check out the emulator methods, thank you for the tip!

grantbrown commented 3 years ago

Sorry for the delay - after digging around, this test file shows what I was getting at:

https://github.com/grantbrown/ABSEIR/blob/master/tests/testthat/testDistanceModels.R

The idea is that you can set up dummy inputs and a set of parameters of interest, and simulate as many epidemics as you like. There's still going to be considerable overhead processing that on the R side, but accessing it via C++ would be more complex (and probably wouldn't fit with what you had in mind).