Open martinmodrak opened 2 years ago
I would be excited to push this forward if you think one of the following (preferably their intersection) can be considered a complex model from your point of view. Links are to related issues for each module.
@Dashadower will first attack hierarchical and @hyunjimoon will attack high dimension.
Building on @martinmodrak's small workflow, @tomfid and I will write a document on the Bayesian workflow for an ODE model of urban dynamics (mdl file).
This satisfies at least two of the categories above (ODE, high dimension). Moreover, Vicky's research on urban scaling, which regresses the log of a city index such as congestion or income inequality on the log of city population, opens a path toward hierarchical Bayesian modeling. In short, hierarchical Bayesian modeling starts from viewing $\beta$ as a random variable (hence the names Bayesian regression or random effects model) and giving it a prior distribution. Then we replace the scalar values of the prior's parameters with further random variables. This allows multi-level learning from data and places the model, as partial pooling, between the no-pooling and complete-pooling models. This document is a good introduction, as it has both Stan code and a similar model structure (log-log).
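The pooling idea can be illustrated with a minimal sketch. It is not the Stan model from the linked document: it fits a log-log slope per region by least squares (no pooling), fits one slope to all data (complete pooling), and then shrinks each regional slope toward a precision-weighted mean (partial pooling). The simulated data, the region count, and the between-region variance `tau2` are all assumptions made for illustration; a full hierarchical model would estimate `tau2` rather than fix it.

```python
import random


def ols_slope(xs, ys):
    """Least-squares slope of ys on xs and the estimated variance of that slope."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    resid = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return slope, resid / (n - 2) / sxx


def partial_pool(slopes, variances, tau2):
    """Shrink each slope toward the precision-weighted mean.

    tau2 is the between-group variance, fixed here by assumption.
    """
    weights = [1.0 / (v + tau2) for v in variances]
    mu = sum(w * b for w, b in zip(weights, slopes)) / sum(weights)
    shrunk = [(b / v + mu / tau2) / (1.0 / v + 1.0 / tau2)
              for b, v in zip(slopes, variances)]
    return mu, shrunk


# Simulated log-log data: 4 hypothetical regions, 8 cities each.
rng = random.Random(0)
slopes, variances, all_x, all_y = [], [], [], []
for _ in range(4):
    beta = rng.gauss(1.1, 0.1)                      # region-level elasticity
    xs = [rng.uniform(10.0, 15.0) for _ in range(8)]  # log population
    ys = [-3.0 + beta * x + rng.gauss(0.0, 0.3) for x in xs]  # log city index
    b, v = ols_slope(xs, ys)
    slopes.append(b)
    variances.append(v)
    all_x += xs
    all_y += ys

complete_pool, _ = ols_slope(all_x, all_y)          # one slope for all regions
mu, shrunk = partial_pool(slopes, variances, tau2=0.1 ** 2)
```

Each entry of `shrunk` lies between the region's own estimate and the shared mean `mu`; noisier regions are pulled further toward `mu`.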
Considering that $\beta$ is the (averaged) elasticity at a fixed time, from eq. 2.2 in the first paper above, Tom's writing on challenges for informative parameter setting in dynamic models is relevant. Following the excerpt from Bayesian workflow ("a pragmatic idea is to keep the priors and compute reasonable parameter values using the real data. This can be done either through rough estimates or by computing the actual posterior. We then suggest widening out the estimates slightly and using these as a prior for the SBC.") I recommend starting from a tight prior. The absence of prior knowledge or data is the elephant in the room, which I aim to address with the priordb project, where realistic parameter values that could be assumed are collectively learned.
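The "widen the estimates slightly" step quoted above could look like the following sketch. It assumes a normal prior and a multiplicative widening factor of 1.5; neither choice comes from the quoted text, and the draws are hypothetical values standing in for a rough estimate or an actual posterior.

```python
import statistics


def widened_normal_prior(draws, widen=1.5):
    """Return (mean, sd) for a normal prior centred on the draws, with sd inflated by `widen`."""
    if widen < 1.0:
        raise ValueError("widen should be >= 1 so the prior is not narrower than the estimate")
    mu = statistics.fmean(draws)
    sigma = statistics.stdev(draws) * widen
    return mu, sigma


# Hypothetical posterior draws of an elasticity-like parameter.
draws = [1.08, 1.12, 1.15, 1.10, 1.09, 1.13, 1.11, 1.14]
prior_mu, prior_sigma = widened_normal_prior(draws, widen=1.5)
```

The resulting pair can then be used as the location and scale of the prior from which SBC draws its simulated parameters.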
We will include:
For 2, the sections "Multiplicative error and the lognormal distribution", "Weakly informative priors", and "Priors for system parameters and noise scale" from this case study on population dynamics are a good place to start for setting prior distributions and parameters. This corresponds to "Specify_implicit" (H5.abc) in this Human-Machine collaboration table (HMC table).
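As a small illustration of why multiplicative error leads to a lognormal measurement model, the sketch below multiplies a positive true value by the exponential of normal noise. The values of `mu` and `sigma` are made up for illustration and are not taken from the case study.

```python
import math
import random


def simulate_multiplicative(mu, sigma, n, seed=0):
    """Simulate y = mu * exp(eps) with eps ~ Normal(0, sigma), i.e. y ~ LogNormal(log(mu), sigma)."""
    rng = random.Random(seed)
    return [mu * math.exp(rng.gauss(0.0, sigma)) for _ in range(n)]


obs = simulate_multiplicative(mu=50.0, sigma=0.2, n=2000)

# On the log scale the error is additive and centred on log(mu).
log_mean = sum(math.log(y) for y in obs) / len(obs)
```

All simulated observations are positive, and the noise scale `sigma` is relative to the size of the quantity, which is what makes a prior on it comparatively easy to set.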
We are considering including:
posterior approximator modules (MCMC, variational inference, optimization)
For 4, translating Vensim's .vpd to a Stan model block is the key, as we can then use its optimization engine, as in this restaurant revenue optimization example.
For 5, the aim is to find the cheaper(-est) computation that reaches conditioned precision (step 9 from HMC table).
stan_builder: I am developing this with @Dashadower on @JamesPHoughton's PySD (currently on the stan-backend branch, pull-requested here). @jandraor and I are trying this with three example models in https://github.com/Data4DM/BayesSD/discussions/76. @tomfid's help, especially regarding inferencedata, is valuable, as Vensim supporting this format would be crucial in connecting Vensim subscripts with hierarchical Bayesian models.
Also, @OriolAbril and @ahartikainen are helping to connect this to arviz. Thanks!
@martinmodrak @Dashadower, could the current SBC R library's output be easily transformed to inferencedata, by any chance? Or is there any reference code we can refer to, e.g. previous attempts by our community to connect posterior and arviz? @Jandraor and I are using different languages (R, Python) and wondered whether we can pool our plotting efforts by having a modularized data structure.
cc @mike-lawrence
this issue https://github.com/stan-dev/posterior/issues/85 sounds relevant to interoperability
Below is a rough plan of what I felt is needed for a large-model workflow. An enjoyable milestone is a Bayesian workflow dynamic model case study on prey-predator, SEIR, and inventory management by around March 2023. Thank you very much, all!
(.nc), model (.stan), plots (.png)
For this, I am trying to
connect stanify with Dynamic simulation scenario 1,2 (with @tomfid, @enekomartinmartinez, @tseyanglim, @JamesPHoughton's support)
1's result by putting many .nc files into one sbc.nc (with @Dashadower, @OriolAbril, @ahartikainen's support)
connect .nc output with SBC package via rvar concept (with @paul-buerkner, @martinmodrak, @jandraor's support)
outputs netcdf format (.nc). Scenarios to reach .nc.
.mdl, .xml to python objects (support Stella, which stanify lacks now)
generator.nc
.mdl is considering supporting .nc format; if this happens, it can output generator.nc and estimator.nc
.mdl to .stan and outputs one generator.nc and estimator.nc for baseline case (no hierarchy, no prior_draw's')
generator.nc and three (n_prior_draws) number of estimator.nc for SBC
generator.nc and two (n_subgroups) number of estimator.nc for hierarchical model
transform .nc to rvars, which SBC package supports. Three verifications needed:
.nc to rvars here (@mike-lawrence, could you confirm this?)
rvars here (@martinmodrak, could you confirm this?)
rvar-based Bayes visualization + empirical coverage plot is explainable enough for policy specification on dynamic model (@jandraor, could you confirm this?). Matthew's rvars explains how rvar can make visualization easy by grouping random variables, which may be relevant to inferencedata's data variable concept.
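To make the SBC rank and empirical coverage ideas concrete, here is a minimal sketch that does not use the SBC package, Stan, or .nc files. It assumes a toy conjugate model (prior mu ~ Normal(0, 1), observations y ~ Normal(mu, 1)) so that the exact posterior can be sampled directly. The function names and all settings are hypothetical.

```python
import random


def sbc_ranks(n_sims=1000, n_obs=5, n_post=99, seed=1):
    """Rank of each prior draw among draws from its own exact posterior."""
    rng = random.Random(seed)
    ranks = []
    for _ in range(n_sims):
        mu_true = rng.gauss(0.0, 1.0)                       # generator: prior draw
        ys = [rng.gauss(mu_true, 1.0) for _ in range(n_obs)]  # generator: simulated data
        # Estimator: exact posterior for Normal(0, 1) prior and unit-variance likelihood.
        post_mean = sum(ys) / (n_obs + 1)
        post_sd = (1.0 / (n_obs + 1)) ** 0.5
        draws = [rng.gauss(post_mean, post_sd) for _ in range(n_post)]
        ranks.append(sum(d < mu_true for d in draws))
    return ranks


def empirical_coverage(ranks, n_post, width):
    """Fraction of simulations whose prior draw falls inside the central `width` posterior interval."""
    lo = (1.0 - width) / 2.0 * (n_post + 1)
    hi = (1.0 + width) / 2.0 * (n_post + 1)
    return sum(lo <= r < hi for r in ranks) / len(ranks)


ranks = sbc_ranks()
coverage_50 = empirical_coverage(ranks, n_post=99, width=0.5)
```

If the estimator is calibrated, the ranks are uniform on 0..n_post and the empirical coverage of a central interval is close to its nominal width; a systematic gap between the two is what an empirical coverage plot is meant to reveal.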
Main ideas: