tjhladish / AbcSmc

Sequential Monte Carlo Approximate Bayesian Computation with Partial Least Squares parameter estimator
GNU General Public License v3.0

scenario DSL #9

Open pearsonca opened 2 years ago

pearsonca commented 2 years ago

One of the core AbcSmc capabilities is management of scenario projections: AbcSmc will systematically build up a scenario run plan and then run that plan.

However, there are use-case gaps and pain points. AbcSmc currently only directly supports an "all combinations" build, so excluded combinations have to be manually pruned after construction. This is wasteful (when there is a lot of pruning), error-prone and onerous (because the user has to write the pruning code), and may not be particularly portable (e.g. if AbcSmc supports different backends in the future, users would have to write different pruning code for each backend).
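As a rough illustration of the alternative, the sketch below applies an exclusion rule while the plan is being built, so the rule lives in the plan definition rather than in backend-specific cleanup code. All names here (`Scenario`, `Exclude`, `build_plan`) are hypothetical and not part of AbcSmc; the sketch still visits every combination, it just never stores the excluded ones.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical types: a scenario is one chosen level index per dimension.
using Scenario = std::vector<std::size_t>;
using Exclude = std::function<bool(const Scenario&)>;

// Enumerate all combinations of level indices (odometer style, last
// dimension fastest), keeping only those the exclusion rule accepts.
std::vector<Scenario> build_plan(const std::vector<std::size_t>& level_counts,
                                 const Exclude& exclude) {
    std::vector<Scenario> plan;
    if (level_counts.empty()) return plan;
    for (std::size_t n : level_counts) {
        if (n == 0) return plan;  // a dimension with no levels yields no scenarios
    }

    Scenario current(level_counts.size(), 0);
    while (true) {
        if (!exclude(current)) plan.push_back(current);

        // Advance the odometer; return once the first dimension wraps around.
        std::size_t d = level_counts.size();
        while (d > 0) {
            --d;
            if (++current[d] < level_counts[d]) break;
            current[d] = 0;
            if (d == 0) return plan;
        }
    }
}
```

For example, with two dimensions of 2 and 3 levels and a rule rejecting the pair (1, 2), the resulting plan would hold 5 scenarios instead of 6.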

AbcSmc also doesn't distinguish between parameters (fitted elements) and settings (scenario assumptions). That smells a bit funny, though it's hard to specify the precise problems there. For future capabilities, we might want to have AbcSmc work either from a computed prior (i.e. the resulting posterior cloud from fitting) or from a sampling one (i.e. just draw a sample from this analytic distribution), and having distinct parameters versus settings seems a natural boundary to set, which would make that swap easier.
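One way to picture that swap is a common interface for "where parameter draws come from". This is only a sketch under assumed names (`ParameterSource`, `PosteriorCloud`, `NormalPrior` are hypothetical, not AbcSmc classes), and it uses a normal distribution purely as a stand-in for any analytic prior.

```cpp
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical interface: anything that can yield one parameter vector.
struct ParameterSource {
    virtual ~ParameterSource() = default;
    virtual std::vector<double> draw(std::mt19937& rng) = 0;
};

// "Computed prior": resample uniformly from stored posterior rows.
// Assumes rows is non-empty.
class PosteriorCloud : public ParameterSource {
  public:
    explicit PosteriorCloud(std::vector<std::vector<double>> rows)
        : rows_(std::move(rows)) {}
    std::vector<double> draw(std::mt19937& rng) override {
        std::uniform_int_distribution<std::size_t> pick(0, rows_.size() - 1);
        return rows_[pick(rng)];
    }
  private:
    std::vector<std::vector<double>> rows_;
};

// "Sampling prior": draw a single parameter from an analytic distribution.
class NormalPrior : public ParameterSource {
  public:
    NormalPrior(double mean, double sd) : dist_(mean, sd) {}
    std::vector<double> draw(std::mt19937& rng) override {
        return {dist_(rng)};
    }
  private:
    std::normal_distribution<double> dist_;
};
```

Scenario settings would sit outside this interface entirely, which is the boundary the comment above is pointing at.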

So, in general, we ought to formally describe what the domain-specific language (DSL) for AbcSmc is. This is currently captured ad hoc in the JSON configuration files; these blend engineering settings, fitting parameters, targets, and scenario parameters, but often neglect the actual desired outputs.

pearsonca commented 1 year ago

A related example problem that would be solved by distinguishing parameters from scenario settings: while parameters need to be numeric for the PLS element (though possibly ordinal or even cardinal categorical variables could work? have to look into that), it doesn't necessarily make sense for that to constrain scenario settings.

Separating scenarios out would mean extending the simulator to accept something like:

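#include <string>
#include <vector>

// Non-fitted scenario assumptions, grouped by value type.
struct ScenarioSettings {
  std::vector<int> integer_vals;
  std::vector<double> numeric_vals;
  std::vector<std::string> string_vals;
};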

i.e. a more flexible / natural match for inputs.

However, such an extension would likely be breaking. It can possibly be formulated with some defaults, so that older code can be updated just by adding an ignored input argument (and the extension to have a scenario table would have to be done in a way that doesn't force old code with scenario settings declared as parameters to suddenly start having scenarios).
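A minimal sketch of what "adding an ignored input argument" might look like, assuming a simplified simulator signature. The functions `legacy_simulator` and `scenario_simulator` are hypothetical and do not match AbcSmc's actual simulator interface; the toy bodies only sum the parameters so the effect of the settings is visible.

```cpp
#include <string>
#include <vector>

struct ScenarioSettings {
    std::vector<int> integer_vals;
    std::vector<double> numeric_vals;
    std::vector<std::string> string_vals;
};

// Older simulator, updated only by adding a defaulted argument it ignores.
// Scenario values it cares about are still passed in pars, as before.
std::vector<double> legacy_simulator(
        const std::vector<double>& pars,
        const ScenarioSettings& /*unused*/ = ScenarioSettings{}) {
    double total = 0.0;
    for (double p : pars) total += p;
    return {total};
}

// Newer simulator that reads a scenario setting when one is supplied
// and falls back to a neutral default otherwise.
std::vector<double> scenario_simulator(
        const std::vector<double>& pars,
        const ScenarioSettings& settings = ScenarioSettings{}) {
    const double scale =
        settings.numeric_vals.empty() ? 1.0 : settings.numeric_vals[0];
    double total = 0.0;
    for (double p : pars) total += p;
    return {scale * total};
}
```

With an empty `ScenarioSettings` both functions behave identically, which is roughly the property needed for old configurations to keep working unchanged.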