Observed data API - Githubissues

SamuelBrand1 commented 8 months ago

Goal

The goal of this issue is to lockdown how we want EpiAware to ingest observed data, especially in light of #107 .

Current API

The data y_t is an argument to make_epi_aware constructor. If its y_t = missing then EpiAware generates the data from the defined generative model, otherwise it is treated as conditioning the model.

Possible other option: upside and downside

The other Turing approach would be to not have y_t as an argument. Conditioning on observed data is then done using condition or the syntax sugar |,

eg

inference_mdl = condition(generative_mdl, (y_t = my_case_data,))

or

inference_mdl = generative_mdl | (y_t = my_case_data,)

Upsides of this approach

Flexible
Close to the mathematical conception of the model.

Downsides of this approach

Not specifying a clear way of proving data could be:
- Confusing to users
- Relies on knowing variable names inside the model.

seabbs commented 8 months ago

Conditioning on observed data is then done using condition or the syntax sugar |,

Yes I like this approach. Is there anyway to make it easy and discoverable so that uses can know what they can condition and to make sure that condition errors if they pick something that doesn't exist (maybe both of these issues are on dynamicppl or turing.

As an interim measure we should make sure we document what observed data modules expect. We could also have a clear naming scheme for obs (i.e use a obs_ prefix.

seabbs commented 5 months ago

With the introduction of StackObservationModel in #249 we have an opinionated choice to support observations called y_t in the simple case and as a NamedTuple in the complex case which then gives name.y_t. I think this is a clear enough pattern and it seems robust.

Do we think there is more to do here or shall we close in favour of a new issue focussing on how we pass metadata?

seabbs commented 4 months ago

what do we think about this? I feel like we have started to condition more and more of the code on y_t being a pass in and so we should move to drop this? Push back @SamuelBrand1?

CDCgov / Rt-without-renewal

Observed data API #119

Goal

Current API

Possible other option: upside and downside

Upsides of this approach

Downsides of this approach