CDCgov / DynODE

CDC/CFA/Predict/Scenarios ODE Modeling Framework
Apache License 2.0

Allow user to provide likelihood distribution in the JSON config #45

Closed arik-shurygin closed 6 months ago

arik-shurygin commented 8 months ago

Goals: Let the config JSON files stand more on their own as an explanation of the model, by moving as much of the distribution selection process as possible into the config and away from separate Python files.

Solution: Allow the user to specify which distribution they would like to use for likelihood calculations between observed and model metrics.

Implementation: Add a keyword such as "INCIDENCE_DIST" to the JSON config that specifies which type of distribution the observed metrics will be compared against. This requires a partial call (such as functools.partial), because the distribution will not actually have its parameters until runtime, when the likelihood is calculated and used to update the sampler.
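A minimal sketch of how such a keyword could be resolved, assuming the config stores the distribution as a string name. The helper `resolve_incidence_dist`, the `ALLOWED_DISTS` allow-list, and the stand-in classes are all hypothetical and only there so the example runs without numpyro installed; in the real model the namespace would presumably be `numpyro.distributions`.

```python
import json
from dataclasses import dataclass
from functools import partial
from types import SimpleNamespace


# Stand-ins so the sketch runs without numpyro installed. In the real model
# the namespace would be numpyro.distributions, which provides
# NegativeBinomial2(mean, concentration) and Poisson(rate).
@dataclass
class NegativeBinomial2:
    mean: float
    concentration: float


@dataclass
class Poisson:
    rate: float


dist_namespace = SimpleNamespace(NegativeBinomial2=NegativeBinomial2, Poisson=Poisson)

# Hypothetical allow-list: only these names may be requested from the config.
ALLOWED_DISTS = {"NegativeBinomial2", "Poisson"}


def resolve_incidence_dist(config: dict, namespace=dist_namespace):
    """Return the distribution class named by config["INCIDENCE_DIST"]."""
    name = config["INCIDENCE_DIST"]
    if name not in ALLOWED_DISTS:
        raise ValueError(f"unsupported INCIDENCE_DIST: {name!r}")
    return getattr(namespace, name)


config = json.loads('{"INCIDENCE_DIST": "NegativeBinomial2"}')
dist_cls = resolve_incidence_dist(config)

# Parameters known ahead of time can be bound early with partial; the rest
# (here the mean) are supplied at runtime, as they would be in likelihood().
dist_factory = partial(dist_cls, concentration=2.0)
obs_dist = dist_factory(mean=10.0)
```

Binding `concentration` up front is purely illustrative. In the likelihood shown in this issue it is sampled at runtime, so it would be passed at call time alongside `mean`.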

Currently the last lines of MechanisticInferer.likelihood() look like this:

    model_incidence = jnp.diff(model_incidence, axis=0)

    # sample infection hospitalization rate here
    with numpyro.plate("num_age", self.config.NUM_AGE_GROUPS):
        ihr = numpyro.sample("ihr", Dist.Beta(0.5, 10))

    # scale model_incidence by the ihr, then apply NB observation model
    k = numpyro.sample("k", Dist.HalfCauchy(1.0))
    numpyro.sample(
        "incidence",
        Dist.NegativeBinomial2(
            mean=model_incidence * ihr, concentration=k
        ),
        obs=obs_metrics,
    )

We want to replace Dist.NegativeBinomial2() with a call to whatever distribution is provided in the config file, something like this:

    numpyro.sample(
        "incidence",
        self.config.INCIDENCE_DIST(
            mean=model_incidence * ihr, concentration=k
        ),
        obs=obs_metrics,
    )

arik-shurygin commented 6 months ago

This may be too involved to be a config parameter. Many incidence distributions, the negative binomial for example, take multiple parameters, and those parameters are often themselves sampled or depend on variables inside the mechanistic_inferer.likelihood() function. This is therefore best left to an explicit override.
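To illustrate the point, here is a rough sketch of what an explicit override could look like. `InfererSketch` and `PoissonInferer` are hypothetical, heavily simplified stand-ins (the real likelihood() samples its parameters with numpyro and takes different arguments), and the distribution classes are placeholders so the example runs without numpyro installed.

```python
from dataclasses import dataclass


@dataclass
class NegativeBinomial2:  # stand-in for numpyro.distributions.NegativeBinomial2
    mean: float
    concentration: float


@dataclass
class Poisson:  # stand-in for numpyro.distributions.Poisson
    rate: float


class InfererSketch:
    """Hypothetical, heavily simplified stand-in for MechanisticInferer."""

    def likelihood(self, model_incidence: float, ihr: float, k: float):
        # Default observation model: needs both a mean and a concentration.
        return NegativeBinomial2(mean=model_incidence * ihr, concentration=k)


class PoissonInferer(InfererSketch):
    """Explicit override that swaps in a distribution with a different signature."""

    def likelihood(self, model_incidence: float, ihr: float, k: float):
        # Poisson takes a single rate, so k goes unused here.
        return Poisson(rate=model_incidence * ihr)


default_dist = InfererSketch().likelihood(200.0, 0.25, 2.0)
override_dist = PoissonInferer().likelihood(200.0, 0.25, 2.0)
```

The two distributions take different keyword arguments (`mean` and `concentration` versus `rate`), which is why a single config-driven call such as `INCIDENCE_DIST(mean=..., concentration=...)` does not generalize, while an override can adapt the call to each distribution.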