epinowcast / epidist

An R package for estimating epidemiological delay distributions
http://epidist.epinowcast.org/
Other
9 stars 4 forks source link

User interaction with Stan code #78

Closed athowes closed 1 month ago

athowes commented 1 month ago

The #70 PR raised some issues about how we want the user to be able to interact with the Stan code passed in for specific models (in this case ltcad). This will be an ongoing issue as more models are added.

Currently:

Questions:

Moving forward:

athowes commented 1 month ago

As an example of bad thing due to current implementation:

fit_gamma <- epidist(
  data = prep_obs,
  formula = formula,
  family = epidist_family(prep_obs, family = "gamma"),
  priors = priors,
  stancode = epidist_stancode(
    prep_obs,
    family = epidist_family(prep_obs, family = "gamma")
  )
)

I think if you just did:

stancode = epidist_stancode(prep_obs)

then it would put in the default as epidist_stancode(prep_obs, epidist_family(prep_obs) which would give you the lognormal family. Perhaps easy fix to this with default arguments another way.

seabbs commented 1 month ago

For this specific example can't you pass family in the same way you can pass data

seabbs commented 1 month ago

I.e epidist_standata(data, family)

athowes commented 1 month ago

Issue here about how close to put the family related code together. Prefer to have it all in function but then that means having a combination of the family object and the Stan object together. Might be preferable (if) when we also have the family as a class

athowes commented 1 month ago

I propose to:

Thoughts on this proposal @seabbs @kgostic @parksw3? (I will bring this up next meeting [today] and record here anyway.)

My reasons for preferring this:

Uncertainties:

athowes commented 1 month ago

Third option:

Obvious option to change:

athowes commented 1 month ago

I have started working on this in https://github.com/epinowcast/epidist/pull/115.

So far what I have done is move stancode <- epidist_stancode(data = data, family = family) inside of epidist.default rather than have it as an argument. This fixes the bug I had (#110) as follows:

> meanlog <- 1.8
> sdlog <- 0.5
> obs_time <- 25
> sample_size <- 200
> 
> sim_obs <- simulate_gillespie() |>
+   simulate_secondary(
+     meanlog = meanlog,
+     sdlog = sdlog
+   ) |>
+   observe_process() |>
+   filter_obs_by_obs_time(obs_time = obs_time) %>%
+   .[sample(seq_len(.N), sample_size, replace = FALSE)]
> prep_obs <- epidist_prepare(sim_obs, model = "latent_individual")
> fit_gamma <- epidist(data = prep_obs, family = epidist_family(prep_obs, family = "gamma"))
Compiling Stan program...
Start sampling
Running MCMC with 4 sequential chains...

Chain 1 Iteration:    1 / 2000 [  0%]  (Warmup) 
Chain 1 Iteration:  100 / 2000 [  5%]  (Warmup) 
Chain 1 Iteration:  200 / 2000 [ 10%]  (Warmup) 
Chain 1 Iteration:  300 / 2000 [ 15%]  (Warmup) 
Chain 1 Iteration:  400 / 2000 [ 20%]  (Warmup) 

This matches the specification under "third option" in comment https://github.com/epinowcast/epidist/issues/78#issuecomment-2173444719 but I don't think is the complete solution we had in mind.

It's not clear to me the best next step now. Options as I see them:

  1. Leave it as it is
    • Note here: is it possible to put another argument stancode_overwrite = NULL such that if it's used then the user could put their own custom Stan code in?
  2. Move the "family" related function into the epidist_family.latent_individual function which then outputs some other object (I suggested a list in https://github.com/epinowcast/epidist/issues/78#issuecomment-2173102206 and don't recall any alternative proposals).
    • epidist.default would then contain code to reassemble the Stan code from the family-specific part and other part
    • The family argument could be removed from epidist_stancode (this is good because currently it's bad to pass it)
    • This raises point about how "default" the epidist method is now becoming

Tagging @seabbs to give input on this before I move forward with implementing.

seabbs commented 1 month ago

Nice issue.

but I don't think is the complete solution we had in mind.

Could you flesh this out a bit please?

athowes commented 1 month ago

Yep, as to why I had that impression:

  1. I recall an interest in keeping the family function code with the family object (this solution does not do that)
  2. Perhaps something I don't like about the current use of stancode <- epidist_stancode(data = data, family = family). Specifically I am unsure how this will work within epidist.default once we have other models. But perhaps it's fine.

If these do not seem important issues I would be happy to move forward with the solution as written in the PR currently, and then we can revisit it when we add more models.

I would also open an issue for "add custom Stan code prior" would would entail:

athowes commented 1 month ago

@seabbs ping here -- could you give thoughts before I either look to get a "leave as is" solution merged, or try implementing the "move the family into epidist_family option"?

seabbs commented 1 month ago

I think leave as is (the 2. Point is handled by S3 dispatch)

And make issues for your point about custom priors and reorg the family function I think?

athowes commented 1 month ago

I have made an issue for the priors point (https://github.com/epinowcast/epidist/issues/123).

I don't fully understand what making an issue for the family function reorganisation would entail but I am going to try.