Outbreak-analysis / factorialtemplate

a template for factorial simulation dependencies + supercomputing array jobs

incomplete crossing of (type,obs_error,proc_error) #1

Open bbolker opened 8 years ago

bbolker commented 8 years ago

Mike and I would like to propose/argue for a slight modification of the current full-factorial design. In particular, we have (keeping it down to 2x2x2 for illustration) type={discrete,hybrid} x proc={Poiss,bin} x obs={Poiss,bin}. I don't think adding betabin and NB adds anything qualitative to the argument.

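There are a bunch of issues in matching discrete and hybrid processes within the model, and in matching hybrid (continuous) processes with discrete data:

  • if we have a continuous (hybrid) observation model, we need to round or take slices of the CDF in order to get the probability/distribution of the discrete observations (or assume that the density function is approximately constant between (obs-0.5, obs+0.5)). So maybe we shouldn't do hybrid obs models, at least to start?
  • if we have a continuous process model and a binomial-based discrete observation model, then we need to do some sort of rounding (or multinomial sampling) to get the N (size) parameter for the observation process. e.g. if we have an incidence of 20.8, it's going to be hard to assume that obs ~ Binom(p,N=20.8) ... we could calculate the likelihood of this, but JAGS won't be happy. If we have Poisson-based (Poisson or NB) observation, then this constraint doesn't apply. So perhaps we should not allow this combination, at least to start?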

That leaves us with

process=discrete,bin; obs = bin   (classic CB)
process=discrete,bin; obs = Poiss (classic CB with large pop size)
process=discrete,Poiss; obs = bin   (OK as long as we don't accidentally exceed N)
process=discrete,Poiss; obs = Poiss (ditto)
process=hybrid,bin; obs = Poiss
process=hybrid,Poiss; obs = Poiss

The bottom line is that we would like to be able to exclude certain categories from our full-factorial design. This will also become important later when we're using multiple platforms and e.g. Stan will only work with hybrid models.

It would be nice to be able to specify a list of disallowed combinations, e.g.

type=hybrid x proc=* x obs=bin

or

type=discrete x proc=* x obs=* x platform=Stan

I don't know if there's an easy way to read this list from a file and parse it appropriately.
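A minimal sketch of how such a list could be parsed and applied, in Python rather than the repo's make-based loop. Everything here is an assumption for illustration: the function names (`parse_rule`, `load_rules`, `matches`, `allowed_combinations`), the convention of one rule per line, and `#` for comments are hypothetical and not part of the template. Each rule is read as a set of `factor=level` terms separated by ` x `, with `*` matching any level.

```python
from itertools import product

# Factor levels from the 2x2x2 illustration in this issue.
FACTORS = {
    "type": ["discrete", "hybrid"],
    "proc": ["Poiss", "bin"],
    "obs": ["Poiss", "bin"],
}


def parse_rule(line):
    """Parse 'type=hybrid x proc=* x obs=bin' into {'type': 'hybrid', ...}."""
    rule = {}
    for term in line.split(" x "):
        key, _, value = term.strip().partition("=")
        rule[key.strip()] = value.strip()
    return rule


def load_rules(lines):
    """Parse an iterable of lines, skipping blank lines and '#' comments."""
    rules = []
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            rules.append(parse_rule(line))
    return rules


def matches(rule, combo):
    """True if every term in the rule is '*' or equals the combo's level."""
    return all(v == "*" or combo.get(k) == v for k, v in rule.items())


def allowed_combinations(factors, rules):
    """Yield every crossing of the factor levels that no rule excludes."""
    names = list(factors)
    for levels in product(*(factors[name] for name in names)):
        combo = dict(zip(names, levels))
        if not any(matches(rule, combo) for rule in rules):
            yield combo


# The lines could equally come from open("exclusions.txt") (hypothetical file).
rules = load_rules(["# disallowed combinations", "type=hybrid x proc=* x obs=bin"])
design = list(allowed_combinations(FACTORS, rules))
for combo in design:
    print(combo)
```

With the single rule above, the eight crossings reduce to the six listed earlier. A rule that names a factor absent from the design (e.g. `platform=Stan` before a platform factor exists) simply never matches, unless its level is `*`.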

pearsonca commented 8 years ago

This seems doable. I'll need to think a bit about the details, but what seems most sensible is to replace the make-based loop with a script-based approach that would support a reasonable filtering syntax.

On Thu, Apr 21, 2016 at 15:32 Ben Bolker notifications@github.com wrote:

Mike and I would like to propose/argue for a slight modification of the current full-factorial design. In particular, we have (keeping it down to 2x2x2 for illustration) type={discrete,hybrid} x proc={Poiss,bin} x obs={Poiss,bin}. I don't think adding betabin and NB adds anything qualitative to the argument.

There are a bunch of issues in matching discrete and hybrid processes within the model, and in matching hybrid (continuous) processes with discrete data:

  • if we have a continuous (hybrid) observation model, we need to round or take slices of the CDF in order to get the probability/distribution of the discrete observations (or assume that the density function is approximately constant between (obs-0.5, obs+0.5)). So maybe we shouldn't do hybrid obs models, at least to start?
  • if we have a continuous process model and a binomial-based discrete observation model, then we need to do some sort of rounding (or multinomial sampling) to get the N (size) parameter for the observation process. e.g. if we have an incidence of 20.8, it's going to be hard to assume that obs ~ Binom(p,N=20.8) ... we could calculate the likelihood of this, but JAGS won't be happy. If we have Poisson-based (Poisson or NB) observation, then this constraint doesn't apply. So perhaps we should not allow this combination, at least to start?
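To illustrate the first point, here is a small sketch of the CDF-slice idea next to the constant-density approximation. The normal distribution and its parameters are purely hypothetical stand-ins for a continuous observation model, chosen because the Python standard library provides its CDF; the function names are also made up for this example, and the sketch ignores the fact that a normal puts some mass below zero.

```python
from statistics import NormalDist


def discretized_prob(dist, k):
    """P(obs = k) as the slice of the continuous CDF over (k - 0.5, k + 0.5)."""
    return dist.cdf(k + 0.5) - dist.cdf(k - 0.5)


def density_approx(dist, k):
    """Approximation that treats the density as constant over the unit interval."""
    return dist.pdf(k)


# Hypothetical continuous observation model centred on an incidence of 20.8.
obs_model = NormalDist(mu=20.8, sigma=4.0)
for k in (15, 20, 25):
    print(k, discretized_prob(obs_model, k), density_approx(obs_model, k))
```

The slices sum to one over all integers by construction, which the density approximation does not guarantee; the two agree closely only when the density changes slowly across a unit interval.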

That leaves us with

process=discrete,bin; obs = bin   (classic CB)
process=discrete,bin; obs = Poiss (classic CB with large pop size)
process=discrete,Poiss; obs = bin   (OK as long as we don't accidentally exceed N)
process=discrete,Poiss; obs = Poiss (ditto)
process=hybrid,bin; obs = Poiss
process=hybrid,Poiss; obs = Poiss

The bottom line is that we would like to be able to exclude certain categories from our full-factorial design. This will also become important later when we're using multiple platforms and e.g. Stan will only work with hybrid models.

It would be nice to be able to specify a list of disallowed combinations, e.g.

type=hybrid x proc=* x obs=bin

or

type=discrete x proc=* x obs=* x platform=Stan

I don't know if there's an easy way to read this list from a file and parse it appropriately.


pearsonca commented 8 years ago

Poking around a bit, there does seem to be a continuous (as in, x & N real valued) version of the binomial distribution (e.g., here and here). Thanks, former USSR.

We would still need to round at the final observation stage (...maybe?), but some sort of rounding seems inevitable when using the hybrid model (in an intuitive / implicit sense, not necessarily in explicit calculations). So there may be a simple "always round at this stage" operation that is a no-op for discrete approaches which would let us avoid filtering -- of course, I wasn't there for your detailed conversation, so this may have already been covered.
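A rough sketch of what an "always round at this stage" step might look like, under the assumption that rounding to the nearest integer is acceptable (multinomial sampling, mentioned earlier in the thread, would be a different choice). The names `to_count` and `observe_binomial` are hypothetical, and the binomial draw is done with plain Bernoulli trials only to keep the example dependency-free.

```python
import random


def to_count(incidence):
    """Round to the nearest integer; integer input passes through unchanged.

    Note: Python's round() sends exact halves to the nearest even integer.
    """
    return int(round(incidence))


def observe_binomial(incidence, p, rng):
    """Draw obs ~ Binom(N, p), with N taken from the rounded incidence."""
    n = to_count(incidence)
    # Sum of n independent Bernoulli(p) trials.
    return sum(rng.random() < p for _ in range(n))


rng = random.Random(1)
print(observe_binomial(20.8, 0.5, rng))  # hybrid process: N becomes 21
print(observe_binomial(20, 0.5, rng))    # discrete process: N stays 20
```

Because `to_count` leaves integers alone, the same observation code can serve both discrete and hybrid process models, which is the sense in which it would let the design avoid filtering for this particular combination.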

N.B. regardless of figuring out if we can actually handle all combinations, I still think it makes sense to figure out handling excluded combinations. Those will definitely come up some day in an unavoidable sense, and even more certainly in a practical sense.