nutterb / dbn

Simulation of Discrete-Time Dynamic Bayesian Networks

Initial Structure #2

Closed nutterb closed 7 years ago

nutterb commented 8 years ago

I suspect the initial setup will be similar to HydeNet.

We will need an object on which to act, so let's call that dbn. It will need the following attributes

The node attributes need to include

Rather than gathering all of the information needed to build a model, let's instead have users insert the model object directly. If we aren't pushing the simulation out to JAGS, we don't need as diverse an object to accommodate direct JAGS coding.

The only immediate downside I see to this is that it won't let us simulate from models for which we don't have an object, such as a published model. I'm not quite sure how to handle that yet.

Need a formula and a list method.

We will have to restrict the use of xtabs to a single variable, unless someone can give me a reasonable interpretation of what a multivariable xtabs should look like in a network.
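One possible reading of the single-variable restriction is that a one-way table of counts defines the marginal distribution of a node with no parents. The sketch below illustrates that reading only. It is written in Python for illustration, not in the package's R, and `table_to_probs` and `sample_root` are hypothetical names that do not come from dbn or HydeNet.

```python
import random
from collections import Counter


def table_to_probs(values):
    """Turn observed levels of ONE variable into marginal probabilities.

    This mimics a one-way frequency table: count each level, then
    divide by the total number of observations.
    """
    counts = Counter(values)
    total = sum(counts.values())
    return {level: n / total for level, n in counts.items()}


def sample_root(probs, n, rng):
    """Draw n values for a parentless node from its marginal distribution."""
    levels = list(probs)
    weights = [probs[level] for level in levels]
    return rng.choices(levels, weights=weights, k=n)


# Hypothetical data: four observations of a single categorical variable.
probs = table_to_probs(["M", "F", "F", "F"])
draws = sample_root(probs, 20, random.Random(1))
```

A table over two or more variables has no equally obvious reading here, which is the difficulty raised above.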

jarrod-dalton commented 8 years ago

I don't know - this is complicated. I'm not suggesting an alternative here, but just jotting down some thoughts.

A really simple-minded object hierarchy would be something like

tbn
 |--> temporalNode
 |--> staticNode

With some other detail & sub-classes, obviously.

I guess the point is that the object structure might look different enough for temporalNodes (compared to staticNodes) that we might want to make them explicit classes.
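A minimal sketch of what explicit classes could look like, written in Python purely for illustration (the package itself is R). The class names follow the hierarchy above, but every field and method here is an assumption and nothing is taken from the dbn source.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class StaticNode:
    """A node whose value does not change over time (e.g. sex)."""
    name: str


@dataclass
class TemporalNode:
    """A node that takes a new value at every time step."""
    name: str
    # Assumed layout: parent name -> lags used, where 1 means t-1.
    # An empty list marks a static parent, which has no lag.
    parents: Dict[str, List[int]] = field(default_factory=dict)


@dataclass
class TBN:
    """Container holding both kinds of node, keyed by name."""
    nodes: Dict[str, Union[StaticNode, TemporalNode]] = field(default_factory=dict)

    def add(self, node):
        """Register a node, rejecting duplicate names; returns self for chaining."""
        if node.name in self.nodes:
            raise ValueError(f"duplicate node: {node.name}")
        self.nodes[node.name] = node
        return self

    def names_of(self, kind):
        """Names of all nodes of the given class, in insertion order."""
        return [n.name for n in self.nodes.values() if isinstance(n, kind)]


net = TBN().add(StaticNode("sex")).add(TemporalNode("age", {"age": [1]}))
```

Keeping the two node types as separate classes means the lag information lives only where it is meaningful.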

For temporalNodes, we would need to specify node dependencies (names of nodes on which the node in question depends, not worrying about the nature of the relationships).

There are some important details here:

For example, we might want to model cholesterol(t) as a function of sex, age((t-5):(t-1)) and cortisol((t-2):(t-1)). We'd need to specify:

  1. that cholesterol, age and cortisol are temporalNodes
  2. that the sex node is a staticNode
  3. that the cholesterol node depends on sex, age and cortisol
  4. that the dependency of cholesterol on age involves the previous 5 values of age (t-1, t-2, ..., t-5), and that the dependency of cholesterol on cortisol involves the previous 2 values of cortisol (t-1 and t-2)
  5. equations/distributions/models for all nodes, being careful that such equations/distributions/models are only allowed to reference what has been specified in the dependencies
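The cholesterol example can be written down as a small dependency declaration plus a check that a model only references what was declared. This is a sketch in Python for illustration only. The dictionary layout and the name `undeclared_references` are assumptions, not part of dbn.

```python
# Declared dependencies of cholesterol(t), following the example above.
# Assumed convention: None marks a static parent; a list gives the
# allowed lags, where 1 means t-1.
CHOLESTEROL_DEPS = {
    "sex": None,
    "age": [1, 2, 3, 4, 5],
    "cortisol": [1, 2],
}


def undeclared_references(declared, referenced):
    """Return the (name, lag) pairs a model uses that were not declared.

    `referenced` is an iterable of (name, lag) pairs; lag is None for a
    static parent and a positive integer k for the value at t-k.
    """
    problems = []
    for name, lag in referenced:
        if name not in declared:
            problems.append((name, lag))
            continue
        allowed = declared[name]
        if allowed is None:
            # A static parent may only be referenced without a lag.
            if lag is not None:
                problems.append((name, lag))
        elif lag not in allowed:
            problems.append((name, lag))
    return problems
```

An empty result means the model stays within its declared dependencies. For instance, a model that uses cortisol at t-3 would be reported, because only t-1 and t-2 were declared.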

Of course, an automatic means of populating the dependencies based on what's embedded in the model objects would be very helpful.
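As a rough illustration of populating dependencies automatically, the sketch below pulls node names and lags out of a model written as a string. It is Python for illustration only, and the `name[t-k]` notation is invented for this example. In R the equivalent would more likely inspect a formula or fitted model object, which is not shown here.

```python
import re

# A term is an identifier optionally followed by a lag such as [t-2].
# The leading \b keeps the pattern from matching inside numbers like 1e5.
_TERM = re.compile(r"\b([A-Za-z_]\w*)(?:\[t-(\d+)\])?")


def extract_dependencies(expr):
    """Map each referenced name to its sorted lags, or None if never lagged.

    Names that only appear without a lag are treated as static parents.
    """
    deps = {}
    for name, lag in _TERM.findall(expr):
        if lag == "":
            # Bare name: record as static unless lags were already seen.
            deps.setdefault(name, None)
        else:
            lags = deps.get(name) or []
            if int(lag) not in lags:
                lags.append(int(lag))
            deps[name] = sorted(lags)
    return deps


deps = extract_dependencies("sex + age[t-1] + age[t-2] + 0.5 * cortisol[t-1]")
```

The result could then seed the dependency declaration, leaving the user to correct it only where the parser guesses wrong.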

jarrod-dalton commented 7 years ago

Just beginning to play around with this. Sorry about the delay. What follows are some rather free-form notes as I work my way through the code. I'll leave it up to you to parse this into notable issues (if any).

nutterb commented 7 years ago
  1. I did intend for dag_structure to be unexported. I am just storing it in its own file, instead of defining it in dbn. I don't normally export utility functions unless there is a belief that the user may find it useful in other settings. I can't think of a use case for dag_structure elsewhere right now.

  2. Parallelization is fairly simple to accomplish, and will be considered when I actually get to simulation. I'll prepare that argument in whatever function does the simulations.

  3. I had been operating under the assumption that we were abandoning the truly Bayesian approach. This had been one of the reasons I was exploring package names around "Dynamic Systems" instead of "Dynamic Bayesian." If you want to grow this into a Bayesian approach eventually, I'll need to rethink the strategy again. It won't change a lot, but if this is eventually going Bayesian, doing a purely Bayesian implementation makes a lot more sense. JAGS and Stan are both inherently multi-threaded (eliminating the need for us to implement parallel methods) and are indifferent to whether we are searching forward or backward in time (eliminating the need for us to implement checks that we only go forward). If this is where we want to go, I'll need to think hard about whether dbn should be an extension of HydeNet, an alternative to HydeNet, or a HydeNet 2.0. The answer to that question isn't immediately obvious to me.
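For the parallelization argument mentioned in item 2, one common shape is a `workers` parameter on the simulation function, with one seed per replicate so results do not depend on how many workers run. The sketch below is Python for illustration only. `simulate`, `simulate_one`, and the toy autoregressive update are all assumptions rather than anything in dbn. A thread pool is used to keep the example simple, and a process pool would be the usual substitute for CPU-bound work.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def simulate_one(seed, n_steps):
    """Run one replicate of a toy forward simulation with its own RNG."""
    rng = random.Random(seed)
    x = 0.0
    path = []
    for _ in range(n_steps):
        # Toy update: next value depends only on the previous value.
        x = 0.9 * x + rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def simulate(n_sims, n_steps, workers=1, seed=0):
    """Run n_sims replicates, optionally spread across worker threads.

    Each replicate gets a seed derived from its index, so the output is
    identical for any value of `workers`.
    """
    seeds = [seed + i for i in range(n_sims)]
    if workers <= 1:
        return [simulate_one(s, n_steps) for s in seeds]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with seeds.
        return list(pool.map(lambda s: simulate_one(s, n_steps), seeds))


serial = simulate(4, 10, workers=1, seed=7)
threaded = simulate(4, 10, workers=3, seed=7)
```

Tying the random stream to the replicate rather than to the worker is what keeps serial and parallel runs comparable.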

jarrod-dalton commented 7 years ago
  1. How about a cop-out? "dbn" could just as well stand for "dynamic belief networks", which doesn't explicitly address whether or not they are truly Bayesian inference machines. I do like the idea of sticking with a strictly forecasting package (from now into the future, no past), for reasons stated above. It also allows for a much richer class of models that could feasibly be incorporated into the system (e.g., any function that takes parents as inputs and outputs some vector of predicted values for the child node).
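A forward-only simulator in which each node's model is just a function of its parents might be sketched as follows. This is Python for illustration only. `simulate_forward`, its argument layout, and the toy age and cholesterol models are assumptions made for the example, not the package's design.

```python
def simulate_forward(models, lags, static, history, n_steps):
    """Step every temporal node forward n_steps times.

    models  : node name -> function(static, lagged) returning the next value
    lags    : node name -> {parent name: [lags]}, where 1 means t-1
    static  : values of static nodes, passed unchanged to every model
    history : node name -> past values, most recent last
    """
    series = {name: list(values) for name, values in history.items()}
    for _ in range(n_steps):
        new_values = {}
        for name, model in models.items():
            # Collect only the lagged parent values this node declared.
            lagged = {
                parent: [series[parent][-k] for k in ks]
                for parent, ks in lags[name].items()
            }
            new_values[name] = model(static, lagged)
        # Append after all nodes are computed, so each model sees only the past.
        for name, value in new_values.items():
            series[name].append(value)
    return series


# Toy models: any callable works, whether an equation or a fitted predictor.
models = {
    "age": lambda s, l: l["age"][0] + 1,
    "chol": lambda s, l: 0.5 * l["chol"][0] + l["age"][0]
    + (10 if s["sex"] == "M" else 0),
}
lags = {"age": {"age": [1]}, "chol": {"chol": [1], "age": [1]}}
result = simulate_forward(
    models, lags, {"sex": "F"}, {"age": [50], "chol": [200.0]}, n_steps=2
)
```

Because the loop only ever reads values already in `series`, there is no way for a model to reach into the future, which is the forecasting-only restriction described above.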