Closed nutterb closed 9 years ago
Why are there 2 dimensions in cpt.angio[pe, ]
? I would have thought this to be a uni-dimensional vector cpt.angio[pe]
.
Same thing for cpt.death[pe,treat,]
(3 vs. 2 dimensions)...
I think this is because the last dimension corresponds to the child node. Brain is foggy as I haven't looked at it in a week or so.
Yup, that's why. It's okay to be foggy still.
why is logistic regression used in pi.angio
and pi.death
?
If I am understanding correctly, the issue is that cpt
will always give tables that represent all values of the child node, where this is not the case for situations in which logistic regression is used.
If I further understand correctly, JAGS is trying to call dcat
with single-element probabilities, and doesn't know what to do with them. Would calling dbern
instead be an option? That is, does dbern
know what you're talking about if you only give it one probability value? If it is, we would maybe need to build logic in which detects situations where the child node has only two values.
So you're suggesting I try
pi.angio[dbern(pe), ]
?
I'll give that a shot and see what happens.
pi.angio <- cpt.angio[pe,]
angio ~ dbern(pi.angio)
???
It looks like I'm going to have to create the vector in the same way I created the vectors for multinomial logistic regression.
model{
pi.gear[1] <- 1 - (exp(-0.32241 + 10.17042*(am == 2) + -0.52456*(cyl == 2) + -12.63344*(cyl == 3) + 0.04434*mpg) / (1 + exp(-0.32241 + 10.17042*(am == 2) + -0.52456*(cyl == 2) + -12.63344*(cyl == 3) + 0.04434*mpg) + exp(-9.41827 + 18.83984*(am == 2) + -0.29636*(cyl == 2) + -0.64508*(cyl == 3) + 0.02031*mpg)) + exp(-9.41827 + 18.83984*(am == 2) + -0.29636*(cyl == 2) + -0.64508*(cyl == 3) + 0.02031*mpg) / (1 + exp(-0.32241 + 10.17042*(am == 2) + -0.52456*(cyl == 2) + -12.63344*(cyl == 3) + 0.04434*mpg) + exp(-9.41827 + 18.83984*(am == 2) + -0.29636*(cyl == 2) + -0.64508*(cyl == 3) + 0.02031*mpg))); pi.gear[2] <- exp(-0.32241 + 10.17042*(am == 2) + -0.52456*(cyl == 2) + -12.63344*(cyl == 3) + 0.04434*mpg) / (1 + exp(-0.32241 + 10.17042*(am == 2) + -0.52456*(cyl == 2) + -12.63344*(cyl == 3) + 0.04434*mpg) + exp(-9.41827 + 18.83984*(am == 2) + -0.29636*(cyl == 2) + -0.64508*(cyl == 3) + 0.02031*mpg)); pi.gear[3] <- exp(-9.41827 + 18.83984*(am == 2) + -0.29636*(cyl == 2) + -0.64508*(cyl == 3) + 0.02031*mpg) / (1 + exp(-0.32241 + 10.17042*(am == 2) + -0.52456*(cyl == 2) + -12.63344*(cyl == 3) + 0.04434*mpg) + exp(-9.41827 + 18.83984*(am == 2) + -0.29636*(cyl == 2) + -0.64508*(cyl == 3) + 0.02031*mpg))
gear ~ dcat(pi.gear)
pi.am[1] <- 0.59375; pi.am[2] <- 0.40625
am ~ dcat(pi.am)
pi.cyl[1] <- 0.34375; pi.cyl[2] <- 0.21875; pi.cyl[3] <- 0.4375
cyl ~ dcat(pi.cyl)
mpg ~ dnorm(20.09062, 0.16592)
}
I feel like that may work but is not the best solution. Somehow we should be able to feed the probability array to JAGS and use either dcat
or dbern
with that array.
When I simply run cpt(angio ~ pe, data=PE)
, I get all the numbers needed:
angio
pe Negative Positive
No 0.9025465 0.09745348
Yes 0.2576419 0.74235808
I noticed that, in the above HydeNetwork(..., data=PE)
call, when cpt
is used to set distributions for pe
and death
, the network$nodeModel
list element is NULL.
I think this may have repercussions in the portion of compileJagsModel()
which works with cpt_arrays
?
using cpt(angio ~ pe, data = PE)
certainly works for the case where pe
has no parents. This gives a static probability model.
However, when we estimate pe
from its own parents, a static probability model for angio
doesn't seem appropriate. It seems we would want to change the probability inputs into angio
based on the estimate of pe
. Am I correct?
If we use the above model, the variables
angio
anddeath
get fit withcpt
because they are both factor variables.I get the following output from
writeNetworkModel
as currently written.This creates and error. in the deterministic defintion of
pi.angio
andpi.death
, JAGS wants the values incpt.death[pe, treat,]
to be probability vectors. But those are both single element probability values returned by the logistic regression. So here's the question, should I write the probability vectors based on the observed values in thecpt
object, or should I do something crazy likepi.pe[1] <- (1 - pe); pi.pe[2] <- pe
, which would allow the input probabilities to be based on the posterior value frompe
? It seems the latter is what I should do, but wanted to know what you think.I feel like we're so so so close.