Posterior Predictions from Multiple Equations

rmcelreath / rethinking

Statistical Rethinking course and book package

2.1k stars 596 forks source link

So, I'm working with rethinking in a Structural Equation Modeling framework. It works beautifully, but, I'm having some issues in generating posterior predictions. Namely, effects of linked relationships don't seem to translate. Consider the following model

x1 -> y1 -> y2

where each coefficient is a slope of 2.

I can create a nice fake data frame

set.seed(31415)
ndf <- data.frame(x1=runif(100,0,100))
ndf$y1 <- rnorm(100, ndf$x1*2, 40)
ndf$y2 <- rnorm(100, ndf$y1*2, 80)

And then a model that fits nicely using map2stan

nf <- alist(
  y1 ~ dnorm(yhat1, sigma1),
  y2 ~ dnorm(yhat2, sigma2),

  yhat1 <-  b1*x1,
  yhat2 <- b2*y1,

  b1 ~ dnorm( 0 , 1000 ),
  b2 ~ dnorm( 0 , 1000 ),

  sigma1 ~ dcauchy( 0 , 1 ),
  sigma2 ~ dcauchy( 0 , 1 )

)

n.fit <- map2stan( 
  nf , 
  data=ndf , 
  start=list(a1=1, a2=1, b1=1, b2=1, sigma1=10, sigma2=10)
)

here

> coef(n.fit)
        b1         b2     sigma1     sigma2 
  1.930119   1.965780  37.934121  86.538586

Great! But, when I try and do posterior prediction with x starting as 3, which should produce roughly 12 for y2...

> link(n.fit, data=list(x1=3)) -> a
[ 1000 / 1000 ] 200 / 1000 ]

> sapply(a, mean)
   yhat1    yhat2 
5.790357 1.965780

Huh....why is yhat incorrect here?

nf <- alist( y1 ~ dnorm(yhat1, sigma1), y2 ~ dnorm(yhat2, sigma2), yhat2pred <- b2*yhat1, yhat1 <- b1*x1, yhat2 <- b2*y1, b1 ~ dnorm( 0 , 10 ), b2 ~ dnorm( 0 , 10 ), sigma1 ~ dcauchy( 0 , 1 ), sigma2 ~ dcauchy( 0 , 1 ) ) n.fit <- map2stan( nf , data=ndf , start=list(b1=1, b2=1, sigma1=10, sigma2=10) ) a <- link(n.fit, data=list(x1=3)) sapply(a,mean)

rmcelreath / rethinking

Posterior Predictions from Multiple Equations #15