dgrtwo / dgrtwo.github.com

My website
MIT License
237 stars 210 forks source link

Bayesian hierarchical modeling: mutate: non-numeric argument to binary operator #11

Open stuppie opened 5 years ago

stuppie commented 5 years ago

I'm having an issue running the example from "Understanding empirical Bayesian hierarchical modeling (using baseball statistics)" http://varianceexplained.org/r/hierarchical_bayes_baseball/

I run everything before and up to:

crossing(bats = c("L", "R"),
         AB = c(10, 100, 1000, 10000)) %>%
  augment(fit2, newdata = .) %>%
  mutate(H = .3 * AB,
         alpha0 = .fitted / sigma,
         beta0 = (1 - .fitted) / sigma,
         alpha1 = alpha0 + H,
         beta1 = beta0 + AB - H,
         estimate = alpha1 / (alpha1 + beta1),
         conf.low = qbeta(.025, alpha1, beta1),
         conf.high = qbeta(.975, alpha1, beta1),
         record = paste(H, AB, sep = " / ")) %>%
  ggplot(aes(estimate, record, color = bats)) +
  geom_point() +
  geom_errorbarh(aes(xmin = conf.low, xmax = conf.high)) +
  labs(x = "Estimate w/ 95% credible interval",
       y = "Batting record",
       color = "Batting hand")

and get:

Error in mutate_impl(.data, dots) : 
  Evaluation error: non-numeric argument to binary operator.

I figured out the error is coming from the fact that sigma is a function, and so is causing the calculation of alpha0 to fail. I'm not too familiar with tidyr, and so am not sure how to pull out sigma for the predicted/fitted new data...

I appreciate your help in advance!

stuppie commented 5 years ago

I was able to recreate the figure with the following code:

fakedata = crossing(bats = c("L", "R"),
         AB = c(10, 100, 1000, 10000))

fakedata %>%
  mutate(H = .3 * AB,
         mu = predict(fit2, what = "mu", newdata = fakedata),
         sigma = predict(fit2, what = "sigma", newdata = fakedata, type = "response"),
         alpha0 = mu / sigma,
         beta0 = (1 - mu) / sigma,
         alpha1 = alpha0 + H,
         beta1 = beta0 + AB - H,
         estimate = alpha1 / (alpha1 + beta1),
         conf.low = qbeta(.025, alpha1, beta1),
         conf.high = qbeta(.975, alpha1, beta1),
         record = paste(H, AB, sep = " / ")
  ) %>%   ggplot(aes(estimate, record, color = bats)) +
  geom_point() +
  geom_errorbarh(aes(xmin = conf.low, xmax = conf.high)) +
  labs(x = "Estimate w/ 95% credible interval",
       y = "Batting record",
       color = "Batting hand")