dgrtwo / ebbr

Empirical Bayes binomial estimation
Other
69 stars 13 forks source link

ebb_fit_prior error when using beta-binomial regression #13

Closed adamkski closed 2 years ago

adamkski commented 2 years ago

When debugging the error below it seems to be caused by the call parameters <- broom::tidy(fit) in ebb_fit_prior().

# recreating chapter 11 of David Robinson's Introduction to Empricial Bayes

library(Lahman)
#> Warning: package 'Lahman' was built under R version 4.1.3
library(ebbr)
library(tidyverse)
#> Warning: package 'tidyverse' was built under R version 4.1.2
#> Warning: package 'ggplot2' was built under R version 4.1.2
#> Warning: package 'tibble' was built under R version 4.1.2
#> Warning: package 'tidyr' was built under R version 4.1.2
#> Warning: package 'readr' was built under R version 4.1.2
#> Warning: package 'purrr' was built under R version 4.1.2
#> Warning: package 'dplyr' was built under R version 4.1.2
#> Warning: package 'stringr' was built under R version 4.1.2
#> Warning: package 'forcats' was built under R version 4.1.2
theme_set(theme_light())

# grab career batting average of non-pitchers
pitchers <- 
  Pitching %>% 
  group_by(playerID) %>% 
  summarise(gamesPitched = sum(G)) %>% 
  filter(gamesPitched > 3)

# add player names
player_names <- 
  People %>% 
  tibble %>% 
  select(playerID, nameFirst, nameLast, bats) %>% 
  unite(name, nameFirst, nameLast, sep = " ")

career_full <- 
  Batting %>% 
  filter(AB > 0) %>% 
  anti_join(pitchers, by = "playerID") %>% 
  group_by(playerID) %>% 
  summarise(H = sum(H), AB = sum(AB), year = mean(yearID)) %>% 
  inner_join(player_names, by = "playerID") %>% 
  filter(!is.na(bats))

career <- 
  career_full %>% 
  select(-bats, -year)

# solve this with beta-binomial regression
eb_career_ab <- 
  career %>% 
  ebb_fit_prior(H, AB, method = "gamlss",
                mu_predictors = ~ log10(AB))
#> Warning in summary.gamlss(x): summary: vcov has failed, option qr is used instead
#> ******************************************************************
#> Family:  c("BB", "Beta Binomial") 
#> 
#> Call:  
#> gamlss::gamlss(formula = form, family = fam, data = tbl, sigma.predictors = sigma_predictors) 
#> 
#> 
#> Fitting method: RS() 
#> 
#> ------------------------------------------------------------------
#> Mu link function:  logit
#> Mu Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)    
#> (Intercept) -1.694982   0.009005 -188.23   <2e-16 ***
#> log10(AB)    0.193192   0.002768   69.79   <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> ------------------------------------------------------------------
#> Sigma link function:  log
#> Sigma Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  -6.3316     0.0225  -281.3   <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> ------------------------------------------------------------------
#> No. of observations in the fit:  10240 
#> Degrees of Freedom for the fit:  3
#>       Residual Deg. of Freedom:  10237 
#>                       at cycle:  5 
#>  
#> Global Deviance:     74715.37 
#>             AIC:     74721.37 
#>             SBC:     74743.07 
#> ******************************************************************
#> Error: $ operator is invalid for atomic vectors

Created on 2022-05-31 by the reprex package (v2.0.1)

adamkski commented 2 years ago

Update: I've found a temporary work around suggested in https://github.com/tidymodels/broom/issues/938.

library(broom.mixed) at the top of the script fixes this.