tidymodels / broom

Convert statistical analysis objects from R into tidy format
https://broom.tidymodels.org

Allow choosing the type of residuals in `tidy.betareg()` #1198

Closed by DanChaltiel 6 months ago

DanChaltiel commented 7 months ago

Hi,

When fitting a beta regression with betareg::betareg(), you often have to choose a non-default type of residual, because the default one may produce NaN.

For instance, consider the following reprex, taken from StackOverflow. The type argument of summary() is forwarded to residuals().

# load the data (the gist defines `dieet`)
devtools::source_gist("169bfa3a6c709fd2fd31c5bfa46648ee")
library(betareg)
# Percentage uses decimal commas: swap them for dots, then convert to numeric
dieet$Percentage <- gsub(",", ".", dieet$Percentage, fixed = TRUE)
dieet$Percentage <- as.numeric(dieet$Percentage)
model.beta <- betareg(Percentage ~ Kuikenweek, data = dieet)

summary(model.beta)
#> Warning in sqrt(v * (1 - hatvalues(object))): NaNs produced
#> Call:
#> betareg(formula = Percentage ~ Kuikenweek, data = dieet)
#> Standardized weighted residuals 2:
#> Error in quantile.default(x$residuals): missing values and NaN's not allowed if 'na.rm' is FALSE

summary(model.beta, type = "deviance")
#> 
#> Call:
#> betareg(formula = Percentage ~ Kuikenweek, data = dieet)
#> 
#> Deviance residuals:
#>     Min      1Q  Median      3Q     Max 
#> -2.6188 -0.4916  0.0198  0.4973  2.9149 
#> 
#> Coefficients (mean model with logit link):
#>                   Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)        -0.5872     0.3003  -1.955 0.050542 .  
#> KuikenweekWeek -2   0.1738     0.4208   0.413 0.679594    
#> [..]

tibble::tibble(
  pearson=residuals(model.beta, type = "pearson"),
  deviance=residuals(model.beta, type = "deviance"),
  response=residuals(model.beta, type = "response"),
  weighted=residuals(model.beta, type = "weighted"),
  sweighted=residuals(model.beta, type = "sweighted"),
  sweighted2=residuals(model.beta, type = "sweighted2")
)
#> Warning in sqrt(v * (1 - hatvalues(object))): NaNs produced
#> # A tibble: 36 × 6
#>    pearson deviance response weighted sweighted sweighted2
#>      <dbl>    <dbl>    <dbl>    <dbl>     <dbl>      <dbl>
#>  1  -0.739   -0.686  -0.120   -0.237    -0.686     -0.767 
#>  2   2.99     2.91    0.468    0.989     2.87       3.20  
#>  3   0.236    0.160   0.0383   0.0600    0.174      0.194 
#>  4  -0.484   -0.345  -0.0773  -0.129    -0.374     -0.419 
#>  5   0.683    0.373   0.0986   0.174     0.503      0.562 
#>  6   1.25     1.15    0.204    0.413     1.20       1.31  
#>  7  -0.501   -0.475  -0.0814  -0.164    -0.474     -0.530 
#>  8  -0.197   -0.213  -0.0320  -0.0734   -0.213     -0.238 
#>  9  -0.284   -0.148  -0.0453  -0.0635   -0.184     -0.206 
#> 10   0.283    0.216   0.0408   0.0258    0.0748     0.0836
#> # ℹ 26 more rows

Created on 2024-04-25 with reprex v2.1.0

Would it be possible to have a type argument in tidy.betareg() as well?

simonpcouch commented 7 months ago

Woah, TIL about devtools::source_gist()! That's slick.

Some notes-to-self:

I think I've convinced myself that this is in scope for the package's maintenance guidelines. I'd welcome a PR that passes dots to summary() in the betareg tidy method, and that documents and tests that behavior (one could model it after https://github.com/tidymodels/broom/pull/1153). :)
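
For readers following along, here is a rough sketch of what "passing dots to summary()" could look like. It is not broom's implementation: `tidy_betareg_dots()` is a hypothetical standalone helper, the output column names only imitate broom's usual ones, and the `GasolineYield` data shipped with betareg stands in for the gist data above. It assumes the summary object stores one coefficient matrix per model component, with estimate, standard error, statistic, and p-value in that column order.

```r
library(betareg)

# Hypothetical helper, not part of broom: tidy the coefficient tables of a
# betareg fit, forwarding `...` (for example `type`) to summary().
tidy_betareg_dots <- function(x, ...) {
  # Assumption: summary() on a betareg fit stores a list with one
  # coefficient matrix per model component
  coefs <- summary(x, ...)$coefficients
  pieces <- lapply(names(coefs), function(component) {
    m <- coefs[[component]]
    data.frame(
      component = component,
      term      = rownames(m),
      estimate  = m[, 1],
      std.error = m[, 2],
      statistic = m[, 3],
      p.value   = m[, 4],
      row.names = NULL
    )
  })
  # Stack the per-component tables into one data frame
  do.call(rbind, pieces)
}

# GasolineYield ships with betareg; it stands in for the gist data here
data("GasolineYield", package = "betareg")
fit <- betareg(yield ~ batch + temp, data = GasolineYield)
out <- tidy_betareg_dots(fit, type = "deviance")
out
```

With this shape, `type = "deviance"` reaches summary() through `...`, which is the same route the reprex above takes when calling `summary(model.beta, type = "deviance")` directly.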

simonpcouch commented 6 months ago

Closing re: discussion in https://github.com/tidymodels/broom/pull/1199#discussion_r1591505816.

github-actions[bot] commented 6 months ago

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.