Collinearity not available for models of class 'hurdle' #606

Closed by MarcRieraDominguez 12 months ago

MarcRieraDominguez commented 1 year ago

Hi! The check_collinearity() function throws an error when applied to a model of class hurdle, although the documentation of performance indicates that such models are supported. For now, a workaround is to fit the equivalent hurdle model with glmmTMB (a truncated count family plus a zero-inflation formula) and take the variance inflation factors from that model. Congratulations on the great package!

data("bioChemists", package = "pscl")
summary(bioChemists)
#>       art            fem           mar           kid5             phd       
#>  Min.   : 0.000   Men  :494   Single :309   Min.   :0.0000   Min.   :0.755  
#>  1st Qu.: 0.000   Women:421   Married:606   1st Qu.:0.0000   1st Qu.:2.260  
#>  Median : 1.000                             Median :0.0000   Median :3.150  
#>  Mean   : 1.693                             Mean   :0.4951   Mean   :3.103  
#>  3rd Qu.: 2.000                             3rd Qu.:1.0000   3rd Qu.:3.920  
#>  Max.   :19.000                             Max.   :3.0000   Max.   :4.620  
#>       ment       
#>  Min.   : 0.000  
#>  1st Qu.: 3.000  
#>  Median : 6.000  
#>  Mean   : 8.767  
#>  3rd Qu.:12.000  
#>  Max.   :77.000

list.mod <- list(
  hurdle = pscl::hurdle(art ~ fem + mar, data = bioChemists, dist = "poisson", zero.dist = "binomial", link = "logit"),
  glmmtmb = glmmTMB::glmmTMB(art ~ fem + mar, data = bioChemists, family = glmmTMB::truncated_poisson(), ziformula = ~ fem + mar)
)

lapply(list.mod, class)
#> $hurdle
#> [1] "hurdle"
#> $glmmtmb
#> [1] "glmmTMB"

lapply(list.mod, summary)
#> $hurdle
#> Call:
#> pscl::hurdle(formula = art ~ fem + mar, data = bioChemists, dist = "poisson", 
#>     zero.dist = "binomial", link = "logit")
#> Pearson residuals:
#>     Min      1Q  Median      3Q     Max 
#> -1.1392 -1.0178 -0.3446  0.3905 10.2843 
#> Count model coefficients (truncated poisson with log link):
#>              Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)  0.847598   0.064084  13.226  < 2e-16 ***
#> femWomen    -0.237351   0.064199  -3.697 0.000218 ***
#> marMarried   0.008846   0.066944   0.132 0.894867    
#> Zero hurdle model coefficients (binomial with logit link):
#>             Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)  0.90794    0.15640   5.805 6.42e-09 ***
#> femWomen    -0.24195    0.14923  -1.621    0.105    
#> marMarried   0.07802    0.15636   0.499    0.618    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
#> Number of iterations in BFGS optimization: 9 
#> Log-likelihood: -1670 on 6 Df
#> $glmmtmb
#>  Family: truncated_poisson  ( log )
#> Formula:          art ~ fem + mar
#> Zero inflation:       ~fem + mar
#> Data: bioChemists
#>      AIC      BIC   logLik deviance df.resid 
#>   3352.2   3381.1  -1670.1   3340.2      909 
#> Conditional model:
#>              Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)  0.847585   0.064084  13.226  < 2e-16 ***
#> femWomen    -0.237339   0.064199  -3.697 0.000218 ***
#> marMarried   0.008862   0.066944   0.132 0.894684    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> Zero-inflation model:
#>             Estimate Std. Error z value Pr(>|z|)    
#> (Intercept) -0.90794    0.15640  -5.805 6.42e-09 ***
#> femWomen     0.24195    0.14923   1.621    0.105    
#> marMarried  -0.07802    0.15636  -0.499    0.618    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

performance::check_collinearity(list.mod$hurdle, component = "all")
#> Error in x$terms %||% attr(x, "terms") %||% stop("no terms component nor attribute"): no terms component nor attribute

performance::check_collinearity(list.mod$glmmtmb, component = "all")
#> # Check for Multicollinearity
#> * conditional component:
#> Low Correlation
#>  Term  VIF   VIF 95% CI Increased SE Tolerance Tolerance 95% CI
#>   fem 1.06 [1.02, 1.20]         1.03      0.95     [0.83, 0.98]
#>   mar 1.06 [1.02, 1.20]         1.03      0.95     [0.83, 0.98]
#> * zero inflated component:
#> Low Correlation
#>  Term  VIF   VIF 95% CI Increased SE Tolerance Tolerance 95% CI
#>   fem 1.07 [1.02, 1.20]         1.03      0.94     [0.83, 0.98]
#>   mar 1.07 [1.02, 1.20]         1.03      0.94     [0.83, 0.98]

strengejacke commented 12 months ago

Thanks, should be fixed.

MarcRieraDominguez commented 12 months ago

Thanks to you for the awesome package!