easystats / performance

:muscle: Models' quality and performance metrics (R2, ICC, LOO, AIC, BF, ...)
https://easystats.github.io/performance/
GNU General Public License v3.0

check_model for GAMs #405

Open DominiqueMakowski opened 2 years ago

DominiqueMakowski commented 2 years ago

check_model() does not currently support GAMs, but some GAM-specific checks could be implemented (e.g., checking whether the basis dimension k is large enough):

m1 <- mgcv::gam(Sepal.Length ~ s(Petal.Length, k = 3), data = iris)
m2 <- mgcv::gam(Sepal.Length ~ s(Petal.Length, k = 10), data = iris)

performance::check_model(m1)
#> Error: $ operator is invalid for atomic vectors

# Visual checks
mgcv::gam.check(m1)  # mgcv::qq.gam(m1)

#> 
#> Method: GCV   Optimizer: magic
#> Smoothing parameter selection converged after 4 iterations.
#> The RMS GCV score gradient at convergence was 6.533645e-07 .
#> The Hessian was positive definite.
#> Model rank =  3 / 3 
#> 
#> Basis dimension (k) checking results. Low p-value (k-index<1) may
#> indicate that k is too low, especially if edf is close to k'.
#> 
#>                   k'  edf k-index p-value  
#> s(Petal.Length) 2.00 1.97    0.87   0.045 *
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# The test is implemented in k.check().
# Low p-values may indicate that the basis dimension, k, has been set too low,
# especially if the reported edf is close to k', the maximum possible EDF for the term.
mgcv::k.check(m1)
#>                 k'      edf   k-index p-value
#> s(Petal.Length)  2 1.973634 0.8679672  0.0375
mgcv::k.check(m2)
#>                 k'      edf   k-index p-value
#> s(Petal.Length)  9 6.095448 0.9578142  0.2675

Created on 2022-03-18 by the reprex package (v2.0.1)
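A side note on the output above: gam.check() and k.check() report slightly different p-values for the same model (0.045 vs 0.0375). As far as I understand, this is because the p-value is obtained by simulation (residuals are randomly reshuffled n.rep times), so it varies from call to call. A minimal sketch of how to make it reproducible, assuming a fixed seed is acceptable; the seed and the n.rep value are arbitrary choices for illustration:

```r
library(mgcv)

# Same model as m1 above
m1 <- gam(Sepal.Length ~ s(Petal.Length, k = 3), data = iris)

# The p-value comes from a random reshuffling of residuals, so fix the seed.
# n.rep (default 400) sets the number of reshuffles; more gives a finer p-value.
set.seed(2022)
first <- k.check(m1, n.rep = 1000)

set.seed(2022)
second <- k.check(m1, n.rep = 1000)

# With the same seed, both calls should return the same table
identical(first, second)
```

Any wrapper built on k.check() would probably need to expose n.rep (and document the seed dependence) so that results can be reproduced.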

DominiqueMakowski commented 2 years ago

brms models are also unsupported:

m1 <- brms::brm(Sepal.Length ~ s(Petal.Length, k = 3), data = iris, algorithm = "meanfield", refresh = 0)
#> Compiling Stan program...
#> Start sampling
#> Warning: Pareto k diagnostic value is 1.05. Resampling is disabled. Decreasing
#> tol_rel_obj may help if variational algorithm has terminated prematurely.
#> Otherwise consider using sampling instead.

performance::check_model(m1)
#> Error: Model could not be automatically converted to frequentist model.

Created on 2022-03-18 by the reprex package (v2.0.1)

DominiqueMakowski commented 2 years ago

I'll take a look at mgcv::k.check() and see whether it could be implemented and generalized in a check_k(), check_smooth() or check_gam() function.
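A rough sketch of what such a function might look like for mgcv models. This is only an assumption about a possible design, not existing performance code: the name check_k(), the alpha argument, the output column names and the flagging rule (low p-value together with k-index < 1, following the hint printed by gam.check()) are all hypothetical.

```r
# Hypothetical helper, NOT part of the performance package.
# Wraps mgcv::k.check() and returns a tidy data frame with a flag per smooth term.
check_k <- function(model, alpha = 0.05, ...) {
  stopifnot(inherits(model, "gam"))

  # k.check() returns a matrix with one row per smooth term,
  # or NULL if the model has no smooth terms
  out <- mgcv::k.check(model, ...)
  if (is.null(out)) {
    return(NULL)
  }

  res <- data.frame(
    Term = rownames(out),
    k_prime = as.numeric(out[, "k'"]),
    edf = as.numeric(out[, "edf"]),
    k_index = as.numeric(out[, "k-index"]),
    p = as.numeric(out[, "p-value"]),
    row.names = NULL,
    stringsAsFactors = FALSE
  )

  # Assumed rule: flag a term when the p-value is low and the k-index is below 1
  res$k_too_low <- !is.na(res$p) & res$p < alpha & res$k_index < 1
  res
}

m1 <- mgcv::gam(Sepal.Length ~ s(Petal.Length, k = 3), data = iris)
m2 <- mgcv::gam(Sepal.Length ~ s(Petal.Length, k = 10), data = iris)

check_k(m1)
check_k(m2)
```

Returning a data frame rather than printing would make it easier to plug into check_model() later, but the threshold and the decision rule would need discussion.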

DominiqueMakowski commented 2 years ago

For brms, it would basically mean implementing this: https://stackoverflow.com/questions/70704912/mgcvk-check-equivalent-for-gam-in-brms