stan-dev / projpred

Projection predictive variable selection
https://mc-stan.org/projpred/

For a stan_glm model cv_varsel with loo works, but kfold gives an error #460

Closed: avehtari closed this issue 9 months ago

avehtari commented 9 months ago

Using projpred_2.7.0 and running the case study https://avehtari.github.io/modelselection/winequality-red.html works with

fitg_cv <- cv_varsel(fitg, method='forward', cv_method='loo')

but with

fitg_cv <- cv_varsel(fitg, method='forward', cv_method='kfold')

there is an error

Fitting model 1 out of 5
Error in as.character(x) : 
  cannot coerce type 'closure' to vector of type 'character'

fweber144 commented 9 months ago

Thanks, I'll take a look at this

fweber144 commented 9 months ago

This seems to be an rstanarm issue: Line https://github.com/stan-dev/rstanarm/blob/92a877c983a5e7f3817055338d2a81defab41ad0/R/loo-kfold.R#L186 throws the error for k = 1. I'll try to figure out what exactly is going wrong in rstanarm.

fweber144 commented 9 months ago

Ok, I think I found the cause of this issue: The case study's rstanarm::stan_glm() call that creates the fitg object passes an object called formula to the argument formula (see lines https://github.com/avehtari/modelselection/blob/ad0e10ce7d3a47de560609db7abe83777eed28bc/winequality-red.Rmd#L67-L68). As a result, rstanarm uses formula as a symbol (essentially an unevaluated R expression consisting of just an object name) for the model's formula when refitting the model K times within rstanarm:::kfold.stanreg(). This happens at least in version 2.26.1, which I am using; the behavior has probably changed compared to the rstanarm version you were using when you wrote the original case study.

Since the stats::formula() function exists (it is the closure mentioned in the error message), the formula symbol used internally by rstanarm seems to resolve to stats::formula(), not to the formula object you are using for the original rstanarm::stan_glm() call.

So a solution would be to rename the object formula created in line https://github.com/avehtari/modelselection/blob/ad0e10ce7d3a47de560609db7abe83777eed28bc/winequality-red.Rmd#L67 to something other than formula (e.g., formula_object) and to use that new name in lines https://github.com/avehtari/modelselection/blob/ad0e10ce7d3a47de560609db7abe83777eed28bc/winequality-red.Rmd#L68 and https://github.com/avehtari/modelselection/blob/ad0e10ce7d3a47de560609db7abe83777eed28bc/winequality-red.Rmd#L138.
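To illustrate the kind of name clash described here, the following base-R sketch shows how a stored symbol named formula can resolve to the stats::formula() closure, while a symbol named formula_object resolves to the user's object. This is not rstanarm's code: the environments user_env, imports_env and pkg_env are hypothetical stand-ins, and the assumption that a package sees stats functions before the user's workspace is only one plausible way such shadowing can arise.

```r
# Hypothetical stand-in for the user's workspace.
user_env <- new.env(parent = baseenv())
user_env$formula <- quality ~ alcohol
user_env$formula_object <- quality ~ alcohol

# Hypothetical stand-in for a package's view of the world (an assumption):
# functions taken from stats are found before anything in the user's workspace.
imports_env <- new.env(parent = user_env)
imports_env$formula <- stats::formula
pkg_env <- new.env(parent = imports_env)

# A stored call keeps its argument as an unevaluated symbol, not as a value.
sym_clashing <- quote(formula)
sym_renamed <- quote(formula_object)

# Re-evaluating the symbols from "inside the package":
resolved_clashing <- eval(sym_clashing, envir = pkg_env)
resolved_renamed <- eval(sym_renamed, envir = pkg_env)

# The clashing name resolves to a function, the renamed one to the formula.
print(is.function(resolved_clashing))
print(inherits(resolved_renamed, "formula"))

# Coercing the function to character fails; capture the message for inspection.
err <- tryCatch(
  as.character(resolved_clashing),
  error = function(e) conditionMessage(e)
)
print(err)
```

In this sketch, the renamed symbol works only because no object called formula_object exists in the environments searched before the user's workspace, which is the same reasoning behind the suggested rename.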

Btw, I also had to prepend ggplot2:: to theme_set() in line https://github.com/avehtari/modelselection/blob/ad0e10ce7d3a47de560609db7abe83777eed28bc/winequality-red.Rmd#L30; otherwise, theme_set() couldn't be found.

avehtari commented 9 months ago

Thanks!