stan-dev / projpred

Projection predictive variable selection
https://mc-stan.org/projpred/
Other
110 stars 26 forks source link

Error from multivariate brms models #437

Closed hrlai closed 1 year ago

hrlai commented 1 year ago

Hi again, here to report that cv_varsel doesn't work with multivariate brms models that are specified using the mvbind syntax. For example,

library(brms)
library(projpred)

data("df_gaussian", package = "projpred")
dat_gauss <- data.frame(y = df_gaussian$y, df_gaussian$x)

refm_fit <- brm(
    mvbind(y, X1) ~ X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X10 + X11 + X12 + X13 + X14 +
        X15 + X16 + X17 + X18 + X19 + X20,
    family = gaussian(),
    data = dat_gauss,
    ### Only for the sake of speed (not recommended in general):
    chains = 2, iter = 1000
)

cvvs_fast <- cv_varsel(
    refm_fit,
    validate_search = FALSE,
    nterms_max = 10
)

returns the error

Error in if (family$family == "bernoulli") { : argument is of length zero

which may be due to the family stored in the model now being a list object?

> refm_fit$family
$y

Family: gaussian 
Link function: identity 

$X1

Family: gaussian 
Link function: identity 
fweber144 commented 1 year ago

Hi @hrlai,

Thank you for creating this issue. Multivariate models are not supported yet in projpred. Nevertheless, the error message should be more informative. I thought that https://github.com/paul-buerkner/brms/blob/5c09251daabd5416e3d47004cc6c62816dc53cfa/R/projpred.R#L64 would catch such a case, but it seems it doesn't. @paul-buerkner: Would it make more sense to adapt brms:::validate_resp() or brms:::get_refmodel.brmsfit()?

paul-buerkner commented 1 year ago

I agree the error is confusing. I wonder why it is not picked up by validate_resp in the first place. @fweber144 do you see why?

fweber144 commented 1 year ago

I'm not sure if validate_resp() is intended to catch such a case where resp is NULL. If it is, then the bug might be that in https://github.com/paul-buerkner/brms/blob/97a584aef02d77ca3b2d81bc03050c3724754d40/R/brmsfit-helpers.R#L709-L711, argument multiple is ignored. Something like https://github.com/paul-buerkner/brms/blob/97a584aef02d77ca3b2d81bc03050c3724754d40/R/brmsfit-helpers.R#L706-L708 might then be necessary there as well (so effectively, https://github.com/paul-buerkner/brms/blob/97a584aef02d77ca3b2d81bc03050c3724754d40/R/brmsfit-helpers.R#L706-L708 would then have to be moved out of the if expression to the second-to-last line, i.e., directly before the return statement).

paul-buerkner commented 1 year ago

Okay, I will take a look. Thank you!

paul-buerkner commented 1 year ago

now fixed in github

fweber144 commented 1 year ago

Great, thank you!