stan-dev / projpred

Projection predictive variable selection
https://mc-stan.org/projpred/
Other
110 stars 26 forks source link

Model fit with custom distribution family? #418

Closed lauterbur closed 1 year ago

lauterbur commented 1 year ago

Hello,

Is it possible to use projpred on a model with a custom distribution family? I am trying to run some analyses using projpred on brms fit models using a custom family, but getting the following error:

Error in get(family$family, mode = "function") : 
  object 'custom' of mode 'function' was not found

I've set up the custom family in brms as follows:

skew_student_t <- custom_family(
  name = "skew_student_t", dpars = c("mu", "sigma", "nu", "alpha"),
  links = c("identity", "log", "logm1", "identity"), 
  lb = c(NA, 0, 1, NA),
  type = "real")

StanFuncs <- "

    real skew_student_t_lpdf(real y, real mu, real sigma, real nu, real alpha) {

    return log(2) + student_t_lpdf(y | nu, mu, sigma) + student_t_lcdf((y - mu)/sigma * alpha | nu,0, 1);

    }

    real skew_student_t_rng(real mu, real nu, real sigma, real alpha) {

    real z = skew_normal_rng(0, sigma, alpha);
    real v = chi_square_rng(nu)/nu;  
    real y = mu + z/sqrt(v);

    return y;

    }"
StanVars <- stanvar(scode = StanFuncs, block = "functions")

posterior_predict_skew_student_t <- function(i, prep, ...) {
  mu <- brms::get_dpar(prep, "mu", i = i)
  sigma <- brms::get_dpar(prep, "sigma", i = i)
  nu <- brms::get_dpar(prep, "nu", i = i)
  alpha <- brms::get_dpar(prep, "alpha", i = i)

  skew_student_t_rng(mu, nu, sigma, alpha)
}
log_lik_skew_student_t <- function(i, prep) {
  mu <- brms::get_dpar(prep, "mu", i = i)
  sigma <- brms::get_dpar(prep, "sigma", i = i)
  nu <- brms::get_dpar(prep, "nu", i = i)
  alpha <- brms::get_dpar(prep, "alpha", i = i)

  y <- prep$data$Y[i]
  skew_student_t_lpdf(y, mu, sigma, nu, alpha)
}
posterior_epred_skew_student_t <- function(prep) {
  if (!is.null(prep$ac$lagsar)) {
    prep$dpars$mu <- posterior_epred_lagsar(prep)
  }
  prep$dpars$mu
} 

And a test model using the example from get_refmodel.brmsfit:

testfit<- brm(count ~ zAge + zBase * Trt,
                            data = epilepsy, 
                            family = skew_student_t, 
                            stanvars=StanVars, 
                            cores=4)
expose_functions(testfit, vectorize = TRUE)

And this error occurs when using get_refmodel

get_refmodel(testfit)
Error in get(family$family, mode = "function") : 
  object 'custom' of mode 'function' was not found

Is there a way to get this to work with a custom family?

lauterbur commented 1 year ago

I see that this is actually more complicated under the hood than I realized and that implementing different families (at least for projections) requires careful thought (eg. https://github.com/stan-dev/projpred/issues/361), so this is likely to be impractical.

fweber144 commented 1 year ago

I see that this is actually more complicated under the hood than I realized and that implementing different families (at least for projections) requires careful thought (eg. #361), so this is likely to be impractical.

Yes, you're right. Out of the box, projpred only supports the families listed in https://mc-stan.org/projpred/articles/projpred.html#modtypes. However, the latent projection (vignette: https://mc-stan.org/projpred/articles/latent.html) can be used for more families and should also be applicable to custom families, but this is just a quick guess (I haven't tried it out). I'm closing this issue for now, but if you run into problems with the latent projection for your custom family, feel free to re-open.