paul-buerkner / brms

brms R package for Bayesian generalized multivariate non-linear multilevel models using Stan
https://paul-buerkner.github.io/brms/
GNU General Public License v2.0
1.27k stars 183 forks source link

Weibull distribution with proportional hazards parametrisation #1677

Open wds15 opened 1 month ago

wds15 commented 1 month ago

It would be great to have the Weibull distribution in a proportional hazards (PH) formulation in addition to the existing one which is an accelerated failure time model (when these models are used for time to event problems). PH is a very common model in Biostatistics and it would be great to have Weibull in both flavours.

Note that in this parametrisation the sign convention of the intercept flips wrt to the current Weibull - thus, this may need consideration.

Here is an example implementation (maybe one can add to the Weibull family boolean flag or something similar):

#' definition of weibull custom brms family using the proportional
#' hazard formulation as in Ibrahim, Chen, and Sinha (2001) in their
#' book
#' ["Bayesian Survival Analysis"](https://doi.org/10.1007/978-1-4757-3447-8)
#' (p. 14)
#' $$f(t) = \gamma t^{\alpha-1} \exp(-\gamma t^\alpha) $$
#'
#' This parametrization relates to the Stan Weibull density definition
#' with the transformation of $$\sigma := \gamma^{-1 / \alpha}$$. This
#' form is the same as used in the R versions for the weibull
#' distribution as implemented in the stats package.

#'
#' Code here has been first written by Bjoern Holzhauer and was
#' extended by Sebastian Weber to integrate more tightly with brms.
#' 

family_weibull_ph <- function(link_gamma="log", link_alpha="log") {
    brms::custom_family( 
              name = "weibull_ph",
              dpars = c("mu", "alpha"), # first param needs to be "mu" cannot be "gamma"; alpha is the shape
              links = c(link_gamma, link_alpha),
              lb = c(0, 0), 
              # ub = c(NA, NA), # would be redundant
              type = "real", # no need for `vars` like for `cens`, brms can handle this
              loop=TRUE
          )
}

sv_weibull_ph <- brms::stanvar(name="weibull_ph_stan_code",
                               scode = "
real weibull_ph_lpdf(real y, real mu, real alpha) {
  //real sigma = pow(1 / mu, 1 / alpha);
  real sigma = pow(mu, -1 * inv( alpha ));
  return weibull_lpdf(y | alpha, sigma);
}
real weibull_ph_lccdf(real y, real mu, real alpha) {
  real sigma = pow(mu, -1 * inv( alpha ));
  return weibull_lccdf(y | alpha, sigma);
}
real weibull_ph_lcdf(real y, real mu, real alpha) {
  real sigma = pow(mu, -1 * inv( alpha ));
  return weibull_lcdf(y | alpha, sigma);
}
real weibull_ph_rng(real mu, real alpha) {
  real sigma = pow(mu, -1 * inv( alpha ));
  return weibull_rng(alpha, sigma);
}
", block="functions")

## R definitions of auxilary helper functions of brms, these are based
## on the respective weibull (internal) brms implementations

log_lik_weibull_ph <- function(i, prep) {
    shape <- get_dpar(prep, "alpha", i = i)
    sigma <- get_dpar(prep, "mu", i = i)^(-1/shape)
    args <- list(shape = shape, scale = sigma)
    out <- brms:::log_lik_censor(dist = "weibull", args = args, i = i, 
        prep = prep)
    out <- brms:::log_lik_truncate(out, cdf = pweibull, args = args, 
        i = i, prep = prep)
    brms:::log_lik_weight(out, i = i, prep = prep)
}

posterior_predict_weibull_ph <- function(i, prep, ntrys = 5, ...) {
    shape <- get_dpar(prep, "alpha", i = i)
    sigma <- get_dpar(prep, "mu", i = i)^(-1/shape)
    brms:::rcontinuous(n = prep$ndraws, dist = "weibull", shape = shape, 
        scale = sigma, lb = prep$data$lb[i], ub = prep$data$ub[i], 
        ntrys = ntrys)    
}

posterior_epred_weibull_ph <- function(prep) {
    shape <- get_dpar(prep, "alpha")
    sigma <- get_dpar(prep, "mu")^(-1/shape)
    sigma * gamma(1 + 1/shape)
}

#'
#' example use
#' 
if(FALSE) {

    library(brms)
    ## quick test of basic functionality
    ovar <- survival::ovarian
    brmfit1 <- brm(bf(futime | cens(1-fustat) ~ 1,
                  alpha ~ 1,
                  family=family_weibull_ph()),
               data=ovar,
               prior = prior(class=Intercept, dpar="alpha", normal(0, 1)) + 
                 prior(class=Intercept, normal(-7.5, 10)),
               warmup = 1000,
               iter=5000,
               control = list(adapt_delta=0.99),
               stanvars = sv_weibull_ph,
               backend="cmdstanr",
               seed=456758,
               refresh=250)

    brmfit2 <- brm(bf(futime | cens(1-fustat) ~ 1,
                  shape ~ 1,
                  family=brmsfamily("weibull")),
               data=ovar,
               prior = prior(class=Intercept, dpar="shape", normal(0, 1)) + 
                 prior(class=Intercept, normal(7.5, 10)),
               warmup = 1000,
               iter=5000,
               control = list(adapt_delta=0.99),
               backend="cmdstanr",
               seed=456758,
               refresh=250)

    library(bayesplot)

    pp1 <- posterior_predict(brmfit1)
    pp2 <- posterior_predict(brmfit2)

    with(ovar, ppc_km_overlay(y=futime, yrep=pp1[1:100,], status_y=fustat)) + coord_cartesian(xlim=c(0, 1500))
    with(ovar, ppc_km_overlay(y=futime, yrep=pp2[1:100,], status_y=fustat)) + coord_cartesian(xlim=c(0, 1500))

    ## triggers call of log_lik
    loo1 <- loo(brmfit1)
    loo2 <- loo(brmfit2)
    loo::loo_compare(loo1, loo2)

    ep1 <- posterior_epred(brmfit1)
    rvar(ep1)
    ep2 <- posterior_epred(brmfit2)
    rvar(ep2)
}
paul-buerkner commented 1 month ago

Thank you! I will take a look at it.

bjoernholzhauer commented 1 week ago

In pharma a proportional hazards parameterization is the one used all the time, because people are used to having the effect coefficient for treatment being (log-)hazard ratios. So would be really good to have that as an option. There's also already another request.