adibender / pammtools

Piece-wise exponential Additive Mixed Modeling tools
https://adibender.github.io/pammtools/

Competing Risks: Wrap glms/gams? #107

Closed pkopper closed 4 years ago

pkopper commented 5 years ago

To facilitate competing risks modeling, we may need to change how PEMs/PAMMs are currently fitted. We will most likely need to stratify the data and model each competing risk separately. This means that a single call to glm / gam will no longer suffice. Hence, we may have to create new (S3) classes and methods to keep the same front end as the current one.

fabian-s commented 5 years ago

see also remarks at #78, #88.

adibender commented 5 years ago

But is it just a data transformation issue, after which glm/gam can be called directly, or do we need to call multiple glms/gams, etc.? It would be good to have one simple example analysis that shows the general workflow before we decide how to implement it.

pkopper commented 5 years ago

I think if we want to fit GAMs/GLMs in continuous time for competing risks, we need to fit the (generalised additive) Poisson regression separately for each risk. I think this implies that we will not be able to use gam() / glm() as we used to, unless we (explicitly) include an interaction between every covariate in the model and the risk. This, however, would require either a very long input to gam() / glm() or a modification of these functions.

I will present an example in the next few days to illustrate this.
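
To make the two options concrete, here is a minimal sketch in base R on simulated data. The column names (`interval`, `x`, `offset`, `ped_status`) and the stacking step are assumptions for illustration, not existing pammtools functionality. It fits one Poisson GLM per cause and, alternatively, a single GLM on stacked data in which every term interacts with the cause.

```r
# Toy piece-wise exponential data: one row per subject and interval.
# All column names are illustrative assumptions.
set.seed(1)
n <- 400
ped <- data.frame(
  interval = factor(sample(1:3, n, replace = TRUE)),
  x        = rnorm(n),
  offset   = log(runif(n, 0.5, 1))  # log time at risk in the interval
)
# 0 = no event in the interval, 1 / 2 = event of cause 1 / cause 2
ped$ped_status <- sample(0:2, n, replace = TRUE, prob = c(0.6, 0.25, 0.15))

# Option A: one Poisson GLM per cause (other causes count as censored)
fit_by_cause <- lapply(1:2, function(k) {
  d <- ped
  d$ped_status <- as.integer(d$ped_status == k)
  glm(ped_status ~ interval + x + offset(offset), family = poisson(), data = d)
})

# Option B: stack one copy of the data per cause and let every term
# interact with the cause indicator
stacked <- do.call(rbind, lapply(1:2, function(k) {
  d <- ped
  d$cause  <- k
  d$status <- as.integer(d$ped_status == k)
  d
}))
stacked$cause <- factor(stacked$cause)
fit_joint <- glm(status ~ cause * (interval + x) + offset(offset),
                 family = poisson(), data = stacked)

# Effect of x on cause 2 under both options
coef(fit_by_cause[[2]])["x"]
coef(fit_joint)["x"] + coef(fit_joint)["cause2:x"]
```

Because the fully interacted model has a separate set of parameters for each cause, both options should give the same estimates up to numerical tolerance. The difference is mainly in the interface: option B needs the long interaction formula, option A needs a wrapper that loops over the causes.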

pkopper commented 5 years ago

I prepared a high-level design to illustrate what I was talking about.

#' glm method for ped_cr objects
#' 
#' This function serves as a wrapper for the glm function to be applicable 
#' to competing risks. Competing risks are separately modeled in their own
#' glm. The results can be interpreted synoptically, though, when using
#' the corresponding methods (summary etc.)
#' 
#' @param 
#' ...
#' @return a list of glms - one entry for a single competing risk.

# Note: stats::glm() is not an S3 generic, so this wrapper has to be called
# directly. Further arguments (weights, subset, control, ...) are forwarded
# to glm() via `...`.
glm.ped_cr <- function(formula, family = poisson(), data, ...) {
  # the competing risks are the non-zero values of ped_status (0 = censored)
  crs <- sort(unique(data$ped_status))
  crs <- crs[crs != 0]
  res <- vector(mode = "list", length = length(crs))
  names(res) <- crs
  for (i in seq_along(crs)) {
    # modify_cr_data() (still to be written) is supposed to turn a ped_cr
    # object into a ped object in which we only investigate one of the
    # competing risks
    current_data <- modify_cr_data(data, cr = crs[i])
    res[[i]] <- glm(formula, family = family, data = current_data, ...)
  }
  # class for the methods (summary etc.)
  class(res) <- "pem_cr"
  res
}
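
The helper `modify_cr_data()` is not defined in the design above. A minimal sketch of what it could look like, assuming that `ped_status` holds 0 for censoring and the cause number otherwise (the function body is a guess for illustration, not part of pammtools):

```r
# Hypothetical helper: recode a competing-risks data set so that only events
# of cause `cr` count as events; all other causes are treated as censored.
modify_cr_data <- function(data, cr) {
  data$ped_status <- as.integer(data$ped_status == cr)
  data
}

toy <- data.frame(id = 1:6, ped_status = c(0, 1, 2, 1, 0, 2))
modify_cr_data(toy, cr = 2)$ped_status
# 0 0 1 0 0 1
```
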
adibender commented 5 years ago

If I understand correctly, this will return a list of cause-specific hazard models? How do we proceed from there?

I think it would be helpful to use one simple real or simulated data set and walk through the full analysis, model estimation, interpretation etc. (maybe one of the examples in the Competing risks book) so we can see the full stack of things to implement/think about.

Added benefit: we can immediately see whether the implemented methods produce equivalent/similar results.
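
One possible next step after fitting the cause-specific hazard models would be the cumulative incidence function (CIF) for each cause. A minimal sketch, assuming the hazards are piece-wise constant and already available as a matrix (in practice they would have to be predicted from the fitted models). The function name `cif_pw_const` and the numbers are made up for illustration.

```r
# Cumulative incidence from piece-wise constant cause-specific hazards.
# hazards: matrix with one row per interval and one column per cause
# widths:  interval lengths
cif_pw_const <- function(hazards, widths) {
  total  <- rowSums(hazards)               # all-cause hazard per interval
  surv   <- exp(-cumsum(total * widths))   # overall survival at interval ends
  surv_0 <- c(1, head(surv, -1))           # overall survival at interval starts
  # probability of any event within each interval
  p_any <- surv_0 * (1 - exp(-total * widths))
  # split by cause in proportion to the cause-specific hazards
  share <- hazards / ifelse(total > 0, total, 1)
  apply(share * p_any, 2, cumsum)
}

hazards <- cbind(cause1 = c(0.10, 0.20, 0.15),
                 cause2 = c(0.05, 0.05, 0.10))
widths  <- c(1, 1, 2)
cif <- cif_pw_const(hazards, widths)
cif
```

At the end of every interval, the CIFs of all causes plus the overall survival probability add up to 1, which is a quick sanity check for any implementation.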

pkopper commented 5 years ago

This is pretty much what I had in mind. From here on, we would also have to create specific methods to support the "standard" survival analysis features.
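
As a rough idea of what such a method could look like, here is a sketch of a `coef()` method for the `pem_cr` class from the design above. The method and the toy data are hypothetical. It simply collects the coefficients of the cause-specific models side by side.

```r
# Hypothetical S3 method for a "pem_cr" object (a list of fitted glms)
coef.pem_cr <- function(object, ...) {
  sapply(unclass(object), stats::coef, ...)
}

# Toy object: two Poisson GLMs standing in for two cause-specific models
set.seed(2)
d <- data.frame(x = rnorm(100), y1 = rpois(100, 1), y2 = rpois(100, 2))
fits <- structure(
  list(cause1 = glm(y1 ~ x, family = poisson(), data = d),
       cause2 = glm(y2 ~ x, family = poisson(), data = d)),
  class = "pem_cr"
)
coef(fits)  # matrix: one row per coefficient, one column per cause
```

Methods like summary() or predict() could follow the same pattern: loop over the list and combine the per-cause results.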

Ok, will do!

adibender commented 4 years ago

Closed via #156