Small bugs in survHE - Githubissues

Philip-Cooney commented 2 years ago

I just noticed three small bugs in the package that can be fixed with the code below:

If you try and print out the Gamma model it will not work because of a typo.

rescale_stats_hmc_gam <- function (table, x){
  rate <- matrix(table[grep("rate", rownames(table)),], ncol = 4)
  rownames(rate) <- "rate"
  shape <- matrix(table[grep("alpha", rownames(table)), 
  ], ncol = 4)
  rownames(shape) <- "shape"
  effects = add_effects_hmc(table, x) #typo in this function
  res <- rbind(shape, rate, effects)
  if (is.null(dim(res))) {
    names(res) <- c("mean", "se", "L95%", 
                    "U95%")
  }
  else {
    colnames(res) <- c("mean", "se", "L95%", 
                       "U95%")
  }
  return(res)
}

tmpfun <- get("rescale_stats_hmc_gam", envir = asNamespace("survHE"))
environment(rescale_stats_hmc_gam) <- environment(tmpfun)
attributes(rescale_stats_hmc_gam) <- attributes(tmpfun)  
assignInNamespace("rescale_stats_hmc_gam", rescale_stats_hmc_gam, ns="survHE")

Secondly the Royston-Parmar model does not display correctly when no covariates are included for the reason indicated in the code.

get_stats_hmc <- function (x, mod){

    table = rstan::summary(x$models[[mod]])$summary[, c("mean", 
                                                        "sd", "2.5%", "97.5%")]
    table = table[-grep("lp__", rownames(table)), ]

  if ("X_obs" %in% names(x$misc$data.stan[[1]])) {
    if (any(apply(x$misc$data.stan[[1]]$X_obs, 2, function(x) all(x == 0)))) {
      #Error (mod instead of 1)
      #for RPS X matrix can be 0 for both columns
      beta_drop <- which(apply(x$misc$data.stan[[mod]]$X, 2, function(x) all(x == 0)) == TRUE)
      beta_drop <- paste0("beta\\[", beta_drop,"\\]")
      if(length(beta_drop)>1){
        beta_drop <- paste(beta_drop, collapse = "|")
      }
      table <-  table[-grep(beta_drop, rownames(table)), ]
    }
  }
  else {
    if (any(apply(x$misc$data.stan[[1]]$X, 2, function(x) all(x == 0)))) {
      beta_drop <- which(apply(x$misc$data.stan[[mod]]$X, 2, function(x) all(x == 0)) == TRUE)
      beta_drop <- paste0("beta\\[", beta_drop,"\\]")
      if(length(beta_drop)>1){
        beta_drop <- paste(beta_drop, collapse = "|")
      }
       table <-table[-grep(beta_drop, rownames(table)), ]
    }
  }
  res = do.call(paste0("rescale_stats_hmc_", x$misc$model_name[mod]), 
                args = list(table = table, x = x))
  return(res)
}

tmpfun <- get("get_stats_hmc", envir = asNamespace("survHE"))
environment(get_stats_hmc) <- environment(tmpfun)
attributes(get_stats_hmc) <- attributes(tmpfun)  
assignInNamespace("get_stats_hmc", get_stats_hmc, ns="survHE")

Another issue with the Royston-Parmar with no covariates is that it will not plot the survival due to an error in this code:

make_sim_hmc <- function (m, t, X, nsim, newdata, dist, summary_stat, ...){

    iter_stan <- m@stan_args[[1]]$iter
    beta = rstan::extract(m)$beta

  if (ncol(X) == 1) {
    beta = beta[, 1, drop = F] #PC: add drop to stop it coercing to a vector
  }
  if (dist == "rps" & any(grepl("Intercept", colnames(X)))) {
    X <- as.matrix(as_tibble(X) %>% select(-`(Intercept)`))
    beta = beta[, -ncol(beta)]
  }
  linpred <- beta %*% t(X)
  sim <- lapply(1:nrow(X), function(x) {
    do.call(paste0("rescale_hmc_", dist), args = list(m, 
                                                      X, linpred[, x]))
  })
  if (nsim > iter_stan) {
    stop("Please select a value for 'nsim' that is less than or equal to the value set in the call to 'fit.models'")
  }
  if (nsim == 1) {
    sim <- lapply(sim, function(x) as.matrix(as_tibble(x) %>% 
                                               summarise_all(summary_stat), nrow = 1, ncol = ncol(x)))
  }
  if (nsim > 1 & nsim < iter_stan) {
    sim <- lapply(sim, function(x) as.matrix(as_tibble(x) %>% 
                                               sample_n(nsim, replace = FALSE), nrow = nsim, ncol = ncol(x)))
  }
  return(sim)
}

giabaio commented 2 years ago

Thank you, @Philip-Cooney. Which version of survHE are you on? Also, to simplify/speed up things, can you point out more directly to what issues you have encountered (and what error message has been thrown out)? I'll look into these ASAP, but it'd help to focus on exactly what is wrong and your suggested solution...

Thanks

Philip-Cooney commented 2 years ago

Hi @giabaio. I am on version 1.1.2 (CRAN).

Hopefully this makes things clearer.

Let me know if you require any further clarifications.

Philip

packageVersion("survHE")
#‘1.1.2’

mods <- c( "gamma",   "rps")
formula <- Surv(time, censored) ~ 1
m1 <- fit.models(formula = formula, data = data, distr = mods, k = 1, method = "hmc")

#Issue 1: Not an error actually; gamma doesn't display correctly; 3 rows instead of 2
print(m1, mod = 1)

#Issue 2: Not an error actually; rps doesn't display correctly; 4 rows instead of 3
print(m1, mod = 2)
#compare with flexsurvreg
flexsurv::flexsurvspline(formula, data = data, k = 1)

#issue 3: Actual error "Error in -ncol(beta) : invalid argument to unary operator"
plot(m1)

giabaio commented 2 years ago

Thank you, @Philip-Cooney. I think these have been fixed now in the devel version. Note that there's been a major restructuring of the package, so that now, the CRAN version of survHE is in fact split in three packages. You can get the main structure and all the basic functions by installing the devel branch

remotes::install_github("giabaio/survHE",ref="devel")

This basically loads up the main frequentist facilities (based on flexsurv) and the functions that allows the computation to be extended to either INLA or HMC. However, in order to be able to run Bayesian models, you need to separately install either or both the two other branches (ref="hmc" and ref="inla"), which you can do using, for instance

remotes::install_github("giabaio/survHE",ref="hmc")

Installing the hmc branch is a lot longer than the simplest devel (and, in fact, the inla) branch, so this helps streamline and simplify the package.

If you install devel and hmc, the code below

library(survHE)
mods <- c( "gamma",   "rps")
formula <- Surv(time, censored) ~ 1
m1 <- fit.models(formula = formula, data = data, distr = mods, k = 1, method = "hmc")

loads up in the background the hmc module (which contains the functions to run Stan in the background). Then

print(m1,mod=1)

gives

Model fit for the Gamma model, obtained using Stan (Bayesian inference via 
Hamiltonian Monte Carlo). Running time: 6.5852 seconds

         mean        se     L95%     U95%
shape 2.30141 0.2234237 1.890047 2.769696
rate  0.20283 0.0263161 0.155552 0.258184

Model fitting summaries
Akaike Information Criterion (AIC)....: 1218.476
Bayesian Information Criterion (BIC)..: 1230.192
Deviance Information Criterion (DIC)..: 1216.601

and

print(m1,mod=2)

gives

Model fit for the Royston & Parmar splines model, obtained using Stan (Bayesian inference via 
Hamiltonian Monte Carlo). Running time: 35.426 seconds

             mean        se      L95%      U95%
gamma0 -4.7372313 0.4008544 -5.568327 -3.996486
gamma1  2.1074918 0.3412200  1.483785  2.799519
gamma2  0.0550641 0.0519523 -0.043094  0.154634

Model fitting summaries
Akaike Information Criterion (AIC)....: 1218.381
Bayesian Information Criterion (BIC)..: 1230.097
Deviance Information Criterion (DIC)..: 1238.130

--- so both work now. If you also do

plot(m1)

you get the correct graph so all these issues should have been fixed now!

Thanks for pointing them out and please do shout if anything is unclear/you find other bugs!

Philip-Cooney commented 2 years ago

Great, Thank for the clarification @giabaio. Philip

giabaio / survHE

Small bugs in survHE #45