Genentech / psborrow2

psborrow2: Bayesian Dynamic Borrowing Simulation Study and Analysis
https://genentech.github.io/psborrow2/

piecewise exponential #296

Open gravesti opened 4 months ago

gravesti commented 4 months ago

Proof of concept for an implementation of piecewise constant (exponential) hazard.

Had to do some messy things in make_model_string_parameters.R, so that needs some thinking.


# Non-Bayesian piecewise model for reference
library(eha)
pem_eha <- eha::pchreg(
  Surv(time, status) ~ trt + cov1 + cov2,
  data = as.data.frame(psborrow2::example_matrix),
  cuts = c(0, 10, 20, 30, 40, 100)
)
summary(pem_eha)
pem_eha$convergence  # optimiser convergence code
pem_eha$hazards      # estimated hazard in each interval

# Piecewise exponential with no borrowing
library(psborrow2)
pem_psb2 <- create_analysis_obj(
  data_matrix = psborrow2::example_matrix,
  outcome = outcome_surv_piecewise_exp(
    time_var = "time", cens_var = "cnsr", start_times = seq(0, 40, by = 10),
    baseline_prior = prior_normal(0, 1000)
  ),
  borrowing = borrowing_full(ext_flag_col = "ext"),
  covariates = add_covariates(c("cov1", "cov2"), priors = prior_normal(0, 1000)),
  treatment = treatment_details("trt", prior_normal(0, 1000))
)

samples <- psborrow2::mcmc_sample(pem_psb2, parallel_chains = 4, verbose = TRUE, iter_warmup = 5000)

## results seem to match
samples
log(pem_eha$hazards)
summary(pem_eha)
exp(coef(pem_eha))
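
## A hedged sketch of a side-by-side check. It assumes mcmc_sample() returns
## a cmdstanr CmdStanMCMC object (as the calls above suggest); parameter
## names in the generated model may differ.
post <- samples$summary()
post[, c("variable", "median", "q5", "q95")]
## compare the baseline rows to log(pem_eha$hazards) and the treatment
## coefficient to coef(pem_eha)["trt"]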

# Commensurate borrowing model
pem_psb3 <- create_analysis_obj(
  data_matrix = psborrow2::example_matrix,
  outcome = outcome_surv_piecewise_exp(
    time_var = "time", cens_var = "cnsr", start_times = seq(0, 40, by = 10),
    baseline_prior = prior_normal(0, 1000)
  ),
  borrowing = borrowing_hierarchical_commensurate(ext_flag_col = "ext", tau_prior = prior_gamma(0.0001, 0.0001)),
  covariates = add_covariates(c("cov1", "cov2"), priors = prior_normal(0, 1000)),
  treatment = treatment_details("trt", prior_normal(0, 1000))
)

samples3 <- mcmc_sample(pem_psb3, parallel_chains = 4, verbose = TRUE, iter_warmup = 5000)

print(samples3, max_rows = 20)

# Compare against a simple exponential model with commensurate borrowing
exp_psb3 <- create_analysis_obj(
  data_matrix = psborrow2::example_matrix,
  outcome = outcome_surv_exponential(time_var = "time", cens_var = "cnsr", baseline_prior = prior_normal(0, 1000)),
  borrowing = borrowing_hierarchical_commensurate(ext_flag_col = "ext", tau_prior = prior_gamma(0.0001, 0.0001)),
  covariates = add_covariates(c("cov1", "cov2"), priors = prior_normal(0, 1000)),
  treatment = treatment_details("trt", prior_normal(0, 1000))
)
samples4 <- mcmc_sample(exp_psb3, parallel_chains = 4, verbose = TRUE, iter_warmup = 5000)

print(samples3, max_rows = 20)
samples4

Need to think about adding an option for a weight_matrix. My naive thought is that it can be added directly in the likelihood line if W is an N x M matrix, i.e.

target += sum((lp .* D .* W) - (exp(lp) .* T .* W));
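
To spell out what that line computes, here is a minimal R sketch (purely illustrative, not package code), where the argument names mirror the Stan line: lp is the N x M matrix of log-hazards per patient and period, D an event-indicator matrix, T the exposure time in each period, and W the proposed weight matrix:

# Weighted piecewise-exponential log-likelihood: event indicators pick up
# the log-hazard, exposure picks up the cumulative hazard, both scaled by W
piecewise_loglik <- function(lp, D, T, W) {
  sum((lp * D * W) - (exp(lp) * T * W))
}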
mattsecrest commented 4 months ago

@gravesti this looks awesome and it runs and I agree it looks similar. The code is quite clever and I am going to need a few days for a proper review as I do my matrix multiplication! Some questions in the meantime:

  1. I was going to do more of this in R but your make_durations is probably a better solution, though I would be curious to benchmark. Do you trust glue with floats? Have not dug into it yet but I suspect it is better to pass floats as data to cmdstanr than via glue for vector[M] starts = [0.00123, ... 40.99199]. Does glue have a global option for decimal separators, for instance? (See the sketch after this list.)
  2. Maybe easier to read if you declare row_vector[M] starts = [1,2,3,4,5] instead of using the transpose ' operator
  3. It occurs to me that it is not problematic to have vectors with 0 events b/c it's a fully parametric model and these parameters have priors and are not referenced in any likelihood increment. Do you agree? Obvi this is an early draft but some other thoughts for later: 1) not super clear how events outside of range of cutpoints will be handled; 2) should we allow different priors within different cutpoints? 3) weights should be accepted as matrix (1-row-per-pt-per-period) or R vector (1-row-per-pt), right?
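
On the glue question, a quick illustration (a hedged sketch: make_durations is from this PR, the rest is plain glue/cmdstanr usage, and the exact formatting behaviour is the thing to verify):

library(glue)
starts <- c(0.00123, 10, 20, 30, 40.99199)
# glue() interpolates via R's default numeric-to-character conversion, so
# you get at most 15 significant digits and whatever that conversion does:
glue("row_vector[{length(starts)}] starts = [{toString(starts)}];")
# Passing the cutpoints as data avoids string formatting entirely, e.g.
# mod$sample(data = list(M = length(starts), starts = starts, ...))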
mattsecrest commented 4 months ago

@gravesti just tried out the approach I had in mind with more preprocessing in R. It gave identical results but took about 75% of the sampling time.

data {
   int<lower=0> N;                       // rows in the long (pt-per-period) data
   vector[N] trt;
   vector[N] time;
   vector[N] cens;                       // 1 = censored in this row
   array[N] int ai;                      // which alpha this row's period maps to
   int<lower=0> K;                       // number of covariates
   matrix[N, K] X;
   vector[K] L_beta;                     // bounds on the covariate effects
   vector[K] U_beta;
}

parameters {
   real beta_trt;
   vector<lower=L_beta, upper=U_beta>[K] beta;
   vector[5] alpha;                      // one log-rate per period (5 here)
}

transformed parameters {
  real HR_trt = exp(beta_trt);
}

model {
   beta_trt ~ normal(0, 1000);
   beta[1] ~ normal(0, 1000);
   beta[2] ~ normal(0, 1000);
   alpha ~ normal(0, 1000);
   for (i in 1:N) {
      if (cens[i] == 1) {
         // censored rows contribute the survival probability
         target += exponential_lccdf(time[i] | exp(alpha[ai[i]] + X[i] * beta + trt[i] * beta_trt));
      } else {
         // event rows contribute the density
         target += exponential_lpdf(time[i] | exp(alpha[ai[i]] + X[i] * beta + trt[i] * beta_trt));
      }
   }
}

Here, N is the number of rows of the data in long format (one row per patient per period, such as from survSplit) and ai is the index of the alpha that each observation corresponds to.
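
For reference, a minimal sketch of how that long format could be built (hedged: it relies on the columns of psborrow2::example_matrix, evnt and ai are just the names used in the printout below, and the wide finite bounds are a stand-in for "effectively unbounded"):

library(survival)
df <- as.data.frame(psborrow2::example_matrix)
df$evnt <- 1 - df$cnsr  # survSplit wants an event indicator, not a censoring flag
long <- survSplit(Surv(time, evnt) ~ ., data = df,
                  cut = c(10, 20, 30, 40), episode = "ai", start = "tstart")
stan_data <- list(
  N = nrow(long), trt = long$trt, time = long$time, cens = 1 - long$evnt,
  ai = long$ai, K = 2, X = as.matrix(long[, c("cov1", "cov2")]),
  L_beta = rep(-1000, 2), U_beta = rep(1000, 2)  # wide stand-in bounds
)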

mattsecrest commented 4 months ago
  id ext trt cov4 cov3 cov2 cov1 status cnsr resp tstart      time evnt ai
1  1   0   0    1    1    1    0      1    0    1      0  2.422641    1  1
2  2   0   0    1    1    0    1      0    1    1      0 10.000000    0  1
3  2   0   0    1    1    0    1      0    1    1     10 20.000000    0  2
4  2   0   0    1    1    0    1      0    1    1     20 30.000000    0  3
5  2   0   0    1    1    0    1      0    1    1     30 40.000000    0  4
mattsecrest commented 4 months ago

A bonus is also that I think it would be easier to work into the current Stan code?

gravesti commented 4 months ago

@gravesti this looks awesome and it runs and I agree it looks similar. The code is quite clever and I am going to need a few days for a proper review as I do my matrix multiplication!

A nice guide for the maths: https://grodri.github.io/glms/notes/c7s4

Some questions in the meantime:

  1. I was going to do more of this in R but your make_durations is probably a better solution, though I would be curious to benchmark. Do you trust glue with floats? Have not dug into it yet but I suspect it is better to pass floats as data to cmdstanr than via glue for vector[M] starts = [0.00123, ... 40.99199]. Does glue have a global option for decimal separators, for instance?

Good idea to pass them as data. I guess I was trying to touch as few other functions as possible. I'm sure we can control how we go from numeric to character, either via glue or sprintf.
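
For example (a hedged sketch of one way to do the numeric-to-character step; the format string is a suggestion, not what the PR currently does):

# %.17g keeps enough digits to round-trip a double exactly, and R keeps
# LC_NUMERIC at "C", so the decimal separator stays a period:
sprintf("%.17g", c(0.00123, 40.99199))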

  2. Maybe easier to read if you declare row_vector[M] starts = [1,2,3,4,5] instead of using the transpose ' operator

I don't know why I did it as a vector, so we can try changing it.

  3. It occurs to me that it is not problematic to have vectors with 0 events b/c it's a fully parametric model and these parameters have priors and are not referenced in any likelihood increment. Do you agree?

Sounds right to me.

Obvi this is an early draft but some other thoughts for later: 1) not super clear how events outside of range of cutpoints will be handled;

In my setup we only have to set the start of the periods, so we only miss events which happen too early. I guess in most data we don't expect times less than 0, so we could always have a period starting at 0.

2) should we allow different priors within different cutpoints?

Yeah, I was already thinking about this. We might already have this fairly easily with Stan notation: I think it's possible to specify e.g. a normal prior with a vector of means, so that each alpha could have its own mean. This needs more thought when it comes to the borrowing method as well.

3) weights should be accepted as matrix (1-row-per-pt-per-period) or R vector (1-row-per-pt), right?

In our current setup we specify weights as a column name, so we'd need new machinery for adding the weight matrix.

Re: long format and pre-processing in R: I suppose we'd then need to introduce a slot for an R preprocessing function in one of the objects. I guess it belongs to the outcome object? My idea for doing it all in Stan was to keep the data the same between all models. I don't know if that really saves us much memory, though.
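
A hedged sketch of what such a slot could look like (the class and slot names are purely illustrative, not the package's current S4 classes):

# Purely illustrative: an outcome-like class carrying a preprocessing
# function, e.g. the survSplit-based reshaping sketched above, which would
# be applied to data_matrix before the data are passed to Stan
setClass("OutcomeWithPreprocess", slots = c(preprocess = "function"))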