`posterior_survfit` should return covariate data alongside the predictions

Summary:

For stan_surv models the id column in the data frame returned by posterior_survfit() represents the row of the covariate data that the predictions "belong" to. But the id column can be confusing, especially when the name id was used for one of the variables (e.g. the grouping factor) in the model itself.

The more explicit thing might be to return all of the covariate data used in the predictions as part of the data returned by posterior_survfit; that way the user knows exactly the covariate values that the row of predictions belongs to. Rather than having to infer it based on some "id" (i.e. row identifier) column.

This is essentially what tripped up one user here.

Reproducible Steps:

Here we fit a model with a grouping factor called id with values 5:10, then we predict survival probabilities for the groups that had id values 5 and 6

library(rstanarm)

data <- data.frame(
    trt = rep(c("A", "B"), each = 15),
    id = rep(5:10, each = 5),
    status = sample(0:1, 30, replace = TRUE),
    days = sample(10, 30, replace = TRUE))

mod = stan_surv(
    formula = Surv(days, status) ~ trt + (1 | id),
    data = data,
    basehaz = "weibull",
    chains = 1,
    iter = 20,
    cores = 1)

posterior_survfit(
    mod, 
    type = "surv",
    newdata = data[1:10,],
    times = 1,
    extrapolate = FALSE)

which returns:

stan_surv predictions
 num. individuals: 10 
 prediction type:  event free probability 
 standardised?:    no 
 conditional?:     no 

   id cond_time   time median  ci_lb  ci_ub
1   1        NA 1.0000 0.9876 0.9716 0.9950
2   2        NA 1.0000 0.9876 0.9716 0.9950
3   3        NA 1.0000 0.9876 0.9716 0.9950
4   4        NA 1.0000 0.9876 0.9716 0.9950
5   5        NA 1.0000 0.9876 0.9716 0.9950
6   6        NA 1.0000 0.9852 0.9684 0.9936
7   7        NA 1.0000 0.9852 0.9684 0.9936
8   8        NA 1.0000 0.9852 0.9684 0.9936
9   9        NA 1.0000 0.9852 0.9684 0.9936
10 10        NA 1.0000 0.9852 0.9684 0.9936

but we see that the id column in the predictions has values 1 through 10. This is because it is referring to rows 1 through 10 of the prediction covariate data, not the values for the id variable in the original model! It would be clearer / safer if we just returned all the covariates, including id, and then we wouldn't even need a row identifier.

stan-dev / rstanarm

`posterior_survfit` should return covariate data alongside the predictions #533

Summary:

Reproducible Steps: