should rvar auto-drop dimensions on subsetting?

wds15 commented 2 years ago

Hi!

rvars are a great thing and I have started using them a bit. I have run into issues when using it and there could be a nice solution but let me start with describing my setting. I wanted to transfer code from the transformed parameters block into R in order to eventually handle post-processing of my draws in R instead of Stan. Here are the respective Stan bits:

parameters {
  // the first 1:num_groups parameters are EX modelled while the
  // num_groups+1:2*num_groups are NEX
  vector[2] log_beta_raw[2*num_groups,num_comp];
  vector[num_inter] eta_raw[2*num_groups];

  // hierarchical priors
  vector[2] mu_log_beta[num_comp];
  // for differential discounting we allow the tau's to vary by
  // stratum (but not the means)
  vector<lower=0>[2] tau_log_beta_raw[num_strata,num_comp];
  cholesky_factor_corr[2] L_corr_log_beta[num_comp];

  vector[num_inter] mu_eta;
  vector<lower=0>[num_inter] tau_eta_raw[num_strata];
  cholesky_factor_corr[num_inter] L_corr_eta;
}
transformed parameters {
  vector[2] beta[2*num_groups,num_comp];
  vector[num_inter] eta[2*num_groups];
  vector<lower=0>[2] tau_log_beta[num_strata,num_comp];
  vector<lower=0>[num_inter] tau_eta[num_strata];

  // in the case of fixed tau's we fill them in here
  if (prior_tau_dist == 0) {
    tau_log_beta = prior_EX_tau_mean_comp;
    tau_eta = prior_EX_tau_mean_inter;
  } else {
    tau_log_beta = tau_log_beta_raw;
    tau_eta = tau_eta_raw;
  }

  // EX parameters which vary by stratum which is defined by the group
  for(g in 1:num_groups) {
    int s = group_stratum_cid[g];
    for(j in 1:num_comp) {
      beta[g,j] = mu_log_beta[j] +
        diag_pre_multiply(tau_log_beta[s,j], L_corr_log_beta[j]) * log_beta_raw[g,j];
    }
    if (num_inter > 0)
      eta[g] = mu_eta + diag_pre_multiply(tau_eta[s], L_corr_eta) * eta_raw[g];
  }

  // NEX parameters
  beta[num_groups+1:2*num_groups] = log_beta_raw[num_groups+1:2*num_groups];
  eta[num_groups+1:2*num_groups] = eta_raw[num_groups+1:2*num_groups];

  // exponentiate the slope parameter to force positivity
  for(g in 1:2*num_groups)
    for(j in 1:num_comp)
      beta[g,j,2] = exp(beta[g,j,2]);
}

I ended up with this respective rvar style code:

library(OncoBayes2)
library(posterior)

example_model("combo2")

post <- blrmfit$posterior

spost <- subset_draws(as_draws_rvars(post),
                      variable=c("log_beta_raw", "eta_raw", "mu_log_beta", "tau_log_beta_raw", "L_corr_log_beta", "mu_eta", "tau_eta_raw", "L_corr_eta"))

sdata <- blrmfit$standata

S_chain <- 1E3
num_chains <- 4
S <- S_chain * num_chains

beta <- rvar(array(0, dim=c(S, 2*sdata$num_groups,sdata$num_comp,2)), nchain=num_chains)
eta <- rvar(array(0, dim=c(S, 2*sdata$num_groups)), nchain=num_chains)
tau_log_beta <- rvar(array(0, dim=c(S, sdata$num_strata, sdata$num_comp)), nchain=num_chains)
tau_eta <- rvar(array(0, dim=c(S, sdata$num_strata)), nchain=num_chains)

## in the case of fixed tau's we fill them in here
if (sdata$prior_tau_dist == 0) {
    tau_log_beta = as_rvar(sdata$prior_EX_tau_mean_comp)
    tau_eta = as_rvar(sdata$prior_EX_tau_mean_inter)
} else {
    tau_log_beta = spost$tau_log_beta_raw
    tau_eta = spost$tau_eta_raw
}

## EX parameters which vary by stratum which is defined by the group
for(g in 1:sdata$num_groups) {
    s = sdata$group_stratum_cid[g];
    for(j in 1:sdata$num_comp) {
        beta[g,j,drop=TRUE] = spost$mu_log_beta[j,,drop=TRUE] + (rdo(diag(tau_log_beta[s,j,,drop=TRUE], sdata$num_comp, sdata$num_comp)) %**% spost$L_corr_log_beta[j,,drop=TRUE]) %**% spost$log_beta_raw[g,j,,drop=TRUE]
        ##diag_pre_multiply(tau_log_beta[s,j], L_corr_log_beta[j]) * log_beta_raw[g,j];
    }
    if (num_inter > 0)
        eta[g] = spost$mu_eta + (rdo(diag(tau_eta[s,drop=TRUE], sdata$num_inter, sdata$num_inter)) %**% spost$L_corr_eta) %**% spost$eta_raw[g]
    ##+ diag_pre_multiply(tau_eta[s], L_corr_eta) * eta_raw[g];
}

## NEX parameters
beta[sdata$num_groups+1:2*sdata$num_groups] = spost$log_beta_raw[sdata$num_groups+1:2*sdata$num_groups];
eta[sdata$num_groups+1:2*sdata$num_groups] = spost$eta_raw[sdata$num_groups+1:2*sdata$num_groups];

## exponentiate the slope parameter to force positivity
for(g in 1:2*sdata$num_groups)
    for(j in 1:sdata$num_comp)
        beta[g,j,2] = exp(beta[g,j,2]);

You see that the rvars code is not nice as I have to use many times drop. This is due to the fact that I nest matrices and vectors into arrays... but that structure gets cast into big arrays in rvar. However, when subsetting the leading indices (for the array stuff in Stan) and then using the rvar, the math won't work, since the dimensions do not line up - this is due to it's intended use is after subsetting things first to the array level.

Thus, I was wondering if we could have a as_draws_rvars flavour which creates nested lists from the samples? So all arrays get converted to nested lists and the vectors/matrices become rvars? With that I would not have to deal with drop in such an ugly way.

... or maybe there is a simpler way of doing this?

mjskay commented 2 years ago

You see that the rvars code is not nice as I have to use many times drop. This is due to the fact that I nest matrices and vectors into arrays... but that structure gets cast into big arrays in rvar. However, when subsetting the leading indices (for the array stuff in Stan) and then using the rvar, the math won't work, since the dimensions do not line up - this is due to it's intended use is after subsetting things first to the array level.

Yeah I see, it is kind of awkward. When we originally put the rvar interface together I was following the pattern from the nascent {rray} package of not auto-dropping array dimensions. One question might be if that was the right design decision after all? Maybe that actually makes real use more annoying. Not sure if other folks (@paul-buerkner @jgabry @avehtari) have thoughts about this? It is one of the big differences between rvars and base-R arrays; perhaps we should drop dimensions by default like base R does?

Thus, I was wondering if we could have a as_draws_rvars flavour which creates nested lists from the samples? So all arrays get converted to nested lists and the vectors/matrices become rvars? With that I would not have to deal with drop in such an ugly way.

I could see something like this being useful, even beyond this use case. Ragged arrays come to mind. Perhaps having a way to specifying dimensions to be created as lists when converted. This relates to some other issues: eventually I would like some generic machinery for parsing indices that can be used in other packages (there was some work on it here but it's been awhile and needs updating: https://github.com/stan-dev/posterior/pull/152).

beta[g,j,drop=TRUE] = spost$mu_log_beta[j,,drop=TRUE] + (rdo(diag(tau_log_beta[s,j,,drop=TRUE], sdata$num_comp, sdata$num_comp)) %**% spost$L_corr_log_beta[j,,drop=TRUE]) %**% spost$log_beta_raw[g,j,,drop=TRUE]

this usage reminds me we need a complete diag() implementation so you don't have to use rdo() here.

paul-buerkner commented 2 years ago

I think that both drop = FALSE and drop = TRUE have their merits, but I agree this use-case here is ugly. It would be interesting to see for what purpose other people use rvars (or why they don't use them). Does anyone wants to chime in?

In general, I am open to changing the drop default if we find enough evidence that this is the better choice in expectation.

stan-dev / posterior

should rvar auto-drop dimensions on subsetting? #245