gradlab / CtTrajectories


refactor Stan code for fit_posteriors_combined.stan #1

Open · bob-carpenter opened this issue 3 years ago

bob-carpenter commented 3 years ago

I refactored your Stan model, but didn't bother to create a PR because I didn't know if the model was still live. I made a bunch of changes:

  1. added space around operators
  2. converted real arrays to vectors to allow vector arithmetic
  3. vectorized some calls with vector arithmetic, such as process_sd
  4. removed unnecessary elses in the function definition
  5. used declare/define syntax for one-liner transforms
  6. dropped unnecessary truncations when the distribution parameters are constant and the variate is appropriately constrained in its declaration
  7. removed the constant from log1m_exp because it factors out and just adds a constant to the target, which can be dropped
  8. added space before loop bounds (we recommend C style, not R style)
  9. removed parens around returns (ditto)
  10. removed the transformed data variables log_lambda and log1m_lambda, using the log_mix function to simplify the mixture (though the way you had it may be a bit more efficient; see the sketch after this list)

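For example, item 10 replaces a precomputed-log mixture (roughly, assuming the original kept log_lambda = log(lambda) and log1m_lambda = log1m(lambda) in transformed data):

  target += log_sum_exp(log_lambda + normal_lpdf(ydrop[i] | mu[i], process_sd[i]),
                        log1m_lambda + exponential_lpdf(ydrop[i] | 1 / fpmean));

with the single log_mix call that now appears in the model block; log_mix(lambda, lp1, lp2) is defined as log_sum_exp(log(lambda) + lp1, log1m(lambda) + lp2), which is why precomputing the logs can be slightly more efficient.
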
I verified that it compiles, but didn't check that it works. The changes are all simple local transforms, but obviously one would want to verify that I did the algebra correctly.

What really needs to happen to make this more statistically efficient (in an effective sample size per iteration sense) is to use a non-centered parameterization for the hierarchical priors. That's tricky for positive-constrained parameters, but we're working on allowing <lower = 0, offset = mu, multiplier = sigma> to do that more easily. To do it without being able to combine the transforms requires a lot of manual labor, especially given the half-normal prior. Non-centering is easier with lognormal because then it's just the analog of the non-centered parameterization for unconstrained variables; as the model here has it, one would need to work out the implied location and scale of the half-normal variables on the unconstrained scale.
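
To make the lognormal route concrete, here is a minimal non-centered sketch for one parameter, with hypothetical names log_dpmean and dpsd_log for the unconstrained location and scale; it swaps the truncated normal for a lognormal and ignores the upper bound at lod, so it illustrates the technique rather than being a drop-in replacement:

parameters {
  real log_dpmean;             // population location on the log scale
  real<lower = 0> dpsd_log;    // population scale on the log scale
  vector[n_id] dp_std;         // standardized individual effects
}
transformed parameters {
  // dp_std ~ normal(0, 1) implies log(dp[n]) ~ normal(log_dpmean, dpsd_log)
  vector[n_id] dp = exp(log_dpmean + dpsd_log * dp_std);
}
model {
  dp_std ~ normal(0, 1);       // prior no longer depends on the hyperparameters
}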

functions {
  /**
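   * Piecewise-linear "tent": 0 before onset (tp - wp), rising to dp
   * at the peak tp, and falling back to 0 at recovery (tp + wr).
   *
   * @param t time at which to evaluate the mean
   * @param tp peak time
   * @param wp onset-to-peak (proliferation) duration
   * @param wr peak-to-recovery (clearance) duration
   * @param dp peak Ct drop below the limit of detection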
   * @return the delta Ct below the LOD
   */
  real mufun(real t, real tp, real wp, real wr, real dp) {
    if (t > (tp - wp) && t <= tp)         // load rises between onset/peak
      return dp / wp * (t - (tp - wp));
    if (t > tp && t <= (tp + wr))         // load falls btw peak/recover
      return dp - dp / wr * (t - tp);
    return 0;                             // after recover
  }
}
data {
  real<lower = 0> lod;                   // limit of detection
  real<lower = 0> tpsd;                  // prior sd for the onset time (days)
  real<lower = 0> dpmean_prior;          // prior mean peak Ct (delta from lod)
  real<lower = 0> dpsd_prior;            // prior sd peak Ct (delta from lod)
  real<lower = 0> wpmax;                 // max proliferation duration
  real<lower = 0> wpmean_prior;          // prior mean proliferation duration
  real<lower = 0> wpsd_prior;            // prior sd proliferation duration
  real<lower = 0> wrmax;                 // max clearance duration
  real<lower = 0> wrmean_prior;          // prior mean clearance duration 
  real<lower = 0> wrsd_prior;            // prior sd clearance duration 
  real<lower = 0> sigma_max;             // max allowable value for observation noise
  real<lower = 0> sigma_prior_scale;     // prior observation noise Cauchy scale
  real<lower = 0, upper = 1> lambda;     // mixture probability (~1-sensitivity)
  real<lower = 0> fpmean;                // false positive mean Ct
  int<lower = 0> N;                      // number of concatenated data points
  vector<lower = 0, upper = lod>[N] y;   // observed data
  int<lower = 0> id[N];                  // identifier for data point
  real t[N];                             // time for data point
  vector<lower = 0>[N] epsilon;          // sd adjustment from Yale/FL regression
  int<lower = 0> n_id;                   // number of individuals 
  int<lower = 0> symp[n_id];             // symptomatic id for individuals
}
transformed data {
  vector<lower = 0, upper = lod>[N] ydrop = lod - y;  // deviation from LOD 

  // scale prior s.t. 90% of the mass is below max
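  // half-Cauchy(0, s): Pr[X <= x] = 2 / pi * atan(x / s), so setting
  // Pr[X <= max] = 0.9 gives s = max / tan(0.45 * pi) = max / tan(pi * (0.95 - 0.5))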
  real dpcauchypriorscale = lod / tan(pi() * (0.95 - 0.5));
  real wpcauchypriorscale = wpmax / tan(pi() * (0.95 - 0.5));
  real wrcauchypriorscale = wrmax / tan(pi() * (0.95 - 0.5));
}
parameters {
  real<lower = 0, upper = lod> dpmean;        // population peak Ct drop mean
  real<lower = 0, upper = wpmax> wpmean;      // population onset-to-peak time mean
  real<lower = 0, upper = wrmax> wrmean;      // population peak-to-recovery time mean 
  real<lower = 0> dpsd;                       // population peak Ct drop sd
  real<lower = 0> wpsd;                       // population onset-to-peak time sd
  real<lower = 0> wrsd;                       // population peak-to-recovery time sd
  real tp[n_id];                              // peak time
  vector<lower = 0, upper = lod>[n_id] dp;    // peak Ct drop
  vector<lower = 0, upper = wpmax>[n_id] wp;  // onset-to-peak time
  vector<lower = 0, upper = wrmax>[n_id] wr;  // peak-to-recovery time 
  real<lower = 0, upper = sigma_max> sigma;   // process noise during infection
}
transformed parameters {
  vector[N] process_sd = sqrt(sigma^2 - epsilon .* epsilon);
  vector[N] mu;
  for (i in 1:N)
    mu[i] = mufun(t[i], tp[id[i]], wp[id[i]], wr[id[i]], dp[id[i]]);
}
model {
  // truncation required only if the distribution parameters are Stan parameters
  dpmean ~ normal(dpmean_prior, dpsd_prior);
  wpmean ~ normal(wpmean_prior, wpsd_prior);
  wrmean ~ normal(wrmean_prior, wrsd_prior);
  dpsd ~ cauchy(0, dpcauchypriorscale) T[0, ];
  wpsd ~ cauchy(0, wpcauchypriorscale) T[0, ];
  wrsd ~ cauchy(0, wrcauchypriorscale) T[0, ];  
  sigma ~ cauchy(0, sigma_prior_scale);
  tp ~ normal(0, tpsd);
  for (n in 1:n_id) {  // Stan can't vectorize truncated distributions
    dp[n] ~ normal(dpmean, dpsd) T[0, lod];
    wp[n] ~ normal(wpmean, wpsd) T[0, wpmax];
    wr[n] ~ normal(wrmean, wrsd) T[0, wrmax];
  }    
  for (i in 1:N) {
    target += log_mix(lambda,
                      normal_lpdf(ydrop[i] | mu[i], process_sd[i]),
                      exponential_lpdf(ydrop[i] | 1 / fpmean));
    target += -log_diff_exp(normal_lcdf(lod | mu[i], process_sd[i]),
                            normal_lcdf(0 | mu[i], process_sd[i]));
  }
}
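
As an aside on the "Stan can't vectorize truncated distributions" comment: because the bounds and the location and scale are shared across individuals here, each truncated loop can be vectorized by hand, dropping the T[...] and adding the shared log normalization once per individual. An untested sketch for dp (wp and wr are analogous):

  dp ~ normal(dpmean, dpsd);
  target += -n_id * log_diff_exp(normal_lcdf(lod | dpmean, dpsd),
                                 normal_lcdf(0 | dpmean, dpsd));
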
bob-carpenter commented 3 years ago

I forgot to remove the T[0, ] from the Cauchy priors. Those aren't necessary because the prior scales are data variables and the variates are properly constrained. Removing the truncations avoids a call to the Cauchy CDF.
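
That is, those three lines just become:

  dpsd ~ cauchy(0, dpcauchypriorscale);
  wpsd ~ cauchy(0, wpcauchypriorscale);
  wrsd ~ cauchy(0, wrcauchypriorscale);

Because dpsd, wpsd, and wrsd are declared with <lower = 0> and the scales are data, the dropped truncation term is a constant and the posterior is unchanged.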