stan-dev / stan

Stan development repository. The master branch contains the current release. The develop branch contains the latest stable development. See the Developer Process Wiki for details.
https://mc-stan.org
BSD 3-Clause "New" or "Revised" License
2.59k stars 368 forks source link

Allow _lupdf and _lupmf functions in the transformed parameters block #3094

Open paul-buerkner opened 2 years ago

paul-buerkner commented 2 years ago

Summary:

Allow _lupdf and _lupmf in the transformed parameters block.

Description:

For implementing prior sensitivity checks via power scaling (https://arxiv.org/abs/2107.14054; a paper together with @avehtari), we need to store the prior contributions to the log posterior. For this purpose, we currently define a quantity log_prior in the transformed parameters block to which we subsequently add _lpdf and _lpmf statements of the priors. This appears to work nicely. However, when I want to use the unnormalized versions _lupdf and _lupmf instead, I get a compiler error.

Reproducible Steps:

Try to compile

// generated with brms 2.16.4
functions {
}
data {
  int<lower=1> N;  // total number of observations
  vector[N] Y;  // response variable
  int<lower=1> K;  // number of population-level effects
  matrix[N, K] X;  // population-level design matrix
  int prior_only;  // should the likelihood be ignored?
}
transformed data {
  int Kc = K - 1;
  matrix[N, Kc] Xc;  // centered version of X without an intercept
  vector[Kc] means_X;  // column means of X before centering
  for (i in 2:K) {
    means_X[i - 1] = mean(X[, i]);
    Xc[, i - 1] = X[, i] - means_X[i - 1];
  }
}
parameters {
  vector[Kc] b;  // population-level effects
  real Intercept;  // temporary intercept for centered predictors
  real<lower=0> sigma;  // dispersion parameter
}
transformed parameters {
  real log_prior = 0;  // prior contributions to the log posterior
  // priors not including constants
  log_prior += student_t_lupdf(b | 5, 0, 10);
  log_prior += student_t_lupdf(Intercept | 3, 4, 4.4);
  log_prior += student_t_lupdf(sigma | 3, 0, 4.4);
}
model {
  // likelihood not including constants
  if (!prior_only) {
    target += normal_id_glm_lupdf(Y | Xc, Intercept, b, sigma);
  }
  // priors not including constants
  target += log_prior;
}
generated quantities {
  // actual population-level intercept
  real b_Intercept = Intercept - dot_product(means_X, b);
}

Current Output:

Compiler error:

Semantic error in 'string', line 47, column 15 to column 44:

Functions with names ending in _lupdf and _lupmf can only be used in the model block or user-defined functions with names ending in _lpdf or _lpmf.

Expected Output:

Stan compiles this model.

Additional Information:

Provide any additional information here.

Current Version:

v2.28.2

nhuurre commented 2 years ago

The compiler error is there for a reason; the _lupdf mechanism in Stan math library is tied to the autodiff and cannot work correctly outside of the model block. This issue was discussed and closed in stanc3 repo. Some context in this Discourse comment.

paul-buerkner commented 2 years ago

Thank you! That makes sense. Closing this issue.

bob-carpenter commented 2 years ago

the _lupdf mechanism in Stan math library is tied to the autodiff and cannot work correctly outside of the model block.

While it's tied to autodiff, that doesn't mean it can't work outside the model block. The _lupdf versions of functions drop terms in the log density that don't depend on autodiff variables (i.e., constants) when the template parameter propto is set to true. The value is true when the log_prob function is called for sampling or VI and false when called for optimization.

I've found it inconvenient that I can't use _lupdf in functions, which has led me to just use code duplication rather than functions in my Stan code.

The transformed parameters block is just like the model block right down to where the code is generated in the C++. That is, it's generated inside the context where the propto template parameter is available. In particular, I believe it should be possible to do this:

transformed parameters {
   real lp1 = normal_lupdf(y | mu, sigma);

}
model {
   target += lp1;

This would do exactly what I wanted it to do if you just generated _lupdf the same way as in the model block. The discussion on stanc3 just says this is impossible, but I think it should be possible.

rok-cesnovar commented 2 years ago

The transformed parameters block is just like the model block right down to where the code is generated in the C++.

Both the model and transformed block are in the same log_prob call with propto = true, yes. However, the actual output for the transformed parameters is calculated in write_array, which does not use auto diff.

So in your above case, the target would be updated correctly during sampling/VI, but the output of lp1 in the CSV would be all zeros.

A weird hack to fix this would be to always generate the calls with _lpdf in write_array only, though that seems like a bad workaround.