stan-dev / stanc3

The Stan transpiler (from Stan to C++ and beyond).
BSD 3-Clause "New" or "Revised" License
140 stars 44 forks source link

pedantic mode incorrectly warning about 2 priors? #746

Open jgabry opened 3 years ago

jgabry commented 3 years ago

@rybern Using the latest CmdStanR from github I was trying out pedantic mode (which is really cool!) and I got an unexpected warning:


stan_file <- write_stan_file(
"parameters {
  real x;
  real sigma;
}
model {
  x ~ normal(0, sigma);
  sigma ~ exponential(1);
}"
)
mod <- cmdstan_model(stan_file, stanc_options = list("warn-pedantic"=TRUE))
Compiling Stan program...
Warning:
  The parameter sigma has 2 priors.
Warning at '/var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/RtmpwDnKa2/model-e66b730e6be7.stan', line 6, column 16 to column 21:
  A normal distribution is given parameter sigma as a scale parameter
  (argument 2), but sigma was not constrained to be strictly positive.
Warning at '/var/folders/h6/14xy_35x4wd2tz542dn0qhtc0000gn/T/RtmpwDnKa2/model-e66b730e6be7.stan', line 7, column 2 to column 7:
  Parameter sigma is given a exponential distribution, which has strictly
  positive support, but sigma was not constrained to be strictly positive.

The second two warnings make sense but the first one about sigma having 2 priors isn't correct. There's only one prior on sigma. Is this a bug?

rybern commented 3 years ago

Hi Jonah,

I agree the warning doesn't make much sense for this model. It's happening because x ~ normal(0, sigma); is identified as a prior on sigma. That's happening because:

  1. The analysis doesn't distinguish the position of the parameter in distribution, so x ~ normal(0, sigma); may as well be sigma ~ normal(0, x);.
  2. A "prior" is basically defined as anything not a "likelihood" factor, and "likelihood" is defined in terms of data variables. Unfortunately, data variables aren't always what the user considers the "data". In this case, you might be considering x the data - if it could be identified as data, then x ~ normal(0, sigma); would indeed be marked as a "likelihood".

This is clearly a confusing warning sometimes. Maybe we could:

jgabry commented 3 years ago

Interesting, thanks for the explanation! That makes sense. And indeed you're right that if I move x to the data block I don't get that warning anymore.

What about only considering something as "having a prior" if it's in the parameters block and (appears on the LHS of a statement with ~ OR appears to the left of | in a target += statement)? In my example that would only count 1 prior for sigma but I'm probably overlooking a reason why that wouldn't be sufficient (I haven't really thought it through) or that might be too different than how the analysis currently works.

jgabry commented 3 years ago

Also, I definitely picked a strange model in that it has no data, so I'm not super concerned about this, I just thought it was strange. But your explanation makes sense.

rybern commented 3 years ago

I see what you mean. That's actually already implemented in a separate warning. If you run:

parameters {
  real sigma;
}
model {
  sigma ~ exponential(1);
  sigma ~ exponential(1);
}

you get:

Warning at 'twiddles.stan', line 5, column 2 to column 25:
  The parameter sigma is on the left-hand side of more than one twiddle
  statement.

(Looking back, I don't like that language much!) The reason why I hesitate to use only that one is that Stan is so much more flexible than its builtin distributions.

One improvement would be to rule out a factor as a prior for a parameter if it's a builtin distribution and the parameter is the right of the |. That would solve this scenario without limiting scenarios with non-builtin distributions.

Another improvement might be to consider the twiddle analysis when writing the error messages, e.g.: if there were multiple "priors" but not multiple LHS "twiddles", add some language about how the analysis might be counter-intuitive when the data is misidentified.

jgabry commented 3 years ago

The reason why I hesitate to use only that one is that Stan is so much more flexible than its builtin distributions.

Yeah that's a good point. That said, I would bet that most users turning on pedantic mode will be using builtin distributions most of the time. But you're totally right that Stan is much more flexible than that and many users don't limit themselves to just the builtin distributions.

One improvement would be to rule out a factor as a prior for a parameter if it's a builtin distribution and the parameter is the right of the |. That would solve this scenario without limiting scenarios with non-builtin distributions.

Another improvement might be to consider the twiddle analysis when writing the error messages, e.g.: if there were multiple "priors" but not multiple LHS "twiddles", add some language about how the analysis might be counter-intuitive when the data is misidentified.

Good ideas. Not super urgent or anything like that, but if those aren't too hard to implement then I think they would be worth it.