stan-dev / cmdstan

CmdStan, the command line interface to Stan
https://mc-stan.org/users/interfaces/cmdstan
BSD 3-Clause "New" or "Revised" License

cmdstanr 2.34.1 breaks brms threading for ZINB intercept model #1258

Closed: epkanol closed this issue 5 months ago

epkanol commented 5 months ago

Summary:

Version 2.34.1 generates syntactically incorrect Stan code when threads = threading(4) is used with a zero-inflated negative binomial family in a plain intercept model. The same setup works in 2.33.1, so this is a regression in either cmdstanr or brms.

Description:

The model is a zero-inflated NB hierarchical model (nested groups), run with 4 chains, 2 cores, and 4 threads. The PPC step fails with a syntax error:

Syntax error in '/tmp/RtmpbaCJJC/model_ffa4bdac52535e7613e8103d97ae09c5.stan', line 101, column 5 to column 6, parsing error:
   -------------------------------------------------
    99:     *   an integer sequence from start to end
   100:     */
   101:    int[] sequence(int start, int end) {
              ^
   102:      array[end - start + 1] int seq;
   103:      for (n in 1:num_elements(seq)) {
   -------------------------------------------------

Reproducible Steps:

Nothing in the data is particularly important, afaik: sample.csv

library(brms)
d <- read.csv("sample.csv")

formula <- bf(y ~ 1 + (1 | team) + (1 | team:repo),
              zi ~ 1 + (1 | team) + (1 | team:repo))
priors <- c(prior(normal(0, 0.5), class = Intercept),
            prior(weibull(2, 0.25), class = sd),
            prior(normal(0, 0.5), class = Intercept, dpar = zi),
            prior(weibull(2, 0.25), class = sd, dpar = zi),
            prior(gamma(0.1, 0.1), class = shape))
validate_prior(prior = priors,
               formula = formula,
               data = d,
               family = zero_inflated_negbinomial)
ppc_model <- brm(data = d,
                 family = zero_inflated_negbinomial,
                 formula = formula,
                 prior = priors,
                 warmup = 1000,
                 iter = 4000,
                 chains = 4,
                 cores = 2,
                 sample_prior = "only",
                 backend = "cmdstanr",
                 threads = threading(4),
                 save_pars = save_pars(all = FALSE),
                 adapt_delta = 0.95)

Removing the threads = threading(4) parameter makes the code compile correctly.

Current Output:

Syntax error in '/tmp/RtmpbaCJJC/model_ffa4bdac52535e7613e8103d97ae09c5.stan', line 101, column 5 to column 6, parsing error:
   -------------------------------------------------
    99:     *   an integer sequence from start to end
   100:     */
   101:    int[] sequence(int start, int end) {
              ^
   102:      array[end - start + 1] int seq;
   103:      for (n in 1:num_elements(seq)) {
   -------------------------------------------------

Expected Output:

Compiling Stan program...
Start sampling

Additional Information:

Note that I do not know whether this is a brms or cmdstanr bug; I just know that the code works on cmdstanr 2.33.1 and fails to compile on cmdstanr 2.34.1.

Also attached renv.lock, as a json file: renv.lock.json

Current Version:

v2.34.1

WardBrian commented 5 months ago

This was fixed in https://github.com/paul-buerkner/brms/commit/3625d913a19ac8119c89e09e36097fdfd261b0a3, which does not appear to have made it into a brms release yet. You could try either installing brms from GitHub or using a previous version of CmdStan.

epkanol commented 5 months ago

Wow, that was fast, Brian :) Thanks, I'll just stick to 2.33.1 for now, then!

bob-carpenter commented 5 months ago

The issue is caused by the deprecation of our old array syntax and its subsequent removal in 2.34. It looks like brms hasn't caught up to 2.34. The syntax is now array[3, 4] int n rather than int n[3, 4] for declarations, and the same without sizes for function arguments and returns (for example, array[] int rather than int[]).
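As a minimal sketch of what that change means for the function in the error message, the signature on line 101 would be written with the array keyword first. The signature and the first two body lines are taken from the error output above. The loop body and the return statement are not shown there, so they are an assumed completion based on the doc comment ("an integer sequence from start to end"). The data block is purely illustrative.

```stan
functions {
  // Old syntax, rejected by the parser in the error above:
  //   int[] sequence(int start, int end) {
  // Current syntax: `array` comes first, and a function signature
  // carries no sizes.
  array[] int sequence(int start, int end) {
    array[end - start + 1] int seq;
    for (n in 1:num_elements(seq)) {
      // Assumed completion: fill with start, start + 1, ..., end.
      seq[n] = n + start - 1;
    }
    return seq;
  }
}
data {
  // Illustrative declaration only.
  // Old syntax: int n[3, 4];
  array[3, 4] int n;
}
```

Only the declaration and signature forms change. Indexing (seq[n]) and size functions such as num_elements are written the same way as before.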