stan-dev / math

The Stan Math Library is a C++ template library for automatic differentiation of any order using forward, reverse, and mixed modes. It includes a range of built-in functions for probabilistic modeling, linear algebra, and equation solving.
https://mc-stan.org
BSD 3-Clause "New" or "Revised" License

Odd interaction between `ode_adams` and threading #2975

Open WardBrian opened 9 months ago

WardBrian commented 9 months ago

Description

I have noticed some odd behavior when using `ode_adams` with `STAN_THREADS` enabled. It does not occur with `ode_bdf` or `ode_rk45`.

Example

Using a simple SIR model, courtesy of @charlesm93:


functions {
  vector sir(real t, vector y, real beta, real gamma, int N) {
    real S = y[1];
    real I = y[2];
    // real R = y[3];

    real dS_dt = - beta * I * S / N;
    real dI_dt = beta * I * S / N - gamma * I;
    real dR_dt = gamma * I;

    return [dS_dt, dI_dt, dR_dt]';
  }

}

data {
  int<lower=1> n_days;
  vector[3] y0;
  real t0;
  array[n_days] real ts;
  int N;
  array[n_days] int cases;
}

parameters {
  real<lower=0> gamma;
  real<lower=0> beta;
  real<lower=0> phi_inv;
}

transformed parameters {
  real phi = 1. / phi_inv;
  array[n_days] vector[3] y = ode_adams(sir, y0, t0, ts, beta, gamma, N);
}

model {
  //priors
  beta ~ normal(2, 1);
  gamma ~ normal(0.4, 0.5);
  phi_inv ~ exponential(5);

  cases ~ neg_binomial_2(y[,2], phi);
  // cases ~ poisson(y[,2]);
}

generated quantities {
  real R0 = beta / gamma;
  real recovery_time = 1 / gamma;
  array[n_days] real pred_cases = neg_binomial_2_rng(y[,2], phi);
  // array[n_days] real pred_cases = poisson_rng(y[,2]);
}

data:

{
    "N": 763,
    "cases": [3, 8, 26, 76, 225, 298, 258, 233, 189, 128, 68, 29, 14, 4],
    "n_days": 14,
    "ts": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14],
    "t0": 0,
    "y0": [762, 1, 0]
}

inits:

{
    "beta": 1.00247,
    "gamma": 1.15014,
    "phi_inv": 0.008622444
}
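
As a sanity check on what the solver is being asked to do, the SIR system above can be integrated outside Stan. The following is a minimal sketch, not the method `ode_adams` uses: it swaps in a fixed-step classical RK4 scheme, and the names `sir_rhs`, `rk4_step`, `solve_sir`, and the `steps_per_unit` value are all hypothetical choices made for illustration. With the posted inits, `beta / gamma` is about 0.87, so the infected compartment should decay from its initial value of 1.

```python
def sir_rhs(y, beta, gamma, n_pop):
    """Right-hand side of the SIR system, mirroring the Stan `sir` function."""
    s, i, _r = y
    infection = beta * i * s / n_pop
    recovery = gamma * i
    return (-infection, infection - recovery, recovery)


def rk4_step(y, h, beta, gamma, n_pop):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shifted(k, scale):
        return tuple(yi + scale * ki for yi, ki in zip(y, k))

    k1 = sir_rhs(y, beta, gamma, n_pop)
    k2 = sir_rhs(shifted(k1, 0.5 * h), beta, gamma, n_pop)
    k3 = sir_rhs(shifted(k2, 0.5 * h), beta, gamma, n_pop)
    k4 = sir_rhs(shifted(k3, h), beta, gamma, n_pop)
    return tuple(
        yi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for yi, a, b, c, d in zip(y, k1, k2, k3, k4)
    )


def solve_sir(y0, t0, ts, beta, gamma, n_pop, steps_per_unit=100):
    """Return the state at each time in ts (assumed increasing and > t0)."""
    out = []
    y, t = tuple(y0), t0
    for t_next in ts:
        # steps_per_unit is an arbitrary resolution chosen for this sketch.
        n = max(1, round((t_next - t) * steps_per_unit))
        h = (t_next - t) / n
        for _ in range(n):
            y = rk4_step(y, h, beta, gamma, n_pop)
        t = t_next
        out.append(y)
    return out


if __name__ == "__main__":
    # Data and inits copied from the issue.
    days = [float(d) for d in range(1, 15)]
    traj = solve_sir((762.0, 1.0, 0.0), 0.0, days, 1.00247, 1.15014, 763)
    for day, (s, i, r) in zip(days, traj):
        print(f"t={day:4.0f}  S={s:9.4f}  I={i:7.4f}  R={r:7.4f}")
```

Because the three derivatives sum to zero, `S + I + R` should stay at 763 throughout, which gives a quick check that the transcription of the model is right.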

After compiling the model via CmdStan with `STAN_THREADS=1`, I run:

./sir sample num_chains=4 data file=sir.data.json init=sir.init.json num_threads=4

The behavior varies somewhat between runs, but over repeated runs the most common outcome is that only two chains run and print their progress; once those two complete, the process hangs. I have reproduced this on two separate Linux systems.
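
Since the hang is intermittent, it may help to quantify how often it happens. The sketch below runs a command repeatedly and counts the runs that exceed a time limit. The function name `count_hangs`, the run count, and the timeout are assumptions for illustration, and a timeout is only a proxy for a hang: a slow but healthy run would also be counted.

```python
import subprocess
import sys


def count_hangs(cmd, runs, timeout_s):
    """Run cmd `runs` times and return how many runs exceeded timeout_s seconds.

    subprocess.run kills the child process when the timeout expires and then
    raises TimeoutExpired, so a hung run does not linger in the background.
    """
    hangs = 0
    for _ in range(runs):
        try:
            subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout_s,
                check=False,
            )
        except subprocess.TimeoutExpired:
            hangs += 1
    return hangs


if __name__ == "__main__" and len(sys.argv) > 1:
    # The command to test is taken from the script's own arguments.
    # runs=20 and timeout_s=600 are arbitrary values chosen for this sketch.
    print(count_hangs(sys.argv[1:], runs=20, timeout_s=600))
```

Saved under a hypothetical name such as `count_hangs.py`, it could be invoked as `python count_hangs.py ./sir sample num_chains=4 data file=sir.data.json init=sir.init.json num_threads=4`.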

`strace` shows many calls to `sched_yield` from TBB, but otherwise nothing is happening. A `gdb` backtrace seems to implicate `set_zero_all_adjoints`, but not in any way I can track down.
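
For anyone trying to capture the same information from a hung process, the following sketch attaches `gdb` non-interactively and dumps every thread's backtrace. It assumes `gdb` is on the `PATH` and that the user is allowed to attach to the process; the helper names `gdb_backtrace_cmd` and `dump_backtraces` are hypothetical. It relies only on the standard `gdb` options `-p`, `-batch`, and `-ex`; batch mode should also avoid the pager prompts that interrupt long interactive backtraces.

```python
import subprocess


def gdb_backtrace_cmd(pid):
    """Build a gdb command line that prints all thread backtraces for pid."""
    return [
        "gdb",
        "-p", str(pid),                 # attach to the running process
        "-batch",                       # run the -ex commands, then exit
        "-ex", "info threads",
        "-ex", "thread apply all bt",
    ]


def dump_backtraces(pid):
    """Run gdb against pid and return its standard output as text."""
    result = subprocess.run(
        gdb_backtrace_cmd(pid),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

Calling `dump_backtraces(pid)` with the process ID of the stuck `sir` process returns the text for saving or diffing across runs.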

Expected Output

All four chains should run simultaneously, and the program should not hang.

Current Version:

v4.7.0

WardBrian commented 9 months ago

Here's what gdb reports when chains 1 and 4 are stuck:


```
(gdb) info threads
  Id   Target Id                               Frame
* 1    Thread 0x7f54f8ea9740 (LWP 13426) "sir" 0x000055c3a6af4708 in cvStep ()
  2    Thread 0x7f54f81fb640 (LWP 13427) "sir" syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
  3    Thread 0x7f54f7dfa640 (LWP 13428) "sir" syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
  4    Thread 0x7f54f79f9640 (LWP 13429) "sir" 0x000055c3a694c420 in operator new(unsigned long)@plt ()
(gdb) bt
#0  0x000055c3a6af4708 in cvStep ()
#1  0x000055c3a6afca60 in CVode ()
#2  0x000055c3a6984d9e in stan::math::cvodes_integrator<1, sir_model_namespace::sir_variadic2_functor__, Eigen::Map, 0, Eigen::Stride<0, 0> >, double, double, stan::math::var_value const&, stan::math::var_value const&, int const&>::operator() (this=this@entry=0x7ffd46151200)
    at stan/lib/stan_math/stan/math/rev/functor/cvodes_integrator.hpp:309
#3  0x000055c3a696a52b in _ZZN4stan4math18ode_adams_tol_implIN19sir_model_namespace23sir_variadic2_functor__EN5Eigen3MapINS4_6MatrixIdLin1ELi1ELi0ELin1ELi1EEELi0ENS4_6StrideILi0ELi0EEEEEddJNS0_9var_valueIdvEESC_iELPv0EEESt6vectorINS6_INS_11return_typeIJT0_T1_T2_DpT3_EE4typeELin1ELi1ELi0ELin1ELi1EEESaISN_EEPKcRKT_RKSG_RKSH_RKSE_ISI_SaISI_EEddlPSoDpRKSJ_ENKUlDpRKT_E_clIJSC_SC_iEEEDaS1A_ (__closure=)
    at stan/lib/stan_math/stan/math/rev/functor/ode_adams.hpp:64
#4  _ZN4stan4math8internal10apply_implIZNS0_18ode_adams_tol_implIN19sir_model_namespace23sir_variadic2_functor__EN5Eigen3MapINS6_6MatrixIdLin1ELi1ELi0ELin1ELi1EEELi0ENS6_6StrideILi0ELi0EEEEEddJNS0_9var_valueIdvEESE_iELPv0EEESt6vectorINS8_INS_11return_typeIJT0_T1_T2_DpT3_EE4typeELin1ELi1ELi0ELin1ELi1EEESaISP_EEPKcRKT_RKSI_RKSJ_RKSG_ISK_SaISK_EEddlPSoDpRKSL_EUlDpRKT_E_RKSt5tupleIJSE_SE_iEEJLm0ELm1ELm2EEEEDcOSU_OSI_St16integer_sequenceImJXspT1_EEE (i=..., t=..., f=...)
at stan/lib/stan_math/stan/math/prim/functor/apply.hpp:26 #5 _ZN4stan4math5applyIZNS0_18ode_adams_tol_implIN19sir_model_namespace23sir_variadic2_functor__EN5Eigen3MapINS5_6MatrixIdLin1ELi1ELi0ELin1ELi1EEELi0ENS5_6StrideILi0ELi0EEEEEddJNS0_9var_valueIdvEESD_iELPv0EEESt6vectorINS7_INS_11return_typeIJT0_T1_T2_DpT3_EE4typeELin1ELi1ELi0ELin1ELi1EEESaISO_EEPKcRKT_RKSH_RKSI_RKSF_ISJ_SaISJ_EEddlPSoDpRKSK_EUlDpRKT_E_RKSt5tupleIJSD_SD_iEEEEDcOST_OSH_ (t=..., f=...) at stan/lib/stan_math/stan/math/prim/functor/apply.hpp:47 #6 stan::math::ode_adams_tol_impl, 0, Eigen::Stride<0, 0> >, double, double, stan::math::var_value, stan::math::var_value, int, (void*)0> (msgs=, max_num_steps=, absolute_tolerance=, relative_tolerance=, ts=..., t0=, y0=..., f=..., function_name=) at stan/lib/stan_math/stan/math/rev/functor/ode_adams.hpp:66 #7 stan::math::ode_adams, 0, Eigen::Stride<0, 0> >, double, double, stan::math::var_value, stan::math::var_value, int, (void*)0> (f=..., y0=..., t0=, ts=..., msgs=) at stan/lib/stan_math/stan/math/rev/functor/ode_adams.hpp:157 #8 0x000055c3a698abc2 in sir_model_namespace::sir_model::log_prob_impl, -1, 1, 0, -1, 1>, Eigen::Matrix, (void*)0, (void*)0, (void*)0> (this=0x55c3a85446f0, params_r__=..., params_i__=..., pstream__=0x7ffd46151600) at sir.hpp:327 #9 0x000055c3a698b034 in sir_model_namespace::sir_model::log_prob > (pstream=, params_r=..., this=) at stan/lib/stan_math/lib/eigen_3.4.0/Eigen/src/Core/PlainObjectBase.h:968 #10 stan::model::model_base_crtp::log_prob_propto_jacobian (this=, theta=..., msgs=) at stan/src/stan/model/model_base_crtp.hpp:132 #11 0x000055c3a6a661b8 in stan::model::model_base::log_prob > (msgs=, params_r=..., this=) at stan/src/stan/model/model_base.hpp:326 #12 stan::model::model_functional::operator() > (x=..., this=0x7ffd461515a0) at stan/src/stan/model/model_functional.hpp:21 #13 stan::math::gradient > (f=..., x=..., fx=@0x55c3a854fd00: 7409.7008271914856, grad_fx=...) 
at stan/lib/stan_math/stan/math/rev/functor/gradient.hpp:51 #14 0x000055c3a6a66703 in stan::model::gradient (model=..., x=..., f=@0x55c3a854fd00: 7409.7008271914856, grad_f=..., logger=...) at stan/src/stan/model/gradient.hpp:27 #15 0x000055c3a6a6d977 in stan::mcmc::base_hamiltonian, boost::random::linear_congruential_engine > >::update_potential_gradient (this=this@entry=0x55c3a854fd20, z=..., logger=...) at stan/src/stan/mcmc/hmc/hamiltonians/base_hamiltonian.hpp:63 #16 0x000055c3a6a6e031 in stan::mcmc::expl_leapfrog, boost::random::linear_congruential_engine > > >::update_q (this=, z=..., hamiltonian=warning: RTTI symbol not found for class 'stan::mcmc::diag_e_metric, boost::random::linear_congruential_engine > >' ..., epsilon=, logger=...) at stan/src/stan/mcmc/hmc/integrators/expl_leapfrog.hpp:25 #17 0x000055c3a69bb16f in stan::mcmc::base_leapfrog, boost::random::linear_congruential_engine > > >::evolve (logger=..., epsilon=0.125, hamiltonian=warning: RTTI symbol not found for class 'stan::mcmc::diag_e_metric, boost::random::linear_congruential_engine > >' ..., z=..., this=0x55c3a854fd18) at stan/src/stan/mcmc/hmc/integrators/base_leapfrog.hpp:20 #18 stan::mcmc::base_hmc, boost::random::linear_congruential_engine > >::init_stepsize (logger=..., this=0x55c3a854fcc0) at stan/src/stan/mcmc/hmc/base_hmc.hpp:125 #19 stan::services::util::run_adaptive_sampler, boost::random::linear_congruential_engine > >, stan::model::model_base, boost::random::additive_combine_engine, boost::random::linear_congruential_engine > > (sampler=warning: RTTI symbol not found for class 'stan::mcmc::adapt_diag_e_nuts, boost::random::linear_congruential_engine > >' ..., model=..., num_warmup=1000, num_samples=1000, num_thin=1, refresh=100, save_warmup=false, rng=..., interrupt=..., logger=..., sample_writer=..., diagnostic_writer=..., metric_writer=..., chain_id=1, num_chains=4, cont_vector=..., cont_vector=...) 
at stan/src/stan/services/util/run_adaptive_sampler.hpp:63 #20 0x000055c3a6a6f6f0 in stan::services::sample::hmc_nuts_diag_e_adapt, std::unique_ptr >, stan::callbacks::writer, stan::callbacks::unique_stream_writer >, std::default_delete > > >, stan::callbacks::unique_stream_writer >, std::default_delete > > >, stan::callbacks::json_writer >, std::default_delete > > > >(stan::model::model_base&, unsigned long, std::vector, std::allocator > > const&, std::vector >, std::allocator > > > const&, unsigned int, unsigned int, double, int, int, int, bool, int, double, double, int, double, double, double, double, unsigned int, unsigned int, unsigned int, stan::callbacks::interrupt&, stan::callbacks::logger&, std::vector >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&)::{lambda(tbb::blocked_range const&)#1}::operator()(tbb::blocked_range const&) const (r=..., __closure=) at stan/src/stan/services/sample/hmc_nuts_diag_e_adapt.hpp:391 #21 tbb::interface9::internal::start_for, stan::services::sample::hmc_nuts_diag_e_adapt, std::unique_ptr for more, q to quit, c to continue without paging-- ::dump, std::default_delete >, stan::callbacks::writer, stan::callbacks::unique_stream_writer >, std::default_delete > > >, stan::callbacks::unique_stream_writer >, std::default_delete > > >, stan::callbacks::json_writer >, std::default_delete > > > >(stan::model::model_base&, unsigned long, std::vector, std::allocator > > const&, std::vector >, std::allocator > > > const&, unsigned int, unsigned int, double, int, int, int, bool, int, double, double, int, double, double, double, double, unsigned int, unsigned int, unsigned int, stan::callbacks::interrupt&, stan::callbacks::logger&, std::vector >&, std::vector >, std::default_delete > > >, std::allocator >, 
std::default_delete > > > > >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&)::{lambda(tbb::blocked_range const&)#1}, tbb::simple_partitioner const>::run_body(tbb::blocked_range&) (r=..., this=) at stan/lib/stan_math/lib/tbb_2020.3/include/tbb/parallel_for.h:115 #22 tbb::interface9::internal::simple_partition_type::execute, stan::services::sample::hmc_nuts_diag_e_adapt, std::unique_ptr >, stan::callbacks::writer, stan::callbacks::unique_stream_writer >, std::default_delete > > >, stan::callbacks::unique_stream_writer >, std::default_delete > > >, stan::callbacks::json_writer >, std::default_delete > > > >(stan::model::model_base&, unsigned long, std::vector, std::allocator > > const&, std::vector >, std::allocator > > > const&, unsigned int, unsigned int, double, int, int, int, bool, int, double, double, int, double, double, double, double, unsigned int, unsigned int, unsigned int, stan::callbacks::interrupt&, stan::callbacks::logger&, std::vector >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&)::{lambda(tbb::blocked_range const&)#1}, tbb::simple_partitioner const>, tbb::blocked_range >(stan::model::model_base&, tbb::blocked_range&) (range=..., start=warning: RTTI symbol not found for class 'tbb::interface9::internal::start_for, stan::services::sample::hmc_nuts_diag_e_adapt, std::unique_ptr >, stan::callbacks::writer, stan::callbacks::unique_stream_writer >, std::default_delete > > >, stan::callbacks::unique_stream_writer >, std::default_delete > > >, stan::callbacks::json_writer >, std::default_delete > > > >(stan::model::model_base&, unsigned long, std::vector, std::allocator > > const&, std::vector 
>, std::allocator > > > const&, unsigned int, unsigned int, double, int, int, int, bool, int, double, double, int, double, double, double, double, unsigned int, unsigned int, unsigned int, stan::callbacks::interrupt&, stan::callbacks::logger&, std::vector >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&)::{lambda(tbb::blocked_range const&)#1}, tbb::simple_partitioner const>' ..., this=0x7f54f87e3cd0) at stan/lib/stan_math/lib/tbb_2020.3/include/tbb/partitioner.h:510 #23 tbb::interface9::internal::start_for, stan::services::sample::hmc_nuts_diag_e_adapt, std::unique_ptr >, stan::callbacks::writer, stan::callbacks::unique_stream_writer >, std::default_delete > > >, stan::callbacks::unique_stream_writer >, std::default_delete > > >, stan::callbacks::json_writer >, std::default_delete > > > >(stan::model::model_base&, unsigned long, std::vector, std::allocator > > const&, std::vector >, std::allocator > > > const&, unsigned int, unsigned int, double, int, int, int, bool, int, double, double, int, double, double, double, double, unsigned int, unsigned int, unsigned int, stan::callbacks::interrupt&, stan::callbacks::logger&, std::vector >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&)::{lambda(tbb::blocked_range const&)#1}, tbb::simple_partitioner const>::execute() (this=0x7f54f87e3c40) at stan/lib/stan_math/lib/tbb_2020.3/include/tbb/parallel_for.h:142 #24 0x00007f54f8f13755 in tbb::internal::custom_scheduler::process_bypass_loop (this=this@entry=0x7f54f87ee600, context_guard=..., t=0x7f54f87e3c40, 
isolation=isolation@entry=0) at ../tbb_2020.3/src/tbb/custom_scheduler.h:474 #25 0x00007f54f8f13aa2 in tbb::internal::custom_scheduler::local_wait_for_all (this=0x7f54f87ee600, parent=..., child=) at ../tbb_2020.3/src/tbb/custom_scheduler.h:636 #26 0x00007f54f8f113a7 in tbb::internal::generic_scheduler::local_spawn_root_and_wait (this=0x7f54f87ee600, first=0x7f54f87e3c40, next=@0x7f54f87e3c38: 0x0) at ../tbb_2020.3/src/tbb/scheduler.cpp:738 #27 0x000055c3a6a72ebe in tbb::task::spawn_root_and_wait (root=...) at stan/lib/stan_math/lib/tbb_2020.3/include/tbb/task.h:809 #28 tbb::interface9::internal::start_for, stan::services::sample::hmc_nuts_diag_e_adapt, std::unique_ptr >, stan::callbacks::writer, stan::callbacks::unique_stream_writer >, std::default_delete > > >, stan::callbacks::unique_stream_writer >, std::default_delete > > >, stan::callbacks::json_writer >, std::default_delete > > > >(stan::model::model_base&, unsigned long, std::vector, std::allocator > > const&, std::vector >, std::allocator > > > const&, unsigned int, unsigned int, double, int, int, int, bool, int, double, double, int, double, double, double, double, unsigned int, unsigned int, unsigned int, stan::callbacks::interrupt&, stan::callbacks::logger&, std::vector >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&)::{lambda(tbb::blocked_range const&)#1}--Type for more, q to quit, c to continue without paging-- , tbb::simple_partitioner const>::run(tbb::blocked_range const&, stan::services::sample::hmc_nuts_diag_e_adapt, std::unique_ptr >, stan::callbacks::writer, stan::callbacks::unique_stream_writer >, std::default_delete > > >, stan::callbacks::unique_stream_writer >, std::default_delete > > >, stan::callbacks::json_writer >, std::default_delete > > > 
>(stan::model::model_base&, unsigned long, std::vector, std::allocator > > const&, std::vector >, std::allocator > > > const&, unsigned int, unsigned int, double, int, int, int, bool, int, double, double, int, double, double, double, double, unsigned int, unsigned int, unsigned int, stan::callbacks::interrupt&, stan::callbacks::logger&, std::vector >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&)::{lambda(tbb::blocked_range const&)#1} const&, tbb::simple_partitioner const&) (partitioner=..., body=..., range=...) at stan/lib/stan_math/lib/tbb_2020.3/include/tbb/parallel_for.h:95 #29 tbb::parallel_for, stan::services::sample::hmc_nuts_diag_e_adapt, std::unique_ptr >, stan::callbacks::writer, stan::callbacks::unique_stream_writer >, std::default_delete > > >, stan::callbacks::unique_stream_writer >, std::default_delete > > >, stan::callbacks::json_writer >, std::default_delete > > > >(stan::model::model_base&, unsigned long, std::vector, std::allocator > > const&, std::vector >, std::allocator > > > const&, unsigned int, unsigned int, double, int, int, int, bool, int, double, double, int, double, double, double, double, unsigned int, unsigned int, unsigned int, stan::callbacks::interrupt&, stan::callbacks::logger&, std::vector >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&)::{lambda(tbb::blocked_range const&)#1}>(tbb::blocked_range const&, stan::services::sample::hmc_nuts_diag_e_adapt, std::unique_ptr >, stan::callbacks::writer, stan::callbacks::unique_stream_writer >, std::default_delete > > >, 
stan::callbacks::unique_stream_writer >, std::default_delete > > >, stan::callbacks::json_writer >, std::default_delete > > > >(stan::model::model_base&, unsigned long, std::vector, std::allocator > > const&, std::vector >, std::allocator > > > const&, unsigned int, unsigned int, double, int, int, int, bool, int, double, double, int, double, double, double, double, unsigned int, unsigned int, unsigned int, stan::callbacks::interrupt&, stan::callbacks::logger&, std::vector >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&)::{lambda(tbb::blocked_range const&)#1} const&, tbb::simple_partitioner const&) ( partitioner=..., body=..., range=...) at stan/lib/stan_math/lib/tbb_2020.3/include/tbb/parallel_for.h:208 #30 stan::services::sample::hmc_nuts_diag_e_adapt, std::unique_ptr >, stan::callbacks::writer, stan::callbacks::unique_stream_writer >, std::default_delete > > >, stan::callbacks::unique_stream_writer >, std::default_delete > > >, stan::callbacks::json_writer >, std::default_delete > > > > (model=..., num_chains=num_chains@entry=4, init=std::vector of length 4, capacity 4 = {...}, init_inv_metric=std::vector of length 4, capacity 4 = {...}, random_seed=random_seed@entry=4184322767, init_chain_id=init_chain_id@entry=1, init_radius=init_radius@entry=2, num_warmup=1000, num_samples=1000, num_thin=1, save_warmup=false, refresh=100, stepsize=stepsize@entry=1, stepsize_jitter=stepsize_jitter@entry=0, max_depth=10, delta=delta@entry=0.80000000000000004, gamma=gamma@entry=0.050000000000000003, kappa=kappa@entry=0.75, t0=t0@entry=10, init_buffer=75, term_buffer=50, window=25, interrupt=..., logger=..., init_writer=std::vector of length 4, capacity 4 = {...}, sample_writer=std::vector of length 4, capacity 4 = {...}, diagnostic_writer=std::vector of 
length 4, capacity 4 = {...}, metric_writer=std::vector of length 4, capacity 4 = {...})
    at stan/src/stan/services/sample/hmc_nuts_diag_e_adapt.hpp:384
#31 0x000055c3a6a73c04 in stan::services::sample::hmc_nuts_diag_e_adapt, stan::callbacks::writer, stan::callbacks::unique_stream_writer >, std::default_delete > > >, stan::callbacks::unique_stream_writer >, std::default_delete > > >, stan::callbacks::json_writer >, std::default_delete > > > > (model=..., num_chains=num_chains@entry=4, init=std::vector of length 4, capacity 4 = {...}, random_seed=random_seed@entry=4184322767, init_chain_id=init_chain_id@entry=1, init_radius=init_radius@entry=2, num_warmup=num_warmup@entry=1000, num_samples=1000, num_thin=1, save_warmup=false, refresh=100, stepsize=stepsize@entry=1, stepsize_jitter=stepsize_jitter@entry=0, max_depth=10, delta=delta@entry=0.80000000000000004, gamma=gamma@entry=0.050000000000000003, kappa=kappa@entry=0.75, t0=t0@entry=10, init_buffer=75, term_buffer=50, window=25, interrupt=..., logger=..., init_writer=std::vector of length 4, capacity 4 = {...}, sample_writer=std::vector of length 4, capacity 4 = {...}, diagnostic_writer=std::vector of length 4, capacity 4 = {...}, metric_writer=std::vector of length 4, capacity 4 = {...})
    at stan/src/stan/services/sample/hmc_nuts_diag_e_adapt.hpp:560
#32 0x000055c3a69d3dd0 in cmdstan::command (argc=, argv=) at src/cmdstan/command.hpp:654
#33 0x000055c3a696727a in main (argc=, argv=) at src/cmdstan/main.cpp:6
(gdb) thread 4
[Switching to thread 4 (Thread 0x7f54f79f9640 (LWP 13429))]
#0  0x000055c3a694c420 in operator new(unsigned long)@plt ()
(gdb) bt
#0  0x000055c3a694c420 in operator new(unsigned long)@plt ()
#1  0x000055c3a69959ab in __gnu_cxx::new_allocator::allocate (this=0x7f54f79f7c90, __n=9)
    at /usr/include/c++/11/ext/new_allocator.h:103
#2  std::allocator_traits >::allocate (__n=9, __a=...)
    at /usr/include/c++/11/bits/alloc_traits.h:464
#3  std::_Vector_base >::_M_allocate (__n=9, this=0x7f54f79f7c90)
    at /usr/include/c++/11/bits/stl_vector.h:346
#4  std::_Vector_base >::_M_create_storage (__n=9, this=0x7f54f79f7c90)
    at /usr/include/c++/11/bits/stl_vector.h:361
#5  std::_Vector_base >::_Vector_base (__a=..., __n=9, this=0x7f54f79f7c90)
    at /usr/include/c++/11/bits/stl_vector.h:305
#6  std::vector >::vector (__a=..., __n=9, this=0x7f54f79f7c90)
    at /usr/include/c++/11/bits/stl_vector.h:511
#7  stan::math::cvodes_integrator<1, sir_model_namespace::sir_variadic2_functor__, Eigen::Map, 0, Eigen::Stride<0, 0> >, double, double, stan::math::var_value const&, stan::math::var_value const&, int const&>::rhs_sens (ySdot=0x7f54ec0167b0, yS=0x7f54ec014e60, y=0x7f54ec0124e0, t=5.7446523995551716e-24, this=0x7f54f79f8180)
    at stan/lib/stan_math/stan/math/rev/functor/cvodes_integrator.hpp:149
#8  stan::math::cvodes_integrator<1, sir_model_namespace::sir_variadic2_functor__, Eigen::Map, 0, Eigen::Stride<0, 0> >, double, double, stan::math::var_value const&, stan::math::var_value const&, int const&>::cv_rhs_sens (Ns=, t=5.7446523995551716e-24, y=, ydot=, yS=0x7f54ec014e60, ySdot=0x7f54ec0167b0, user_data=0x7f54f79f8180, tmp1=0x7f54ec01c4c0, tmp2=0x7f54ec012cb0)
    at stan/lib/stan_math/stan/math/rev/functor/cvodes_integrator.hpp:83
#9  0x000055c3a6af35e3 in cvSensRhsWrapper ()
#10 0x000055c3a6b0afa6 in cvNlsResidualSensStg ()
#11 0x000055c3a6b101ff in SUNNonlinSolSolve_Newton ()
#12 0x000055c3a6af461c in cvStep ()
#13 0x000055c3a6afca60 in CVode ()
#14 0x000055c3a6984d9e in stan::math::cvodes_integrator<1, sir_model_namespace::sir_variadic2_functor__, Eigen::Map, 0, Eigen::Stride<0, 0> >, double, double, stan::math::var_value const&, stan::math::var_value const&, int const&>::operator() (this=this@entry=0x7f54f79f8180)
    at stan/lib/stan_math/stan/math/rev/functor/cvodes_integrator.hpp:309
#15 0x000055c3a696a52b in
_ZZN4stan4math18ode_adams_tol_implIN19sir_model_namespace23sir_variadic2_functor__EN5Eigen3MapINS4_6MatrixIdLin1ELi1ELi0ELin1ELi1EEELi0ENS4_6StrideILi0ELi0EEEEEddJNS0_9var_valueIdvEESC_iELPv0EEESt6vectorINS6_INS_11return_typeIJT0_T1_T2_DpT3_EE4typeELin1ELi1ELi0ELin1ELi1EEESaISN_EEPKcRKT_RKSG_RKSH_RKSE_ISI_SaISI_EEddlPSoDpRKSJ_ENKUlDpRKT_E_clIJSC_SC_iEEEDaS1A_ (__closure=) at stan/lib/stan_math/stan/math/rev/functor/ode_adams.hpp:64 #16 _ZN4stan4math8internal10apply_implIZNS0_18ode_adams_tol_implIN19sir_model_namespace23sir_variadic2_functor__EN5Eigen3MapINS6_6MatrixIdLin1ELi1ELi0ELin1ELi1EEELi0ENS6_6StrideILi0ELi0EEEEEddJNS0_9var_valueIdvEESE_iELPv0EEESt6vectorINS8_INS_11return_typeIJT0_T1_T2_DpT3_EE4typeELin1ELi1ELi0ELin1ELi1EEESaISP_EEPKcRKT_RKSI_RKSJ_RKSG_ISK_SaISK_EEddlPSoDpRKSL_EUlDpRKT_E_RKSt5tupleIJSE_SE_iEEJLm0ELm1ELm2EEEEDcOSU_OSI_St16integer_sequenceImJXspT1_EEE (i=..., t=..., f=...) at stan/lib/stan_math/stan/math/prim/functor/apply.hpp:26 #17 _ZN4stan4math5applyIZNS0_18ode_adams_tol_implIN19sir_model_namespace23sir_variadic2_functor__EN5Eigen3MapINS5_6MatrixIdLin1ELi1ELi0ELin1ELi1EEELi0ENS5_6StrideILi0ELi0EEEEEddJNS0_9var_valueIdvEESD_iELPv0EEESt6vectorINS7_INS_11return_typeIJT0_T1_T2_DpT3_EE4typeELin1ELi1ELi0ELin1ELi1EEESaISO_EEPKcRKT_RKSH_RKSI_RKSF_ISJ_SaISJ_EEddlPSoDpRKSK_EUlDpRKT_E_RKSt5tupleIJSD_SD_iEEEEDcOST_OSH_ (t=..., f=...) 
at stan/lib/stan_math/stan/math/prim/functor/apply.hpp:47 #18 stan::math::ode_adams_tol_impl, 0, Eigen::Stride<0, 0> >, double, double, stan::math::var_value, stan::math::var_value, int, (void*)0> (msgs=, max_num_steps=, absolute_tolerance=, relative_tolerance=, ts=..., t0=, y0=..., f=..., function_name=) at stan/lib/stan_math/stan/math/rev/functor/ode_adams.hpp:66 #19 stan::math::ode_adams, 0, Eigen::Stride<0, 0> >, double, double, stan::math::var_value, stan::math::var_value, int, (void*)0> (f=..., y0=..., t0=, ts=..., msgs=) at stan/lib/stan_math/stan/math/rev/functor/ode_adams.hpp:157 #20 0x000055c3a698abc2 in sir_model_namespace::sir_model::log_prob_impl, -1, 1, 0, -1, 1>, Eigen::Matrix, (void*)0, (void*)0, (void*)0> (this=0x55c3a85446f0, params_r__=..., params_i__=..., pstream__=0x7f54f79f8580) at sir.hpp:327 #21 0x000055c3a698b034 in sir_model_namespace::sir_model::log_prob > (pstream=, params_r=..., this=) at stan/lib/stan_math/lib/eigen_3.4.0/Eigen/src/Core/PlainObjectBase.h:968 #22 stan::model::model_base_crtp::log_prob_propto_jacobian (this=, theta=..., msgs=) at stan/src/stan/model/model_base_crtp.hpp:132 #23 0x000055c3a6a661b8 in stan::model::model_base::log_prob > (msgs=, params_r=..., this=) at stan/src/stan/model/model_base.hpp:326 #24 stan::model::model_functional::operator() > (x=..., this=0x7f54f79f8520) at stan/src/stan/model/model_functional.hpp:21 #25 stan::math::gradient > (f=..., x=..., fx=@0x55c3a8550198: 7409.7008271914856, grad_fx=...) at stan/lib/stan_math/stan/math/rev/functor/gradient.hpp:51 #26 0x000055c3a6a66703 in stan::model::gradient (model=..., x=..., f=@0x55c3a8550198: 7409.7008271914856, grad_f=..., logger=...) at stan/src/stan/model/gradient.hpp:27 #27 0x000055c3a6a6d977 in stan::mcmc::base_hamiltonian, boost::random::linear_congruential_engine > >::update_potential_gradient (this=this@entry=0x55c3a85501b8, z=..., logger=...) 
at stan/src/stan/mcmc/hmc/hamiltonians/base_hamiltonian.hpp:63 #28 0x000055c3a6a6e031 in stan::mcmc::expl_leapfrog, boost::random::linear_congruential_engine > > >::update_q (this=, z=..., hamiltonian=warning: RTTI symbol not found for class 'stan::mcmc::diag_e_metric, boost::random::linear_congruential_engine > >' ..., epsilon=, logger=...) at stan/src/stan/mcmc/hmc/integrators/expl_leapfrog.hpp:25 #29 0x000055c3a69bb16f in stan::mcmc::base_leapfrog, boost::random::linear_congruential_engine > > >::evolve (logger=..., epsilon=0.125, hamiltonian=warning: RTTI symbol not found for class 'stan::mcmc::diag_e_metric, boost::random::linear_congruential_engine > >' ..., z=..., this=0x55c3a85501b0) at stan/src/stan/mcmc/hmc/integrators/base_leapfrog.hpp:20 #30 stan::mcmc::base_hmc, boost::random::linear_congruential_engine > >::init_stepsize (logger=..., this=0x55c3a8550158) at stan/src/stan/mcmc/hmc/base_hmc.hpp:125 #31 stan::services::util::run_adaptive_sampler, boost::random::linear_congruential_engine > >, stan::model::model_base, boost::random::additive_combine_engine, boost::random::linear_congruential_engine > > (sampler=warning: RTTI symbol not found for class 'stan::mcmc::adapt_diag_e_nuts, boost::random::linear_congruential_engine > >' ..., model=..., num_warmup=1000, num_samples=1000, num_thin=1, refresh=100, --Type for more, q to quit, c to continue without paging-- save_warmup=false, rng=..., interrupt=..., logger=..., sample_writer=..., diagnostic_writer=..., metric_writer=..., chain_id=4, num_chains=4, cont_vector=..., cont_vector=...) 
at stan/src/stan/services/util/run_adaptive_sampler.hpp:63
#32 0x000055c3a6a6f6f0 in stan::services::sample::hmc_nuts_diag_e_adapt, std::unique_ptr >, stan::callbacks::writer, stan::callbacks::unique_stream_writer >, std::default_delete > > >, stan::callbacks::unique_stream_writer >, std::default_delete > > >, stan::callbacks::json_writer >, std::default_delete > > > >(stan::model::model_base&, unsigned long, std::vector, std::allocator > > const&, std::vector >, std::allocator > > > const&, unsigned int, unsigned int, double, int, int, int, bool, int, double, double, int, double, double, double, double, unsigned int, unsigned int, unsigned int, stan::callbacks::interrupt&, stan::callbacks::logger&, std::vector >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&)::{lambda(tbb::blocked_range const&)#1}::operator()(tbb::blocked_range const&) const (r=..., __closure=) at stan/src/stan/services/sample/hmc_nuts_diag_e_adapt.hpp:391
#33 tbb::interface9::internal::start_for, stan::services::sample::hmc_nuts_diag_e_adapt, std::unique_ptr >, stan::callbacks::writer, stan::callbacks::unique_stream_writer >, std::default_delete > > >, stan::callbacks::unique_stream_writer >, std::default_delete > > >, stan::callbacks::json_writer >, std::default_delete > > > >(stan::model::model_base&, unsigned long, std::vector, std::allocator > > const&, std::vector >, std::allocator > > > const&, unsigned int, unsigned int, double, int, int, int, bool, int, double, double, int, double, double, double, double, unsigned int, unsigned int, unsigned int, stan::callbacks::interrupt&, stan::callbacks::logger&, std::vector >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&)::{lambda(tbb::blocked_range const&)#1}, tbb::simple_partitioner const>::run_body(tbb::blocked_range&) (r=..., this=) at stan/lib/stan_math/lib/tbb_2020.3/include/tbb/parallel_for.h:115
#34 tbb::interface9::internal::simple_partition_type::execute, stan::services::sample::hmc_nuts_diag_e_adapt, std::unique_ptr >, stan::callbacks::writer, stan::callbacks::unique_stream_writer >, std::default_delete > > >, stan::callbacks::unique_stream_writer >, std::default_delete > > >, stan::callbacks::json_writer >, std::default_delete > > > >(stan::model::model_base&, unsigned long, std::vector, std::allocator > > const&, std::vector >, std::allocator > > > const&, unsigned int, unsigned int, double, int, int, int, bool, int, double, double, int, double, double, double, double, unsigned int, unsigned int, unsigned int, stan::callbacks::interrupt&, stan::callbacks::logger&, std::vector >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&)::{lambda(tbb::blocked_range const&)#1}, tbb::simple_partitioner const>, tbb::blocked_range >(stan::model::model_base&, tbb::blocked_range&) (range=..., start=warning: RTTI symbol not found for class 'tbb::interface9::internal::start_for, stan::services::sample::hmc_nuts_diag_e_adapt, std::unique_ptr >, stan::callbacks::writer, stan::callbacks::unique_stream_writer >, std::default_delete > > >, stan::callbacks::unique_stream_writer >, std::default_delete > > >, stan::callbacks::json_writer >, std::default_delete > > > >(stan::model::model_base&, unsigned long, std::vector, std::allocator > > const&, std::vector >, std::allocator > > > const&, unsigned int, unsigned int, double, int, int, int, bool, int, double, double, int, double, double, double, double, unsigned int, unsigned int, unsigned int, stan::callbacks::interrupt&, stan::callbacks::logger&, std::vector >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&)::{lambda(tbb::blocked_range const&)#1}, tbb::simple_partitioner const>' ..., this=0x7f54f87d7dd0) at stan/lib/stan_math/lib/tbb_2020.3/include/tbb/partitioner.h:510
#35 tbb::interface9::internal::start_for, stan::services::sample::hmc_nuts_diag_e_adapt, std::unique_ptr >, stan::callbacks::writer, stan::callbacks::unique_stream_writer >, std::default_delete > > >, stan::callbacks::unique_stream_writer >, std::default_delete > > >, stan::callbacks::json_writer >, std::default_delete > > > >(stan::model::model_base&, unsigned long, std::vector, std::allocator > > const&, std::vector >, std::allocator > > > const&, unsigned int, unsigned int, double, int, int, int, bool, int, double, double, int, double, double, double, double, unsigned int, unsigned int, unsigned int, stan::callbacks::interrupt&, stan::callbacks::logger&, std::vector >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&, std::vector >, std::default_delete > > >, std::allocator >, std::default_delete > > > > >&)::{lambda(tbb::blocked_range const&)#1}, tbb::simple_partitioner const>::execute() (this=0x7f54f87d7d40) at stan/lib/stan_math/lib/tbb_2020.3/include/tbb/parallel_for.h:142
#36 0x00007f54f8f13755 in tbb::internal::custom_scheduler::process_bypass_loop (this=this@entry=0x7f54f87c3e00, context_guard=..., t=0x7f54f87d7d40, isolation=isolation@entry=0) at ../tbb_2020.3/src/tbb/custom_scheduler.h:474
#37 0x00007f54f8f13aa2 in tbb::internal::custom_scheduler::local_wait_for_all (this=0x7f54f87c3e00, parent=..., child=) at ../tbb_2020.3/src/tbb/custom_scheduler.h:636
#38 0x00007f54f8f0d20c in tbb::internal::arena::process (this=0x7f54f87e7480, s=...) at ../tbb_2020.3/src/tbb/arena.cpp:196
#39 0x00007f54f8f0b8d8 in tbb::internal::market::process (this=0x7f54f87f3580, j=...) at ../tbb_2020.3/src/tbb/market.cpp:667
#40 0x00007f54f8f07c40 in tbb::internal::rml::private_worker::run (this=0x7f54f8507000) at ../tbb_2020.3/src/tbb/private_server.cpp:266
#41 0x00007f54f8f07ead in tbb::internal::rml::private_worker::thread_routine (arg=) at ../tbb_2020.3/src/tbb/private_server.cpp:219
#42 0x00007f54f8894ac3 in start_thread (arg=) at ./nptl/pthread_create.c:442
#43 0x00007f54f8926a40 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
```

syclik commented 9 months ago

I'll try to reproduce when I have some free time.

syclik commented 9 months ago

I'm on develop for cmdstan and I haven't been able to reproduce it yet (tried maybe a dozen different seeds). Can you share a seed that causes it to fail, if you have one?

WardBrian commented 9 months ago

This was on develop on both machines I was using, and it happened independent of seed. Both were running Linux, however.