Closed WardBrian closed 9 months ago
This caused a couple models to change results in the CmdStan Performance Tests:
stat_comp_benchmarks/benchmarks/low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse
This model seems to be intentionally misspecified? The settings we are comparing against yield fewer than 10 effective samples and Rhats of 1.7, 1.8, 1.4, and 1.2, and seem to be very highly seed dependent with the current RNG, so I'm not surprised.
stat_comp_benchmarks/benchmarks/gp_regr/gen_gp_data
Unlike the above, this seems to sample fine (ESS ~1000, Rhat 1.0), but it also gets different answers depending on the seed with the current RNG.
I don't know enough about the history of the project and how these models and settings were picked, but so far I've yet to encounter them flagging something that wasn't a false alarm.
Your results actually match up better than the current tests, so I don't think this is an issue.
FAIL: golds/stat_comp_benchmarks_benchmarks_low_dim_gauss_mix_collapse_low_dim_gauss_mix_collapse.gold param mu.1 not within (-0.301108939227 - -0.70400742885) / 0.620216676511786 < 0.3
FAIL: golds/stat_comp_benchmarks_benchmarks_low_dim_gauss_mix_collapse_low_dim_gauss_mix_collapse.gold param mu.2 not within (-0.24229668616 - 0.15116454648) / 0.635178602376801 < 0.3
FAIL: golds/stat_comp_benchmarks_benchmarks_low_dim_gauss_mix_collapse_low_dim_gauss_mix_collapse.gold param sigma.2 not within (1.033877479 - 1.096754202) / 0.154013580545209 < 0.3
'stat_comp_benchmarks/benchmarks/gp_regr/gen_gp_data.stan' had fails '[('k.9', 68.023, 509.71930927251054, 350.409)]' and errors '[]'
'stat_comp_benchmarks/benchmarks/low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan' had fails '[('mu.1', -0.70400742885, 0.620216676511786, -0.301108939227), ('mu.2', 0.15116454648, 0.635178602376801, -0.24229668616), ('sigma.2', 1.096754202, 0.154013580545209, 1.033877479)]' and errors '[]'
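For readers unfamiliar with these logs, the check behind the FAIL lines can be reproduced by hand. The following is a minimal sketch, not the performance-test script itself: it assumes the tuples above are ordered (name, reference mean, standard deviation, observed mean), and that the comparison is on the absolute value of the scaled difference, since mu.2 fails with a negative difference. The function names are hypothetical.

```python
# Tuples copied from the failure summaries above. The field order
# (name, reference mean, standard deviation, observed mean) is an assumption.
FAILS = [
    ("mu.1", -0.70400742885, 0.620216676511786, -0.301108939227),
    ("mu.2", 0.15116454648, 0.635178602376801, -0.24229668616),
    ("sigma.2", 1.096754202, 0.154013580545209, 1.033877479),
    ("k.9", 68.023, 509.71930927251054, 350.409),
]

TOLERANCE = 0.3  # threshold printed in the FAIL lines


def scaled_difference(reference, stdev, observed):
    """Difference between the two means, in units of the standard deviation."""
    return (observed - reference) / stdev


def within_tolerance(reference, stdev, observed, tolerance=TOLERANCE):
    """True when the scaled difference is smaller in magnitude than the tolerance."""
    return abs(scaled_difference(reference, stdev, observed)) < tolerance


for name, reference, stdev, observed in FAILS:
    z = scaled_difference(reference, stdev, observed)
    status = "ok" if within_tolerance(reference, stdev, observed) else "FAIL"
    print(f"{name}: scaled difference {z:+.3f} ({status})")
```

Under these assumptions all four parameters land between roughly 0.4 and 0.65 standard deviations from the reference, so each one exceeds the 0.3 threshold.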
mu[1] -2.8079e-01
mu[2] -2.6932e-01
sigma[1] 1.0254e+00
sigma[2] 1.0299e+00
theta 4.9645e-01
gen_gp_data is a weird one to fail, but I don't think that's an issue.
Submission Checklist
./runTests.py src/test/unit
make cpplint
Summary
Closes #3256. This switches the pRNG used by default in the services and tests to be boost::mixmax, as recommended by the boost maintainers: https://github.com/boostorg/random/issues/92
Intended Effect
Resolve both https://github.com/stan-dev/stan/issues/3167 and https://github.com/stan-dev/cmdstan/issues/1241, avoid other issues with RNG quality.
How to Verify
Side Effects
Seeds used in previous versions will have different meanings from this version on. We should note this in the release notes.
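To illustrate why an old seed no longer reproduces old draws, here is a toy sketch in Python. It does not use Stan or Boost: it compares Python's built-in Mersenne Twister with a hypothetical hand-written linear congruential generator (`ToyLCG`, invented for this example). The point is only that a seed is an input to a specific algorithm, so changing the algorithm changes the stream.

```python
import random


class ToyLCG:
    """Hypothetical 64-bit linear congruential generator, for illustration only."""

    def __init__(self, seed):
        self.state = seed % 2**64

    def random(self):
        # Multiplier and increment from Knuth's MMIX generator.
        self.state = (6364136223846793005 * self.state + 1442695040888963407) % 2**64
        # Keep the top 53 bits and map them to a float in [0, 1).
        return (self.state >> 11) / 2**53


def draws(rng, n=3):
    """Return the first n uniform draws from a generator."""
    return [rng.random() for _ in range(n)]


SEED = 1234
mersenne = draws(random.Random(SEED))  # Python's built-in Mersenne Twister
toy = draws(ToyLCG(SEED))              # same seed, different algorithm

print("same seed, same generator:     ", draws(random.Random(SEED)) == mersenne)
print("same seed, different generator:", toy == mersenne)
```

Each generator is reproducible on its own, but the two streams differ, which is the situation users will see when they reuse a seed from an earlier release.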
Additionally, by changing the type alias of rng_t introduced in #3263, this PR changes the ABI of the model base, so code which calls it directly will need to be aware. This is most obviously relevant for things which use the standalone-functions feature of stanc like cmdstanr and rstan, I believe. @andrjohns
Documentation
Copyright and Licensing
Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): Simons Foundation
By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses: