Change default RNG to boost::mixmax

WardBrian commented 9 months ago

Submission Checklist

[x] Run unit tests: ./runTests.py src/test/unit
[x] Run cpplint: make cpplint
[x] Declare copyright holder and open-source license: see below

Summary

Closes #3256. This switches the pRNG used by default in the services and tests to be boost::mixmax, as recommended by the boost maintainers: https://github.com/boostorg/random/issues/92

Intended Effect

Resolve both https://github.com/stan-dev/stan/issues/3167 and https://github.com/stan-dev/cmdstan/issues/1241, and avoid other issues caused by RNG quality.

How to Verify

Side Effects

A given seed will produce a different random stream from this version on than it did in previous versions. We should note this in the release notes.

Additionally, by changing the type alias of rng_t introduced in #3263, this PR changes the ABI of the model base, so any code that calls it directly will need to be updated to match. I believe this is most obviously relevant for things which use the standalone-functions feature of stanc, like cmdstanr and rstan. @andrjohns

Documentation

Copyright and Licensing

Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): Simons Foundation

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

WardBrian commented 9 months ago

This caused a couple models to change results in the CmdStan Performance Tests:

stat_comp_benchmarks/benchmarks/low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse: This model seems to be intentionally misspecified? The settings we are comparing against yield fewer than 10 effective samples and Rhats of 1.7, 1.8, 1.4, and 1.2, and the results seem to be very highly seed dependent with the current RNG, so I'm not surprised?
stat_comp_benchmarks/benchmarks/gp_regr/gen_gp_data: Unlike the above, this seems to sample fine (ESS ~1000, Rhat 1.0), but it also gets different answers depending on the seed with the current RNG.

I don't know enough about the history of the project and how these models and settings were picked, but so far I've yet to encounter them flagging something that wasn't a false alarm.

SteveBronder commented 9 months ago

Your results actually match up better than the current tests, so I don't think this is an issue

FAIL: golds/stat_comp_benchmarks_benchmarks_low_dim_gauss_mix_collapse_low_dim_gauss_mix_collapse.gold param mu.1 not within (-0.301108939227 - -0.70400742885) / 0.620216676511786 < 0.3
FAIL: golds/stat_comp_benchmarks_benchmarks_low_dim_gauss_mix_collapse_low_dim_gauss_mix_collapse.gold param mu.2 not within (-0.24229668616 - 0.15116454648) / 0.635178602376801 < 0.3
FAIL: golds/stat_comp_benchmarks_benchmarks_low_dim_gauss_mix_collapse_low_dim_gauss_mix_collapse.gold param sigma.2 not within (1.033877479 - 1.096754202) / 0.154013580545209 < 0.3
'stat_comp_benchmarks/benchmarks/gp_regr/gen_gp_data.stan' had fails '[('k.9', 68.023, 509.71930927251054, 350.409)]' and errors '[]'
'stat_comp_benchmarks/benchmarks/low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan' had fails '[('mu.1', -0.70400742885, 0.620216676511786, -0.301108939227), ('mu.2', 0.15116454648, 0.635178602376801, -0.24229668616), ('sigma.2', 1.096754202, 0.154013580545209, 1.033877479)]' and errors '[]'

mu[1] -2.8079e-01
mu[2] -2.6932e-01
sigma[1] 1.0254e+00
sigma[2] 1.0299e+00
theta 4.9645e-01

gen_gp_data is a weird one to fail, but I don't think that's an issue
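
For readers following the FAIL lines above: each one reports a check of the form (a - b) / sd < 0.3. Below is a minimal sketch of that comparison in Python. It assumes the absolute value of the difference is taken (the log message does not show this), and the names within_tolerance and checks are hypothetical, not taken from the performance test script. The numbers are copied from the log.

```python
def within_tolerance(a, b, sd, threshold=0.3):
    """Return True when |a - b| / sd is below the threshold.

    Assumption: the real check uses the absolute difference; the
    FAIL message itself only prints (a - b) / sd < 0.3.
    """
    return abs(a - b) / sd < threshold


# (a, b, sd) triples copied from the FAIL lines in the log above.
checks = {
    "mu.1": (-0.301108939227, -0.70400742885, 0.620216676511786),
    "mu.2": (-0.24229668616, 0.15116454648, 0.635178602376801),
    "sigma.2": (1.033877479, 1.096754202, 0.154013580545209),
}

results = {name: within_tolerance(*vals) for name, vals in checks.items()}

for name, ok in results.items():
    print(name, "ok" if ok else "FAIL")
```

Under that assumption all three parameters land outside the 0.3 threshold (ratios of roughly 0.65, 0.62, and 0.41), which agrees with the failures reported in the log.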
