Closed dhardy closed 1 week ago
You cannot use the continuous test because the CDF is not continuous. It does work if you replace
test_continuous(seed as u64, dist, |k| cdf(k, n, x));
with
test_discrete(seed as u64, dist, |k| cdf(k as f64, n, x));
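To illustrate why the continuous test does not fit here, this is a minimal sketch of a Zipf CDF as a step function. The names `harmonic`, `zipf_cdf` and `jump_at` are hypothetical and are not the `cdf` helper used in the test above; the direct summation is only meant for small `n >= 1`.

```rust
/// Generalized harmonic number H(n, s) = sum over i in 1..=n of i^(-s).
/// Direct summation, so only suitable for small n (illustration only).
fn harmonic(n: u64, s: f64) -> f64 {
    (1..=n).map(|i| (i as f64).powf(-s)).sum()
}

/// Hypothetical Zipf CDF evaluated at a real-valued point k:
/// P(X <= k) for X supported on the integers 1..=n, assuming n >= 1.
fn zipf_cdf(k: f64, n: u64, s: f64) -> f64 {
    if k < 1.0 {
        return 0.0;
    }
    // The CDF only changes at integers, so round down and clamp to n.
    let upper = (k.floor() as u64).min(n);
    harmonic(upper, s) / harmonic(n, s)
}

/// Size of the jump of the CDF at the integer k, which equals P(X = k).
/// A continuous CDF would have a jump of zero everywhere.
fn jump_at(k: u64, n: u64, s: f64) -> f64 {
    zipf_cdf(k as f64, n, s) - zipf_cdf(k as f64 - 0.5, n, s)
}
```

With `n = 10` and `s = 1.0`, `jump_at(1, 10, 1.0)` is `1 / H(10, 1)`, so about a third of the mass sits in a single jump. A test that assumes a continuous CDF has no way to account for that, which is presumably why the discrete variant is needed.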
This should be mostly ready, but do we want to make this change?
Also note: #1517 added a note about casting results to ints being safe as a result of the input bounding the output; the input can now be larger too.
I am not sure if we should do it. I do not have a quantitative proof for it, but I doubt that for values bigger than u64::MAX there is a measurable difference to the Zeta distribution. So if such big values are desired there is already a solution, albeit a less elegant one because the user has to do the case distinction.
I guess most users will use Zipf with smaller integers, but to be honest I never really had a need myself, so this is a very vague guess.
Another point might be the name. If you search for Zipf, you will mostly find the Zeta distribution; scipy has our Zipf as Zipfian. Maybe this would be a better name.
Edit: I guess there is still a difference to Zeta, because even for 2**64 there is significant mass in the tail of the harmonic series. Also, Zeta does not support s=1. So there is definitely a potential use case for this.
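A rough back-of-the-envelope check of the tail claim for s=1, as a sketch only. It assumes the standard asymptotic approximation H_n ≈ ln(n) + γ + 1/(2n); the names `harmonic_approx` and `tail_fraction_s1` are made up for this illustration and do not exist in rand_distr.

```rust
/// Euler-Mascheroni constant γ.
const EULER_GAMMA: f64 = 0.577_215_664_901_532_9;

/// Asymptotic approximation of the harmonic number H_n for large n:
/// H_n ≈ ln(n) + γ + 1/(2n).
fn harmonic_approx(n: f64) -> f64 {
    n.ln() + EULER_GAMMA + 1.0 / (2.0 * n)
}

/// Approximate fraction of the probability mass of a Zipf distribution
/// with exponent s = 1 on 1..=n that lies on ranks above `cut`,
/// i.e. 1 - H_cut / H_n.
fn tail_fraction_s1(cut: f64, n: f64) -> f64 {
    1.0 - harmonic_approx(cut) / harmonic_approx(n)
}
```

Under this approximation, `tail_fraction_s1(2f64.powi(64), 2f64.powi(128))` comes out at roughly one half, and with `n = f64::MAX` well over 90% of the mass lies above 2**64, so truncating at u64::MAX is not negligible when s=1.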
I'll wait for @vks to comment.
Looks good, but I'm a bit confused why there is a test failure.
Which test failure?
Do you have an opinion on the name? Zipf vs Zipfian
CHANGELOG.md entry
Summary
Change the parameter type of Zipf's n to F
Motivation
https://github.com/rust-random/rand/issues/1323#issuecomment-2125324653
Details
The CDF test fails:
The value stability tests (only 4 samples; bottom of zipf.rs) did not fail.