Closed vigna closed 4 years ago
I think both formulae are wrong? If you take a corner case like nL = P
the probability of no overlap, there are P × n! / n = L · n!
states with no overlap of P^n
states total.
For the first unused value, there are n+1
positions it could lie; for the second, there are n+2
etc. (x! / n!
) — but now with (x-n)!
combinations of the x = P - nL
unused values, i.e.
prob(no overlap) = (L · n! / P^n) · x! / (n! · (x-n)!)
= L · x! / ((x-n)! · P^n)
= L · (P - nL)! / (P - n(L + 1))! · P^-n
Where n
is comparatively small, this is approximately:
L · (P - nL) ^ n · P^-n
= L · (1 - nL / P)^n
~Aha, this is approximately your formula!~
Since both formulae are approximated, technically they're both wrong.
Since your approximation is very far from reality (technically: it is asymptotically wrong), probably it would be better to replace it with an approximation closer to reality (technically: one that is asymptotically right).
Does that sound better?
Note that your "formula" L · (1 - nL / P)^n gives values much greater than 1 even for n = 2 (try L=1000, P=1000000).
The formula reported in the book for the probability of overlap:
is slightly off, the actual (approximated) formula is
The off-by-ones are not so relevant, but the spurious factor 2 halves the actual probability.
I also think it would be nice to tell the user that if un^2 is much smaller than p, the formula is very well approximated by
un^2 / p
(see http://prng.di.unimi.it/#remarks) as it is much easier to understand.