CamDavidsonPilon / Probabilistic-Programming-and-Bayesian-Methods-for-Hackers

aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)
http://camdavidsonpilon.github.io/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/
MIT License

Chapter 4: Notable Failures of the Law of Large Numbers #81

Open xcthulhu opened 11 years ago

xcthulhu commented 11 years ago

While the Law of Large Numbers is a profoundly important result, I, like many statisticians, have come to doubt its universal applicability.

> This Law holds for any distribution, minus some pathological examples that only mathematicians have fun with.

I take issue with this assertion. Two important failures off the top of my head:

  1. Fat-tailed distributions - These distributions frequently fail a critical assumption underlying the law of large numbers - namely, that

    $$\sum_{n=1}^\infty \frac{Var(X_n)}{n^2} < \infty$$

    One example that fails this is the Pareto distribution with $\alpha \in (1,2]$ (which has a divergent variance). This happens _all the time_ - here's a typical example from Terry Tao's blog (2009). Likewise, fat-tailed distributions that do satisfy the (weak) law of large numbers converge to the predicted limit only slowly, which makes the principle ineffective in practice (see Weron et al., International Journal of Modern Physics C (2001)).

  2. Flicker Noise (aka 1/f Noise) - Flicker noise is another ubiquitous phenomenon that fails the law of large numbers. Example: A perfect hourglass has an average clock drift of 0 seconds. However, clock drift is governed by flicker noise, which fails to satisfy the assumptions behind the (weak) law of large numbers. As a consequence, no matter how many hourglasses you average over, you can't be more accurate than an atomic clock. This example is from Bill Press (of Numerical Recipes fame), who wrote something of a classic on this subject: Flicker Noises in Astronomy and Elsewhere (1978).
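
The slow convergence described in the first item can be sketched with a small simulation. This is only an illustration under stated assumptions: the tail index `ALPHA = 1.5`, the sample size, the number of repetitions, and the exponential distribution used as a thin-tailed comparison are all choices made here, not taken from the thread. It compares how far the sample mean lands from the true mean for i.i.d. Pareto draws versus exponential draws with the same mean.

```python
import random
import statistics


def sample_mean_errors(draw, true_mean, n, reps, rng):
    """Absolute error of the sample mean in each of `reps` independent runs."""
    errors = []
    for _ in range(reps):
        total = sum(draw(rng) for _ in range(n))
        errors.append(abs(total / n - true_mean))
    return errors


# Assumed for illustration: tail index in (1, 2], so the mean is finite
# but the variance diverges.
ALPHA = 1.5
PARETO_MEAN = ALPHA / (ALPHA - 1)  # mean of a Pareto with minimum value 1

rng = random.Random(0)  # fixed seed so the run is repeatable

pareto_err = sample_mean_errors(
    lambda r: r.paretovariate(ALPHA), PARETO_MEAN, 2000, 100, rng
)
# Thin-tailed comparison: exponential with the same mean.
expo_err = sample_mean_errors(
    lambda r: r.expovariate(1 / PARETO_MEAN), PARETO_MEAN, 2000, 100, rng
)

print("median |error|, Pareto:     ", statistics.median(pareto_err))
print("median |error|, exponential:", statistics.median(expo_err))
print("worst  |error|, Pareto:     ", max(pareto_err))
print("worst  |error|, exponential:", max(expo_err))
```

With the same number of draws, the Pareto sample means should sit noticeably further from the true mean, and the worst case is driven by a handful of extreme values.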

Of course, as you point out in your chapter, one can hold the Law of Large Numbers at arm's length with Bayesian analysis.

CamDavidsonPilon commented 11 years ago

> one can hold the Law of Large Numbers at arm's length with Bayesian analysis.

Well put. You're correct, I do discard relatively common occurrences (Pareto, Zipf) that may fail the LLN. What I should do is explore an example involving a fat-tailed distribution (I was planning on this for a later chapter, but Chapter 4 might be a better place), and show off the median statistic vs. the mean statistic.

I'll retract the statement about humoring mathematicians ;)
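
A rough sketch of the median-versus-mean comparison suggested above, not the example planned for the book. It assumes a Pareto distribution with minimum value 1 and tail index `ALPHA = 1.5` (a value picked here for illustration), for which the true mean is `ALPHA / (ALPHA - 1)` and the true median is `2 ** (1 / ALPHA)`. Each repetition draws a fresh sample and records the relative error of both statistics against their own targets.

```python
import random
import statistics

# Assumed for illustration: Pareto with minimum value 1 and tail index 1.5.
ALPHA = 1.5
TRUE_MEAN = ALPHA / (ALPHA - 1)   # 3.0
TRUE_MEDIAN = 2 ** (1 / ALPHA)    # about 1.587

rng = random.Random(1)  # fixed seed so the run is repeatable

mean_rel_err = []
median_rel_err = []
for _ in range(100):
    xs = [rng.paretovariate(ALPHA) for _ in range(2001)]
    sample_mean = sum(xs) / len(xs)
    sample_median = statistics.median(xs)
    # Relative error of each statistic against the quantity it estimates.
    mean_rel_err.append(abs(sample_mean - TRUE_MEAN) / TRUE_MEAN)
    median_rel_err.append(abs(sample_median - TRUE_MEDIAN) / TRUE_MEDIAN)

print("typical relative error of the mean:  ", statistics.median(mean_rel_err))
print("typical relative error of the median:", statistics.median(median_rel_err))
```

The two statistics estimate different quantities, so the comparison is about stability, not about which number is "right": the sample median barely reacts to the extreme draws that dominate the sample mean.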

rsvp commented 7 years ago

Many cases of leptokurtic ("fat-tailed") distributions, except truly pathological infinite-variance Lévy distributions ("humoring mathematicians"), can be resolved by using Gaussian mixture distributions. The intuitive idea is that each observation is drawn from one of the Gaussian distributions in the mixture, and each of those distributions has some probability of being the one selected. The details are here: https://git.io/gmix

The statistical problem is the estimation of the GM(n) parameters. For small sample sizes, an illustration of Bayesian estimation techniques would be wonderful. This will turn out to be an exercise in handling extreme data values, which are frequently mistaken for outliers. When robustness is desired for estimating centrality, the median is easily demonstrated to be appropriate.
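
To make the mixture idea concrete, here is a minimal sketch of a two-component, zero-mean Gaussian mixture. The weights and standard deviations are hypothetical values chosen for illustration and are not taken from the linked notes; parameter estimation, Bayesian or otherwise, is not attempted. For a zero-mean scale mixture the kurtosis is $3\,E[\sigma^4] / (E[\sigma^2])^2$, which exceeds 3 whenever the components have different variances, and the code checks a simulated sample against that value.

```python
import random

# Hypothetical GM(2) parameters, chosen only for illustration.
WEIGHTS = [0.9, 0.1]   # probability of selecting each component
SIGMAS = [1.0, 3.0]    # standard deviation of each zero-mean component


def draw_mixture(rng):
    """Pick a component by its weight, then draw from that Gaussian."""
    sigma = rng.choices(SIGMAS, weights=WEIGHTS)[0]
    return rng.gauss(0.0, sigma)


def excess_kurtosis(xs):
    """Sample excess kurtosis (0 for a Gaussian) from central moments."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2 - 3.0


# Theoretical excess kurtosis of the zero-mean scale mixture.
e_s2 = sum(w * s ** 2 for w, s in zip(WEIGHTS, SIGMAS))
e_s4 = sum(w * s ** 4 for w, s in zip(WEIGHTS, SIGMAS))
theory_excess = 3.0 * e_s4 / e_s2 ** 2 - 3.0

rng = random.Random(2)  # fixed seed so the run is repeatable
sample = [draw_mixture(rng) for _ in range(100_000)]
sample_excess = excess_kurtosis(sample)

print("theoretical excess kurtosis:", theory_excess)
print("sample excess kurtosis:     ", sample_excess)
```

Both components are Gaussian with finite variance, yet the mixture has clearly positive excess kurtosis, which is the sense in which a mixture can stand in for a fat-tailed distribution.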