CamDavidsonPilon / Probabilistic-Programming-and-Bayesian-Methods-for-Hackers

aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)
http://camdavidsonpilon.github.io/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/
MIT License

Overly complicated text message example #69

Open CamDavidsonPilon opened 11 years ago

CamDavidsonPilon commented 11 years ago

From Ghassen HAMROUNI, direct communication

I have a remark concerning the first chapter. The example "Inferring behaviour from text-message data" is certainly interesting, but I think it is over-complicated. The reader has just learned about the Poisson process and is expecting a simple application that infers the value of a constant λ. Instead, you introduce an inhomogeneous Poisson process in which λ(t) is a time-dependent function!

alexgarel commented 11 years ago

I didn't find it too complicated myself, and I think it's quite a satisfying example that shows the power of Bayesian tools.

CamDavidsonPilon commented 11 years ago

I agree with @alexgarel. It is a small mental step from a model with one lambda parameter (which is trivial) to a model with two parameters (which is very difficult with frequentist methods, but simple here). I'll wait for another opinion or suggestion before I close this.

xcthulhu commented 11 years ago

Well, I agree that it's too complicated. Text messaging is not really governed by Poisson processes. Wouldn't it make more sense to have data driven by a simple, unchanging Poisson process, and then use Bayesian inference to estimate lambda?

How about looking at photon arrival counts per minute for the Chandra space telescope (as I have suggested elsewhere)? These are well known to follow simple Poisson statistics. The Crab Nebula might be a good candidate, since Chandra uses it for calibration:

[Image: Crab Nebula]

Deep field observations are even better, since they are much fainter.
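As a rough illustration of what "estimate lambda for a simple, unchanging Poisson process" could look like, here is a minimal sketch using a conjugate Gamma prior, so the posterior is available in closed form without any sampler. The data are simulated, not real Chandra counts; `true_lambda`, the prior parameters, and both helper functions are hypothetical names chosen for this example.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def gamma_poisson_update(counts, alpha=1.0, beta=1.0):
    """Update a Gamma(shape=alpha, rate=beta) prior on lambda with Poisson counts.

    Conjugacy gives the posterior Gamma(alpha + sum(counts), beta + len(counts)).
    """
    return alpha + sum(counts), beta + len(counts)


# Simulated "photons per minute" data; the rate is an assumption for the demo.
rng = random.Random(0)
true_lambda = 4.0
counts = [sample_poisson(true_lambda, rng) for _ in range(200)]

post_alpha, post_beta = gamma_poisson_update(counts)
posterior_mean = post_alpha / post_beta
posterior_sd = math.sqrt(post_alpha) / post_beta

print(f"posterior mean of lambda: {posterior_mean:.3f}")
print(f"posterior sd of lambda:   {posterior_sd:.3f}")
```

With a few hundred observations the posterior concentrates near the rate used to simulate the data, which is what makes this a gentle warm-up case.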

CamDavidsonPilon commented 11 years ago

Lovely picture. Estimating a single lambda is a trivial exercise. Estimating multiple lambdas with an unknown switchpoint is a difficult frequentist procedure, but simple in probabilistic programming, which is why I used it as a first example.
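To make the two-lambda switchpoint idea concrete, here is a hedged sketch that is not the book's implementation: instead of MCMC it uses conjugate Gamma priors on both rates, integrates the rates out analytically, and enumerates every candidate switchpoint under a uniform prior. The data are simulated, and the prior parameters `a` and `b`, the rates, and `true_tau` are assumptions made up for the example.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def log_marginal(total, n, a, b):
    """Log marginal likelihood of n Poisson counts summing to `total`,
    with the rate integrated out under a Gamma(shape=a, rate=b) prior.
    The factorial terms are dropped because they do not depend on tau."""
    return (a * math.log(b) - math.lgamma(a)
            + math.lgamma(a + total) - (a + total) * math.log(b + n))


def switchpoint_posterior(counts, a=1.0, b=0.1):
    """Exact posterior over tau, where counts[:tau] share one rate and
    counts[tau:] share another. Uniform prior on tau in 1..len(counts)-1."""
    n = len(counts)
    taus = range(1, n)
    log_post = [
        log_marginal(sum(counts[:tau]), tau, a, b)
        + log_marginal(sum(counts[tau:]), n - tau, a, b)
        for tau in taus
    ]
    peak = max(log_post)  # subtract the max before exponentiating for stability
    weights = [math.exp(lp - peak) for lp in log_post]
    z = sum(weights)
    return {tau: w / z for tau, w in zip(taus, weights)}


# Simulated daily counts whose rate jumps at day `true_tau` (all values assumed).
rng = random.Random(1)
true_tau = 40
counts = [sample_poisson(15.0 if day < true_tau else 25.0, rng) for day in range(70)]

posterior = switchpoint_posterior(counts)
map_tau = max(posterior, key=posterior.get)
print(f"most probable switchpoint: day {map_tau}")
```

Enumeration only works here because the switchpoint is discrete and the rates can be integrated out; a sampler-based approach generalises to models where neither shortcut applies.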

CamDavidsonPilon commented 11 years ago

Perhaps I can put a smaller example, like the nebula example, before the text message example.

xcthulhu commented 11 years ago

Sounds good; that way the reader gets warmed up with something easy/well behaved before encountering a far more complicated model.

It wasn't clear that you wanted a difficult-for-frequentist-statistics example. Here are a few others that are easy probabilistic programming activities, but hard for frequentist analysis: