rkneusel9 / MathForDeepLearning

Source code for the book "Math for Deep Learning" (No Starch Press)
MIT License

Logical bug in Chapter 2 / Boston code #5

Open ikimmit opened 6 months ago

ikimmit commented 6 months ago

Hi! I'm loving the book, but found a logical error in one of the exercises, namely the Boston problem of Chapter 2.

This line will occasionally sample the same person more than once, because randint draws with replacement:

s = np.random.randint(0,50,3)

A way to fix it is using this instead:

s = np.random.choice(50, 3, replace=False)
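To put a number on "occasionally", here is a minimal sketch that estimates how often three draws with replacement from 50 contain a repeated index and compares that with the exact value. The seed, the trial count, and the variable names are my own choices for illustration and are not taken from boston.py.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for this illustration

# 200,000 draws of three indices WITH replacement, the behaviour of
# np.random.randint(0, 50, 3) (trial count is an arbitrary assumption)
draws = rng.integers(0, 50, size=(200_000, 3))

# After sorting each row, any repeated index shows up as equal neighbours
ordered = np.sort(draws, axis=1)
has_repeat = (ordered[:, 0] == ordered[:, 1]) | (ordered[:, 1] == ordered[:, 2])
observed = has_repeat.mean()

# Exact chance that at least two of the three indices coincide
exact = 1 - (50 * 49 * 48) / 50**3

# Sampling without replacement always returns three distinct indices
distinct = rng.choice(50, 3, replace=False)

print("repeat rate: observed %0.4f, exact %0.4f" % (observed, exact))
print("distinct sample:", distinct)
```

So roughly 6% of the original draws are not valid groups of three different people.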

You then comment on the results:

No Boston in the fall = 0.7780  
which is close enough to our calculated value to give us confidence that we found the correct answer.

But that value is actually a telltale sign of the underlying bug. I tried the fix, and the values converge to 0.7745, as they should.
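The two target values can be checked exactly. This is a sketch that assumes, as the test code in this thread implies, that 4 of the 50 people are the ones to avoid and that 3 people are picked; the variable names are hypothetical.

```python
from fractions import Fraction
from math import comb

people, flagged, picks = 50, 4, 3  # assumed from the s[t] < 4 check in this thread

# Without replacement: choose all 3 from the 46 unflagged people
without_repl = Fraction(comb(people - flagged, picks), comb(people, picks))

# With replacement: three independent draws, each unflagged with probability 46/50
with_repl = Fraction(people - flagged, people) ** picks

print("without replacement = %0.4f" % float(without_repl))  # 0.7745
print("with replacement    = %0.4f" % float(with_repl))     # 0.7787
```

A long simulation that settles above 0.7745 is therefore what sampling with replacement would be expected to produce.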

This is my full code for testing (using NumPy's newer Generator-based RNG API):

import numpy as np

rng = np.random.default_rng()
nb = 0  # number of trials in which no picked index is below 4
N = 50000000
for i in range(N):
    # pick three distinct people out of 50
    s = rng.choice(50, 3, replace=False)
    fail = False
    for t in range(3):
        if s[t] < 4:
            fail = True
    if not fail:
        nb += 1
print("No Boston in the fall = %0.4f" % (nb / N,))

rkneusel9 commented 6 months ago

Good catch! Yes, resampling the same person was skewing the simulation results. The code in boston.py has been updated. Twenty runs produce 0.77425 +/- 0.00033 (mean +/- SE), in line with the calculated probability of 0.7745.
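For anyone who wants to reproduce a mean +/- SE figure like this, here is a rough vectorized sketch. It is not the updated boston.py: the helper name no_boston_fraction and the per-run trial count are assumptions, and the standard error is taken as the sample standard deviation of the run estimates divided by the square root of the number of runs.

```python
import numpy as np

rng = np.random.default_rng()

def no_boston_fraction(trials):
    """Fraction of trials whose three distinct picks all have index >= 4."""
    # Ranking 50 uniform random numbers per row gives a random permutation
    # of 0..49; its first three entries are three distinct people.
    picks = rng.random((trials, 50)).argsort(axis=1)[:, :3]
    return np.mean((picks >= 4).all(axis=1))

runs = 20        # matches the "twenty runs" mentioned above
trials = 50_000  # assumed; the per-run trial count is not stated in the thread

estimates = np.array([no_boston_fraction(trials) for _ in range(runs)])
mean = estimates.mean()
se = estimates.std(ddof=1) / np.sqrt(runs)  # standard error of the mean

print("%0.5f +/- %0.5f (mean +/- SE)" % (mean, se))
```

The printed values vary from run to run since no seed is fixed, but the mean should land near 0.7745.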