AllenDowney / ThinkBayes2

Text and code for the forthcoming second edition of Think Bayes, by Allen Downey.
http://allendowney.github.io/ThinkBayes2/
MIT License

What is considered to be the "data" in the M&M solution? (Ch 2: Bayes's Theorem) #55

rigdern closed this issue 1 year ago

rigdern commented 2 years ago

The problem hint indicates the trick is defining the hypotheses and data carefully:

Hint: The trick to this question is to define the hypotheses and the data carefully.

The solution clearly states the "hypotheses":

# Hypotheses:
# A: yellow from 94, green from 96
# B: yellow from 96, green from 94

What is the "data" in this solution?


My guess is that the data is the color mixes in the bags (e.g. the 1994 bag has 30% brown, the 1996 bag has 24% blue, and so on).

With this definition of the data, the first likelihood would be "the probability of the color mix given that yellow is from 94 and green is from 96". It's not clear to me why that probability would be 0.2*0.2. Maybe I need to think about it more, or maybe I've guessed wrong about what the data is.

HelloWorld183L commented 2 years ago

I don't have the most thorough understanding of this myself, but I think the data refers to the color mixes, as you said. Probability trees might give a more intuitive picture of the multiplication taking place. Which part of the probability being 0.2*0.2 isn't clear?
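
One possible way to read the 0.2*0.2 product, written out as a short derivation. This is a sketch under two assumptions that are not stated in the thread: the two M&Ms are drawn independently, one from each bag, and the mixes are those in the book's problem statement (20% yellow in the 1994 bag, 20% green in the 1996 bag). The symbol D is introduced here for "one yellow and one green were drawn".

```latex
P(D \mid A)
  = P(\text{yellow} \mid \text{1994 bag}) \cdot P(\text{green} \mid \text{1996 bag})
  = 0.2 \times 0.2
  = 0.04
```

Each factor is one branch of a probability tree, and independence of the two draws is what allows the branches to be multiplied.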

AllenDowney commented 1 year ago

I would say that the mixes in the bags are background information, and the data is the color of the two M&Ms that were drawn. But the line between data and background information can be fuzzy. The important part is that you can compute the likelihood of the outcome under each hypothesis.
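
A minimal sketch of that computation, treating the color mixes as background information and the two observed colors as the data. The mix values are assumed from the book's problem statement (only the 30% brown and 24% blue figures appear in this thread), the equal priors are an assumption, and the variable names are illustrative rather than taken from the book's solution.

```python
from fractions import Fraction

# Background information: color mixes per bag (assumed from the book's
# problem statement; only the two colors that were drawn are listed).
mix94 = {"yellow": Fraction(20, 100), "green": Fraction(10, 100)}
mix96 = {"yellow": Fraction(14, 100), "green": Fraction(20, 100)}

# Hypotheses, as in the solution:
# A: yellow from 94, green from 96
# B: yellow from 96, green from 94
priors = {"A": Fraction(1, 2), "B": Fraction(1, 2)}  # assumed equal

# Data: one yellow and one green M&M, one from each bag.
# Likelihood of the data under each hypothesis (draws assumed independent).
likelihoods = {
    "A": mix94["yellow"] * mix96["green"],
    "B": mix96["yellow"] * mix94["green"],
}

# Bayes's theorem: multiply prior by likelihood, then normalize.
unnorm = {h: priors[h] * likelihoods[h] for h in priors}
total = sum(unnorm.values())
posteriors = {h: unnorm[h] / total for h in priors}

for h in sorted(posteriors):
    print(h, posteriors[h], float(posteriors[h]))
```

Under these assumptions the likelihood for A is the 0.2*0.2 from the solution, and the posterior for A works out to 20/27. Moving the mixes from "background" into "data" would not change the arithmetic, only the labels.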