oldoc63 / learningDS

Learning DS with Codecademy and Books

Using the Probability Mass Function Over a Range #424

Open oldoc63 opened 1 year ago

oldoc63 commented 1 year ago

We have seen that we can calculate the probability of observing a specific value using a probability mass function. What if we want to find the probability of observing a range of values for a discrete random variable? One way we could do this is by adding up the probability of each value.

For example, let's say we flip a coin 5 times, and want to know the probability of getting between 1 and 3 heads. We can visualize this scenario with the probability mass function:

[Image: probability mass function for the number of heads in 5 coin flips]

oldoc63 commented 1 year ago

We can calculate this using the following equations, where P(X = x) is the probability of observing exactly x successes (heads, in this case):

$$ P(\text{1 to 3 heads}) = P(1 \leq X \leq 3) $$

$$ P(\text{1 to 3 heads}) = P(X=1) + P(X=2) + P(X=3) $$

$$ P(\text{1 to 3 heads}) = 0.1562 + 0.3125 + 0.3125 $$

$$ P(\text{1 to 3 heads}) = 0.7812 $$
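As a quick check of the arithmetic above, here is a minimal sketch that adds the three PMF values in Python. It assumes a fair coin (p = 0.5), and the variable name `prob_1_to_3` is only illustrative.

```python
import scipy.stats as stats

# P(1 <= X <= 3) for 5 flips of a fair coin (p = 0.5 is an assumption):
# add the PMF at 1, 2 and 3 heads.
prob_1_to_3 = (
    stats.binom.pmf(1, n=5, p=0.5)
    + stats.binom.pmf(2, n=5, p=0.5)
    + stats.binom.pmf(3, n=5, p=0.5)
)
print(prob_1_to_3)  # roughly 0.781, in line with the hand calculation
```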

oldoc63 commented 1 year ago

Probability Mass Function Over a Range Using Python

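We can use the same binom.pmf() method from the scipy.stats library to calculate the probability of observing a range of values. As mentioned previously, binom.pmf() takes 3 values: the value of interest (the number of successes), the number of trials n, and the probability of success p on each trial.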

For example, we can calculate the probability of observing between 2 and 4 heads from 10 coin flips as follows:
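The snippet for this step is missing from the thread, so the following is a sketch of what it could look like rather than the original code. It assumes a fair coin (p = 0.5) and treats "between 2 and 4" as inclusive; the variable name is illustrative.

```python
import scipy.stats as stats

# P(2 <= X <= 4) for 10 flips of a fair coin (inclusive range assumed):
# add the PMF at 2, 3 and 4 heads.
prob_2_to_4 = (
    stats.binom.pmf(2, n=10, p=0.5)
    + stats.binom.pmf(3, n=10, p=0.5)
    + stats.binom.pmf(4, n=10, p=0.5)
)
print(prob_2_to_4)  # roughly 0.366
```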

oldoc63 commented 1 year ago

We can also calculate the probability of observing less than a certain value, let's say 3 heads, by adding up the probabilities of the values below it:
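A possible version of this calculation is sketched below. The thread does not state the number of flips for this step, so 10 flips of a fair coin is assumed here to match the previous example.

```python
import scipy.stats as stats

# P(X < 3) for 10 flips of a fair coin (n = 10 and p = 0.5 are assumptions):
# add the PMF at 0, 1 and 2 heads. The value 3 itself is left out.
prob_less_than_3 = (
    stats.binom.pmf(0, n=10, p=0.5)
    + stats.binom.pmf(1, n=10, p=0.5)
    + stats.binom.pmf(2, n=10, p=0.5)
)
print(prob_less_than_3)  # roughly 0.055
```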

oldoc63 commented 1 year ago

Note that because our desired range is fewer than 3 heads, we do not include P(X = 3) in the summation.

When there are many values of interest, adding up the probabilities one by one becomes tedious. If we want to know the probability of observing 8 or fewer heads from 10 coin flips, we need to add up the values from 0 to 8:
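Written out term by term, that sum could look like the sketch below, which assumes a fair coin (p = 0.5) and uses an illustrative variable name.

```python
import scipy.stats as stats

# P(X <= 8) for 10 flips of a fair coin (p = 0.5 is an assumption):
# add the PMF at every value from 0 to 8 heads, one term per value.
prob_8_or_fewer = (
    stats.binom.pmf(0, n=10, p=0.5)
    + stats.binom.pmf(1, n=10, p=0.5)
    + stats.binom.pmf(2, n=10, p=0.5)
    + stats.binom.pmf(3, n=10, p=0.5)
    + stats.binom.pmf(4, n=10, p=0.5)
    + stats.binom.pmf(5, n=10, p=0.5)
    + stats.binom.pmf(6, n=10, p=0.5)
    + stats.binom.pmf(7, n=10, p=0.5)
    + stats.binom.pmf(8, n=10, p=0.5)
)
print(prob_8_or_fewer)  # roughly 0.989
```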

oldoc63 commented 1 year ago

This involves a lot of repetitive code. Instead, we can use the fact that the probabilities of all possible values sum to 1:

$$ P(\text{0 to 8 heads}) + P(\text{9 to 10 heads}) = P(\text{0 to 10 heads}) = 1 $$

$$ P(\text{0 to 8 heads}) = 1 - P(\text{9 to 10 heads}) $$

Now, instead of summing 9 values for the probabilities from 0 to 8 heads, we can subtract the sum of two values from 1 and get the same result:
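A minimal sketch of the complement approach, again assuming a fair coin (p = 0.5) and an illustrative variable name:

```python
import scipy.stats as stats

# P(X <= 8) = 1 - P(X >= 9) for 10 flips of a fair coin (p = 0.5 assumed).
# Only the PMF at 9 and 10 heads is needed.
prob_8_or_fewer_complement = 1 - (
    stats.binom.pmf(9, n=10, p=0.5)
    + stats.binom.pmf(10, n=10, p=0.5)
)
print(prob_8_or_fewer_complement)  # roughly 0.989
```

This should agree with the nine-term sum up to floating-point rounding.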