oldoc63 / learningDS

Learning DS with Codecademy and Books
0 stars 0 forks source link

Value Proportions for Categorical Data #397

Open oldoc63 opened 1 year ago

oldoc63 commented 1 year ago

A counts table is one approach for exploring categorical variables, but sometimes it is useful to also look at the proportion of values in each category. For example, knowing that there are 3,539 rental listings in Manhattan is hard to interpret without any context about the counts in the other categories. On the other hand, knowing that Manhattan listings make up 71% of all New York City listings tells us a lot more about the relative frequency of this category.

We can calculate the proportion for each category by dividing its count by the total number of values for that variable:

oldoc63 commented 1 year ago

Alternatively, we could also obtain the proportions by specifying normalize=True to the .value_counts() method:

oldoc63 commented 1 year ago

Using the movies DataFrame, find the proportion of movies in each genre and save them to a variable called genre_props. Print genre_props to see the result.