oldoc63 / learningDS

Learning DS with Codecademy and Books

Contingency Tables: Frequencies #409

Open oldoc63 opened 1 year ago

oldoc63 commented 1 year ago

Contingency tables, also known as two-way tables or cross-tabulations, are useful for summarizing two variables at the same time. For example, suppose we are interested in understanding whether there is an association between influence (whether a person thinks they have a talent for influencing people) and leader (whether they see themselves as a leader). We can use the crosstab function from pandas to create a contingency table.
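A minimal sketch of that crosstab call follows. The survey data itself is not included in this thread, so the DataFrame `npi_sample` and its eight rows are hypothetical stand-ins; only the column names influence and leader come from the text above.

```python
import pandas as pd

# Hypothetical toy responses; the real survey data is not shown in this thread.
npi_sample = pd.DataFrame({
    "influence": ["no", "yes", "yes", "no", "yes", "no", "yes", "no"],
    "leader":    ["no", "yes", "yes", "no", "no",  "no", "yes", "yes"],
})

# Rows are the influence responses, columns are the leader responses.
influence_leader_freq = pd.crosstab(npi_sample["influence"], npi_sample["leader"])
print(influence_leader_freq)
```

Each cell counts the people who gave that pair of answers, so the cells add up to the number of rows in the data.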

oldoc63 commented 1 year ago

This table tells us the number of people who gave each possible combination of responses to these two questions. For example, 2360 people said that they do not see themselves as a leader but do have a talent for influencing people.

To assess whether there is an association between these two variables, we need to ask whether information about one variable gives us information about the other. In this example, among people who don't see themselves as a leader (the first column), the larger group (3015, versus 2360) doesn't think they have a talent for influencing people. Meanwhile, among people who do see themselves as a leader (the second column), the larger group (4429) does think they have a talent for influencing people.

So, if we know how someone responded to the leadership question, we have some information about how they are likely to respond to the influence question. This suggests that the variables are associated.
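The column-by-column comparison above can be sketched in code. This is only a rough illustration on hypothetical toy data (the names `responses`, `freq`, and `most_common` are made up here): for each leader column it picks the influence response with the largest count. Differing answers between the columns hint at an association, but this is an informal check, not a statistical test.

```python
import pandas as pd

# Hypothetical toy responses, for illustration only.
responses = pd.DataFrame({
    "influence": ["no", "no", "no", "yes", "yes", "yes", "yes", "no"],
    "leader":    ["no", "no", "no", "no",  "yes", "yes", "yes", "yes"],
})

freq = pd.crosstab(responses["influence"], responses["leader"])

# For each leader column, find the influence response with the largest count.
most_common = freq.idxmax()
print(most_common)

# If the most common influence response differs between the columns,
# knowing the leader response tells us something about influence.
print(most_common["no"] != most_common["yes"])
```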

oldoc63 commented 1 year ago

Do you think there will be an association between special (whether or not a person sees themselves as special) and authority (whether or not a person likes to have authority)? Create a contingency table for these two variables and store the table as especial_authority_freq, then print out the result.
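One possible way to approach this exercise is sketched below. The DataFrame `survey` and its six rows are hypothetical placeholders, since the actual dataset is not part of this thread; with the real data, only the DataFrame would need to be swapped in.

```python
import pandas as pd

# Hypothetical toy responses; replace with the real survey DataFrame.
survey = pd.DataFrame({
    "special":   ["yes", "no", "yes", "no",  "yes", "no"],
    "authority": ["yes", "no", "yes", "yes", "no",  "no"],
})

# Rows are the special responses, columns are the authority responses.
especial_authority_freq = pd.crosstab(survey["special"], survey["authority"])
print(especial_authority_freq)
```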