fivethirtyeight / data

Data and code behind the articles and graphics at FiveThirtyEight
https://data.fivethirtyeight.com/
Creative Commons Attribution 4.0 International

The description of the bachelorette data does not match data #210

Closed SaskiaFreytag closed 1 year ago

SaskiaFreytag commented 5 years ago

When I load the bachelorette data, I see categories 2, 3, ..., 15 in the columns titled elimination_1, elimination_2, ..., dates_10. This does not match the documentation, which says I should see R, E, EQ, ..., D1, and so on.

asewnath commented 5 years ago

The columns titled ELIMINATION_1 through ELIMINATION_10 represent each week of eliminations on the show, and each row represents a contestant. Within the elimination columns, a non-empty field holds either “E” for a standard elimination, “E” accompanied by other letters that indicate how the contestant left the show, or “R”, denoting that they received a rose. I find the rose data a bit confusing, because technically everyone who stays in the competition receives a rose each week, and that isn’t conveyed in the elimination data. Also, some seasons don’t have this rose data. If the “R” was meant for date roses, it should have been described in the date columns. An asterisk on someone’s name can imply the first impression rose, since there is no real date associated with that.
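As a rough illustration of the elimination codes described above, here is a minimal Python sketch that sorts a single cell into one of those cases. The function name `classify_elimination`, the returned labels, and the sample row are all made up for this example and are not part of the dataset or its README; the column names follow the spelling used in this comment.

```python
def classify_elimination(cell: str) -> str:
    """Classify one elimination cell using the codes described in this thread.

    The returned labels are hypothetical names chosen for this sketch.
    """
    code = cell.strip().upper()
    if code == "":
        return "no_entry"          # empty field: nothing recorded that week
    if code == "R":
        return "rose"              # contestant received a rose
    if code == "E":
        return "eliminated"        # standard elimination
    if code.startswith("E"):
        return "eliminated_other"  # "E" plus other letters, e.g. "EQ"
    return "unknown"               # anything not covered by the description


# Invented sample row, for illustration only (not real data).
sample_row = {"ELIMINATION_1": "", "ELIMINATION_2": "R", "ELIMINATION_3": "EQ"}
labels = {column: classify_elimination(value) for column, value in sample_row.items()}
print(labels)
```

If the cells show up as integers such as 2, 3, ..., 15 instead of these letter codes, every cell would fall into the `unknown` branch, which is one quick way to confirm the mismatch reported in this issue.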

Similar to the elimination columns, the DATES-1 through DATES-10 columns represent each week of dates. An empty field means the contestant had no date that week, while a “D” followed by a number gives the number of people who were on the date. The types of eliminations and dates are listed in the dataset’s README. Hope this helps!
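For the date columns, a small Python sketch along the same lines could pull the group size out of a cell. It assumes a date cell is either empty or a “D” followed by digits, as described above; the helper name `date_size` is hypothetical, and any other value is treated as unrecognized rather than guessed at.

```python
import re
from typing import Optional

# Assumed cell format: the letter "D" followed by one or more digits.
_DATE_CODE = re.compile(r"D(\d+)")


def date_size(cell: str) -> Optional[int]:
    """Return the number of people on the date, or None if the cell is
    empty or does not look like a "D<number>" code."""
    match = _DATE_CODE.fullmatch(cell.strip().upper())
    return int(match.group(1)) if match else None


# Invented sample cells, for illustration only (not real data).
for cell in ["D1", "D10", ""]:
    print(repr(cell), "->", date_size(cell))
```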