rudeboybert / fivethirtyeight

R package of data and code behind the stories and interactives at FiveThirtyEight
https://fivethirtyeight-r.netlify.app/

[ratings dataset] Ambiguity in the "category" column description. #35

Open OmaymaS opened 5 years ago

OmaymaS commented 5 years ago

I think the description of the category column in the ratings dataset might be ambiguous.

> levels(ratings$category)
 [1] "Aged 18-29"         "Aged 30-44"         "Aged 45+"           "Aged under 18"      "Females"           
 [6] "Females Aged 18-29" "Females Aged 30-44" "Females Aged 45+"   "Females under 18"   "IMDb staff"        
[11] "IMDb users"         "Males"              "Males Aged 18-29"   "Males Aged 30-44"   "Males Aged 45+"    
[16] "Males under 18"     "Non-US users"       "Top 1000 voters"    "US users"    

Because there could be questions like:

I checked an example on IMDb, but I am not sure how the numbers add up in the dataset.


rudeboybert commented 5 years ago

Hey @OmaymaS, thanks for the heads up. fivethirtyeight::ratings is simply a repackaging of the original data shared by 538 on their GitHub data page here, so I think it would make more sense to get this addressed upstream first, and then update the downstream package accordingly. Could you create an issue on fivethirtyeight/data and tag me @rudeboybert?