anvaka / map-of-reddit-data

Contains scripts and data to render map of reddit

Remove bias in labelling #17

Open brienna opened 3 years ago

brienna commented 3 years ago

This project is really cool. While I recognize that there is no right way to cluster the Reddit comments, I am concerned about the labels, particularly the label "Sick."

"Sick" does not reflect the composition of that cluster. For example, the subreddit "deaf" is about an entire community and culture. The subreddits "ASL" and "BSL" are about languages.

There are clear biases within the data (which is really interesting to see). The least we can do is avoid introducing more bias through labelling.

It can be tricky to choose the right label, but it would be good to use something less negative, perhaps "Health" or "Disability Culture."