Add NLPositionality LabInTheWild datasets

mainlp / awesome-human-label-variation

A curated list of awesome datasets with human label variation (un-aggregated labels) in Natural Language Processing and Computer Vision, accompanying the paper "The 'Problem' of Human Label Variation: On Ground Truth in Data, Modeling and Evaluation" (EMNLP 2022)


Add NLPositionality LabInTheWild datasets #11

Open rmovva opened 11 months ago

rmovva commented 11 months ago

I'm suggesting inclusion of the datasets from the NLPositionality paper, which contain disaggregated annotator judgments on the Social Chemistry dataset (acceptability of various social situations) and the DynaHate toxicity dataset. Each annotator is also linked to demographic information such as age, gender, and native language. The data also includes annotations from a few language models.
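
To illustrate why the annotator-level demographics matter for this list, here is a minimal sketch of grouping disaggregated labels by a demographic attribute instead of collapsing them into one majority label. The records, field names (`item_id`, `annotator_id`, `label`, `gender`), and the helper `label_distribution` are all hypothetical toy values for illustration, not the actual NLPositionality schema or data.

```python
from collections import Counter, defaultdict

# Hypothetical toy records; field names and values are illustrative only,
# not the real NLPositionality files.
annotations = [
    {"item_id": "s1", "annotator_id": "a1", "label": 1, "gender": "woman"},
    {"item_id": "s1", "annotator_id": "a2", "label": -1, "gender": "man"},
    {"item_id": "s1", "annotator_id": "a3", "label": 1, "gender": "woman"},
    {"item_id": "s2", "annotator_id": "a1", "label": 0, "gender": "woman"},
    {"item_id": "s2", "annotator_id": "a2", "label": 0, "gender": "man"},
]


def label_distribution(records, group_key):
    """Count labels per item and per demographic group.

    Returns a nested mapping: item_id -> group value -> Counter(label -> count),
    so disagreement between annotators is kept rather than aggregated away.
    """
    dist = defaultdict(lambda: defaultdict(Counter))
    for record in records:
        dist[record["item_id"]][record[group_key]][record["label"]] += 1
    return dist


dist = label_distribution(annotations, "gender")
for item_id, groups in dist.items():
    for group, counts in groups.items():
        print(item_id, group, dict(counts))
```

The same grouping would work with any other attribute column (age bracket, native language), assuming the released files expose one row per annotator judgment.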