CheckList for HateSpeech Detection

marcotcr / checklist

Beyond Accuracy: Behavioral Testing of NLP models with CheckList

MIT License

2.01k stars, 204 forks

Hi again, how should I handle the labels when using CheckList to build tests for the hate speech detection task? Hate speech classifiers usually return only two labels: 0 (not hateful) and 1 (hateful). The example tasks in the repository, in particular Sentiment Analysis, which is the most similar one, use 3 labels.
Should I encode the hateful samples as label 0 (negative in your framework) and the non-hateful ones as label 2 (positive) and/or 1 (neutral)? Hate speech detection usually has no neutral class. Or is there a way to adapt CheckList to a task with only two labels, such as hate speech detection?

Thank you very much
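
To make the two-label setup concrete, here is a minimal, library-independent sketch of a minimum-functionality-style test that uses the classifier's own labels (0 and 1) directly, with no neutral class and no remapping to the sentiment convention. Everything in it is an assumption made for illustration: `fill_template`, `run_mft`, and `toy_classifier` are hypothetical helpers, not CheckList functions, and the keyword-based classifier only stands in for a real model. As far as I understand, CheckList's `MFT` similarly takes the expected labels as plain integers, so the same idea should carry over, but that is worth checking against the repository's tutorials.

```python
from itertools import product

# The classifier's own two labels; there is no neutral class.
NOT_HATEFUL, HATEFUL = 0, 1


def fill_template(template, **slots):
    """Expand a template into every combination of the given slot values."""
    keys = list(slots)
    return [
        template.format(**dict(zip(keys, combo)))
        for combo in product(*(slots[k] for k in keys))
    ]


def toy_classifier(texts):
    """Hypothetical stand-in for a real model: flags any text containing 'hate'."""
    return [HATEFUL if "hate" in t.lower() else NOT_HATEFUL for t in texts]


def run_mft(texts, expected_label, predict):
    """Check that every text receives the expected label and report failures."""
    preds = predict(texts)
    failures = [t for t, p in zip(texts, preds) if p != expected_label]
    return {
        "n": len(texts),
        "failures": failures,
        "failure_rate": len(failures) / len(texts),
    }


# Cases that should be labelled hateful (placeholder group names).
hateful_cases = fill_template("I hate all {group}.", group=["GROUP_A", "GROUP_B"])
hateful_result = run_mft(hateful_cases, HATEFUL, toy_classifier)

# Cases that should be labelled not hateful, including harmless uses of "hate".
benign_cases = fill_template(
    "I {verb} {thing}.", verb=["love", "hate"], thing=["Mondays", "traffic"]
)
benign_result = run_mft(benign_cases, NOT_HATEFUL, toy_classifier)

print(hateful_result["failure_rate"], benign_result["failure_rate"])
```

The second test is there to show why the expected label is set per test rather than per task: the toy classifier passes the hateful cases but wrongly flags harmless sentences such as "I hate Mondays.", which is the kind of failure a behavioral test is meant to surface.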

CheckList for HateSpeech Detection #70