fani-lab / Adila

Fairness-Aware Team Formation

2018, EMNLP, Reducing Gender Bias in Abusive Language Detection #12

Open Rounique opened 2 years ago

Rounique commented 2 years ago

2018-Reducing Gender Bias in Abusive Language Detection.pdf

hosseinfani commented 2 years ago

@Rounique your summary?

Rounique commented 2 years ago

As I mentioned, I haven't done the summaries yet. I'll do them after I finish this week's assignments.

Rounique commented 2 years ago

fani-lab/fair_team_formation#11

hosseinfani commented 2 years ago

@Rounique Any update?

Rounique commented 2 years ago

Title: Reducing Gender Bias in Abusive Language Detection
Venue: EMNLP
Year: 2018

Introduction

As the use of social media and online platforms grows, people share their ideas and opinions more and more. Automatic detection of abusive language therefore plays an important role, since abusive language can lead to cyber-bullying, personal trauma, hate crime, and discrimination. Using machine learning and NLP to detect abusive language automatically is useful for many websites and social media services.

In this paper, gender bias is measured on models trained with abusive language datasets, and several methods are introduced for mitigating these biases. Bias is measured with a generated unbiased test set, and the mitigation methods are: (1) debiased word embeddings, (2) gender-swap data augmentation, and (3) fine-tuning with a larger, less biased corpus.

Datasets: Sexist Tweets, Abusive Tweets.

Measuring Gender Biases

It is not possible to measure gender bias on the dataset the model was trained on, since the measurement would reflect the same biases. Therefore, an unbiased test set has to be generated. The test set generated in this work includes 1,152 samples (576 pairs), created by filling templates with common gender-identity pairs (e.g., male/female, man/woman). The templates contain both neutral and offensive nouns and adjectives from the vocabulary, to keep a balance between neutral and abusive samples. A sketch of the template filling is shown below.
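A rough sketch of how this template filling could look; the templates and identity pairs below are made up for illustration, not the paper's exact ones:

```python
# Illustrative templates: one with a neutral adjective, one with an offensive one.
TEMPLATES = [
    "You are a good {identity}",
    "You are a disgusting {identity}",
]

# Common gender-identity pairs (male, female) -- illustrative subset.
IDENTITY_PAIRS = [("man", "woman"), ("boy", "girl"), ("father", "mother")]

def generate_test_set(templates, identity_pairs):
    """Fill each template with both members of each identity pair,
    yielding matched male/female sentence pairs."""
    for template in templates:
        for male, female in identity_pairs:
            yield template.format(identity=male), template.format(identity=female)

for male_sent, female_sent in generate_test_set(TEMPLATES, IDENTITY_PAIRS):
    print(male_sent, "|", female_sent)
```

Because each sentence appears in a male and a female version with identical context, any difference in the model's predictions on a pair can only come from the gender term.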

Mitigating Bias

Debiased Word Embeddings (DE)

This is an algorithm that corrects word embeddings by removing gender-stereotypical information from them.
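A minimal sketch of the underlying idea, the "neutralize" step of hard-debiasing (Bolukbasi et al., 2016): project the gender direction out of gender-neutral words. The vectors and word pair below are toy examples, and the full method estimates the direction with PCA over several definitional pairs:

```python
import numpy as np

# Toy embeddings; DE operates on pretrained vectors such as word2vec.
emb = {
    "he":       np.array([ 0.9, 0.1, 0.3]),
    "she":      np.array([-0.9, 0.1, 0.3]),
    "engineer": np.array([ 0.4, 0.5, 0.2]),
}

def gender_direction(emb, pairs=(("he", "she"),)):
    """Estimate the gender direction from definitional pairs (simplified
    here to an average of differences)."""
    diffs = [emb[a] - emb[b] for a, b in pairs]
    g = np.mean(diffs, axis=0)
    return g / np.linalg.norm(g)

def neutralize(vec, g):
    """Remove the component of `vec` along the gender direction, so a
    gender-neutral word like 'engineer' carries no gender information."""
    return vec - np.dot(vec, g) * g

g = gender_direction(emb)
emb["engineer"] = neutralize(emb["engineer"], g)
print(np.dot(emb["engineer"], g))  # ~0: no remaining gender component
```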

Gender Swap (GS) Male entities are identified and swapped with equivalent female entities, and vice versa. This simple method removes the correlation between gender and the classification decision and has proven effective for correcting gender biases.
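A minimal sketch of this augmentation, assuming a hand-written swap dictionary; a real implementation covers many more entities (pronouns, kinship terms, titles, names) and disambiguates words like "her" (his vs. him) with POS information:

```python
# Illustrative swap pairs only; note "her" is ambiguous and mapped
# to "his" here for simplicity.
SWAPS = {
    "he": "she", "she": "he",
    "him": "her", "his": "her", "her": "his",
    "man": "woman", "woman": "man",
    "boy": "girl", "girl": "boy",
}

def gender_swap(text: str) -> str:
    """Return a copy of `text` with gendered tokens swapped."""
    return " ".join(SWAPS.get(tok.lower(), tok) for tok in text.split())

print(gender_swap("he is a man"))  # "she is a woman"
```

The swapped sentences are then added to the training set alongside the originals, so the classifier sees each context with both genders.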

Bias Fine-Tuning (FT) This method uses transfer learning from a less biased corpus to reduce bias: a model is first trained on a larger, less biased source corpus and then fine-tuned on the target corpus.
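A minimal two-stage training sketch of this idea; the model, synthetic data, and hyperparameters below are placeholders, not the paper's setup:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-ins: the source corpus is the larger, less biased dataset,
# the target corpus is the (biased) abusive-language dataset.
def fake_loader(n, dim=300):
    x, y = torch.randn(n, dim), (torch.rand(n) > 0.5).float()
    return DataLoader(TensorDataset(x, y), batch_size=32, shuffle=True)

source_loader, target_loader = fake_loader(1024), fake_loader(256)

model = nn.Sequential(nn.Linear(300, 64), nn.ReLU(), nn.Linear(64, 1))
loss_fn = nn.BCEWithLogitsLoss()

def run(loader, lr, epochs):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x).squeeze(-1), y).backward()
            opt.step()

run(source_loader, lr=1e-3, epochs=3)  # stage 1: train on less biased source corpus
run(target_loader, lr=1e-4, epochs=1)  # stage 2: fine-tune on the target corpus
```

Using a lower learning rate in stage 2 is a common fine-tuning choice so the target corpus does not overwrite what was learned from the source corpus.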

Metric used: AUC
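AUC is computed per test set; comparing the score on the original test split against the generated unbiased set exposes the bias. The labels and scores below are placeholders:

```python
from sklearn.metrics import roc_auc_score

# Placeholder values; in practice these come from the model's predictions
# on (a) the original test set and (b) the unbiased template set.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
print(roc_auc_score(y_true, y_score))  # 0.75
```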

Conclusion

The proposed methods are found to reduce gender bias by up to 90-98%, improving the robustness of the models.

Future Work

Increasing classification performance and reducing the bias at the same time.

Code: https://github.com/conversationai/unintended-ml-bias-analysis

Rounique commented 2 years ago

The summary has been added.

hosseinfani commented 2 years ago

@Rounique please explore their codebase, there is more good info there: https://perspectiveapi.com/how-it-works/ https://conversationai.github.io/