Dataset Details:
The Multi-Dimensional Gender Bias Classification dataset is built on a framework that breaks gender bias in text down along several pragmatic and semantic dimensions: bias from the gender of the person being spoken about, the gender of the person being spoken to, and the gender of the speaker. It comprises seven large-scale datasets that have been automatically annotated with gender information (the original project has eight, but the Wikipedia set is not included in the HuggingFace distribution). It also includes a crowdsourced evaluation benchmark of utterance-level gender rewrites, a list of gendered names, and a list of gendered English words.
Dataset URL: https://huggingface.co/datasets/md_gender_bias
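
Loading sketch: a minimal example of how the dataset might be loaded, assuming the Hugging Face `datasets` library is installed and that the identifier at the end of the Dataset URL still resolves on the Hub. The configuration name `new_data` (intended here to select the crowdsourced benchmark) and the `train` split are assumptions to verify against the dataset page, and `load_md_gender` is a hypothetical helper name.

```python
DATASET_URL = "https://huggingface.co/datasets/md_gender_bias"

# The dataset identifier is the last path segment of the URL above.
DATASET_ID = DATASET_URL.rstrip("/").rsplit("/", 1)[-1]

# Assumption: "new_data" is the configuration holding the crowdsourced
# evaluation benchmark; check the dataset page for the actual names.
CONFIG_NAME = "new_data"


def load_md_gender(config_name=CONFIG_NAME, split="train"):
    """Load one configuration of the dataset (hypothetical helper).

    The import is done lazily so that this module can be imported
    even when the `datasets` library is not installed.
    """
    from datasets import load_dataset

    return load_dataset(DATASET_ID, config_name, split=split)


if __name__ == "__main__":
    # Requires network access and the `datasets` package.
    ds = load_md_gender()
    print(ds)     # summary of columns and row count
    print(ds[0])  # first example
```

Each of the automatically annotated datasets, the name list, and the word list would be selected the same way, by passing a different configuration name.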