princeton-nlp / MABEL

EMNLP 2022: "MABEL: Attenuating Gender Bias using Textual Entailment Data" https://arxiv.org/abs/2210.14975
MIT License

NLI Benchmark Dataset Request #5

Closed Elfsong closed 10 months ago

Elfsong commented 10 months ago

Hi there,

Could you share the processed version of the Bias-NLI dataset? The current link requires requesting access, and I have already submitted my request. Thank you!

Best regards, Mingzhe

jacqueline-he commented 10 months ago

Hi,

Sorry for missing your request. The processed NLI dataset can be accessed here. I believe the former Drive link led to a version of the dataset that was not shuffled properly, which affects the distributions in the train / val / test partitions.

Alternatively, you can re-generate the dataset by following the instructions from the original source here. You could do some light pre-processing to get the columns formatted as [premise, hypothesis, label], in which the label value should always be entailment (or 1), and then shuffle all the examples.
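As a rough illustration of that pre-processing step, here is a minimal sketch. It assumes the regenerated data is already available as (premise, hypothesis) pairs; the function names `build_rows` and `write_csv`, the CSV output format, the fixed seed, and the example sentences are all hypothetical choices for illustration, not part of the original generation script.

```python
import csv
import random

# Assumption: the label is encoded as the integer 1 ("entailment").
ENTAILMENT = 1
COLUMNS = ["premise", "hypothesis", "label"]


def build_rows(pairs, seed=0):
    """Turn (premise, hypothesis) pairs into labeled rows and shuffle them.

    Every row gets the same entailment label. A seeded random.Random
    instance keeps the shuffle reproducible.
    """
    rows = [
        {"premise": premise, "hypothesis": hypothesis, "label": ENTAILMENT}
        for premise, hypothesis in pairs
    ]
    random.Random(seed).shuffle(rows)
    return rows


def write_csv(rows, path):
    """Write the rows to a CSV file with a [premise, hypothesis, label] header."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)


if __name__ == "__main__":
    # Placeholder sentences, not actual rows from the dataset.
    example_pairs = [
        ("The accountant ate a bagel.", "The man ate a bagel."),
        ("The accountant ate a bagel.", "The woman ate a bagel."),
        ("The nurse bought a coat.", "The man bought a coat."),
    ]
    write_csv(build_rows(example_pairs, seed=13), "bias_nli_processed.csv")
```

Shuffling before splitting into train / val / test matters here because generated examples tend to come out grouped by template, so an unshuffled split can give each partition a different distribution.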

Elfsong commented 10 months ago

Really appreciate it! Thank you for this solid work.