chartbeat-labs / textacy

NLP, before and after spaCy
https://textacy.readthedocs.io

Feature suggestion, advice wanted: Word Embedding Association Tests to report implicit bias in spaCy models #195

Open SandyRogers opened 6 years ago

SandyRogers commented 6 years ago

I posted this idea over on the spaCy repo but it didn't get picked up there (spaCy issue 2237). So perhaps it is suitable for this package instead...

Feature: bias tests for spaCy language models

Embedding models are implicitly biased by their training data.

People tend to have biases that cause them to associate certain words with certain attributes. For example, Maths and Science words (['physics', 'chemistry', 'NASA']) are more quickly associated with Male attributes (['uncle', 'brother', 'him']) than with the equivalent Female attributes.

This is what the Implicit Association Test measures.

Caliskan et al. 2017 (PDF) show how an equivalent test (the Word Embedding Association Test, WEAT) can be applied to measure the bias in embedding models. Google developers just showed some results for word2vec and GloVe in a blog post too.
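For context, the WEAT statistic from Caliskan et al. is small enough to sketch directly: each target word's association is its mean cosine similarity to one attribute set minus its mean similarity to the other, and the effect size is the difference in mean associations between the two target sets, normalized by the pooled standard deviation. Here's a minimal NumPy sketch (function names are mine, not from the paper or from any existing textacy API):

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(w, A, B):
    # s(w, A, B): mean similarity of w to attribute set A minus to set B.
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    # Caliskan et al.'s effect size: difference in mean associations of the
    # two target sets, normalized by the std. dev. over all target words.
    s_X = [association(x, A, B) for x in X]
    s_Y = [association(y, A, B) for y in Y]
    return (np.mean(s_X) - np.mean(s_Y)) / np.std(s_X + s_Y)
```

With a spaCy model, the vectors for a target set would come from something like `[nlp.vocab[w].vector for w in ['physics', 'chemistry', 'NASA']]`, so the whole test runs directly against a model's embedding table.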

I thought that perhaps spaCy should report WEAT scores on their releases page. But since the idea didn't get picked up by spaCy yet, maybe this could be a feature of textacy?

I'm not sure of the best place for this in textacy — probably something like textacy/bias.py, or perhaps textacy/spacy_utils.py? I've put some code together to calculate some of the WEAT tests on spaCy models in this gist.

I'm happy to do a PR if this is of interest to the library!

bdewilde commented 6 years ago

Hey @SandyRogers , thanks for the suggestion and the code snippet. I saw a presentation about WEAT last fall, and it's been in the back of my mind since. There's nothing like this currently in textacy, so I don't have a good sense of how to incorporate it — or whether or not it makes sense to do so.

I promise to give this more thought, but it might take a while to work its way up my priorities list. Thanks in advance for your patience.

SandyRogers commented 6 years ago

Hey @bdewilde thanks for giving this some thought! You and others might be interested in this review of recent research on the topic from The Gradient

Also, just to note that the issue I raised at spaCy was closed. The core team wasn't sure that WEAT was a mature and robust enough idea to adopt.

The Gradient piece is interesting because it points to some suggestions for removing biases from the embedding vector space, which perhaps gives us something to try and implement either in textacy or elsewhere.
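One of the simpler debiasing ideas in that line of work is the "neutralize" step from Bolukbasi et al. 2016: identify a bias direction in the vector space (e.g. the difference between gendered word pairs) and project it out of words that shouldn't carry it. A minimal sketch, assuming a precomputed bias direction (the helper name is mine):

```python
import numpy as np

def neutralize(vec, bias_direction):
    # Remove the component of vec along the (normalized) bias direction,
    # leaving a vector orthogonal to it.
    b = bias_direction / np.linalg.norm(bias_direction)
    return vec - np.dot(vec, b) * b
```

In practice the bias direction is usually estimated from definitional pairs (e.g. the principal component of differences like `he − she`), and whether projection actually removes bias, rather than just hiding it from this one test, is exactly the kind of open question the review discusses.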