dccuchile / wefe

WEFE: The Word Embeddings Fairness Evaluation Framework. WEFE is a framework that standardizes bias measurement and mitigation in word embedding models. Please feel welcome to open an issue if you have any questions, or a pull request if you want to contribute to the project!
https://wefe.readthedocs.io/
MIT License

WEAT p-value is nan #15

Closed (raffaem closed this issue 3 years ago)

raffaem commented 3 years ago

Hello,

the WEAT p-value is always nan.

This happened for two different models I have tried.

Is this expected?

pbadillatorrealba commented 3 years ago

Hello @raffaem ,

I'm pretty sure it shouldn't happen.

Could you send me a little more context, like the code you are trying to run?

Best regards, Pablo.

raffaem commented 3 years ago

@pbadillatorrealba

The code is along the lines of:

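from wefe.metrics import WEAT
from wefe.query import Query
from wefe.word_embedding_model import WordEmbeddingModel

weat = WEAT()
wefemodel = WordEmbeddingModel(wv, model_name)
query = Query(target_sets_2, attribute_sets_2, target_sets_names, attribute_sets_names)
result_weat = weat.run_query(query, wefemodel)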

I now know that if some target words or some attribute words are not present in the word embedding, everything will be nan: the weat, the effect_size and the p_value will all be nan.

I have now made sure that every target and attribute word is present in the word embedding: the weat and effect_size seem OK, but the p_value is still nan.
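
For anyone landing here with the same symptom, the vocabulary check described in this comment can be sketched roughly as follows. This is only an illustration in plain Python, not part of WEFE: `missing_words` is a hypothetical helper, and the vocabulary is assumed to be any iterable of words (for a gensim `KeyedVectors` object that would probably be `wv.key_to_index` in gensim 4, or `wv.vocab` in older releases).

```python
def missing_words(word_sets, vocabulary):
    """Return, for each word set (by position), the words absent from the vocabulary.

    `missing_words` is a hypothetical helper for illustration; it is not a WEFE function.
    """
    vocab = set(vocabulary)  # iterating a dict yields its keys, so a word->index mapping also works
    return {
        index: [word for word in words if word not in vocab]
        for index, words in enumerate(word_sets)
    }


# Toy data, invented for the example.
target_sets = [["she", "woman"], ["he", "man"]]
toy_vocabulary = {"she", "he", "man"}

report = missing_words(target_sets, toy_vocabulary)
for index, words in report.items():
    if words:
        print(f"set {index} is missing: {words}")
```

An empty list for every set means all words were found, so missing vocabulary can be ruled out as the cause of the nan values.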

raffaem commented 3 years ago

Could it be because the two target sets are not the same size, or because the two attribute sets are not the same size?

raffaem commented 3 years ago

Never mind, I just hadn't set calculate_p_value=True when calling run_query.
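
For reference, a minimal sketch of the fix, assuming the installed WEFE version names the flag `calculate_p_value` and also accepts `p_value_iterations`, as the `WEAT.run_query` documentation describes. `run_weat_with_p_value` is a hypothetical wrapper written for this example, and `query` and `wefemodel` are assumed to be built as in the snippet earlier in the thread.

```python
def run_weat_with_p_value(metric, query, model, iterations=10000):
    """Run a WEFE metric and explicitly request the p-value.

    Hypothetical wrapper for illustration. It assumes `metric.run_query`
    accepts `calculate_p_value` and `p_value_iterations`; check the
    documentation of your installed WEFE version.
    """
    return metric.run_query(
        query,
        model,
        calculate_p_value=True,         # without this flag the p-value is not computed
        p_value_iterations=iterations,  # number of permutations used for the estimate
    )


# Usage, assuming WEFE is installed and `query` / `wefemodel` exist:
#   from wefe.metrics import WEAT
#   result_weat = run_weat_with_p_value(WEAT(), query, wefemodel)
#   print(result_weat["p_value"])
```

The p-value is estimated with a permutation test, which is comparatively expensive, so it seems reasonable that it is opt-in and reported as nan when it is not requested.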