dccuchile / wefe

WEFE: The Word Embeddings Fairness Evaluation Framework. WEFE is a framework that standardizes the bias measurement and mitigation in Word Embeddings models. Please feel welcome to open an issue in case you have any questions or a pull request if you want to contribute to the project!
https://wefe.readthedocs.io/
MIT License

Information about the pre-loaded wordsets (Dataloaders) #24

Closed harshvr15 closed 2 years ago

harshvr15 commented 3 years ago

Hi Pablo, Greetings!

1). I was trying to get results on multilingual fastText embeddings through a query built with WEAT_wordsets and RND_wordsets. I am not sure whether I can use the same query word sets for multilingual embeddings; however, I am getting some results:

from flair.embeddings import WordEmbeddings
from wefe.datasets import load_weat, fetch_eds
from wefe.metrics import WEAT
from wefe.word_embedding_model import WordEmbeddingModel

WEAT_wordsets = load_weat()
RND_wordsets = fetch_eds()

glove_embedding = WordEmbeddings('de')
glove_keyed_vectors = glove_embedding.precomputed_word_embeddings
glove_100 = WordEmbeddingModel(glove_keyed_vectors, 'de')

# gender_3 is a Query built from the word sets above (definition omitted here)
weat = WEAT()
result = weat.run_query(gender_3, glove_100, lost_vocabulary_threshold=0.3,
                        warn_not_found_words=True, calculate_p_value=True,
                        p_value_iterations=10000)
print(result)

RESULT: {'query_name': 'Male terms and Female terms wrt Science and Arts', 'result': 0.50751936, 'weat': 0.50751936, 'effect_size': 0.9096829, 'p_value': 0.0034996500349965005}

But I am not sure whether this is correct. Can you please advise whether these word set queries can be used for multilingual embeddings, and if not, how I can proceed?

2). Can you please share any links where I can find information about the pre-loaded word sets (load_weat() and fetch_eds())?

For now, I just know:

- fetch_eds(): fetches the word sets used in the experiments of the work "Word Embeddings Quantify 100 Years of Gender and Ethnic Stereotypes".
- load_weat(): loads the word sets used in the paper "Semantics Derived Automatically from Language Corpora Contain Human-Like Biases".

I couldn't find any links describing what data they were extracted from, what sets they contain, or any further information.

Furthermore,

For example, in this query:

gender_1 = Query(
    [RND_wordsets["male_terms"], RND_wordsets["female_terms"]],
    [WEAT_wordsets["career"], WEAT_wordsets["family"]],
    ["Male terms", "Female terms"],
    ["Career", "Family"],
)

(Correct me if I'm wrong) You are taking male_terms and female_terms as target sets and career and family as attribute sets from these dataloaders. What other sets (like male_terms, adjectives_intelligence, female_terms, etc.) are available in these dataloaders? You have described a few in your queries; are those the only ones, or are there others that can be considered?

Hope to hear from you soon.

Thanks, Harsh

pbadillatorrealba commented 2 years ago

Hi @harshvr15 ,

A thousand apologies for the delay!

I'll answer each of your questions in turn:

  1. Queries are only valid when the language of the words is the same as the language of the embeddings model.

    As far as I understand, in the example you show you are generating queries from the words loaded by the dataloaders (which are in English) and then using them to measure the bias on a model that was trained for another language.

    You get results because these words exist in the model you are using: it is very common for English words to appear in the vocabularies of models trained for other languages. However, the fact that the metrics return results does not at all guarantee that you are measuring bias correctly, since you are only measuring bias in English, not in your target language.

    The words in the original language should contain more information about the biases of this particular language, so I would advise you to generate queries in the language you want to study.

    In the case of gender, I would recommend translating the sets into the language you are studying, validating the translations, and then running the queries with those translated words (see the sketch after the word set listing below). For ethnicity and religion (which are very specific to the culture associated with the language studied), I would first recommend checking whether the queries are even valid for that culture and only then translating them. It would not make much sense, for example, to analyze US ethnic bias in an embedding model trained for Russian (obviously this is just an example, it would have to be checked).

  2. Regarding the sources, you can inspect the available word sets directly from the dataloaders:

>>> from wefe.datasets import load_weat, fetch_eds
>>> weat_datasets = load_weat()
>>> weat_datasets.keys()

dict_keys(['flowers', 'insects', 'pleasant_5', 'unpleasant_5', 'instruments', 'weapons', 'european_american_names_5', 'african_american_names_5', 'european_american_names_7', 'african_american_names_7', 'pleasant_9', 'unpleasant_9', 'male_names', 'female_names', 'career', 'family', 'math', 'arts', 'male_terms', 'female_terms', 'science', 'arts_2', 'male_terms_2', 'female_terms_2', 'mental_disease', 'physical_disease', 'temporary', 'permanent', 'young_people_names', 'old_people_names'])
>>> eds_datasets = fetch_eds()
>>> eds_datasets.keys()

dict_keys(['adjectives_appearance', 'adjectives_otherization', 'adjectives_sensitive', 'names_asian', 'names_black', 'names_chinese', 'names_hispanic', 'names_russian', 'names_white', 'words_christianity', 'words_islam', 'words_terrorism', 'male_occupations', 'female_occupations', 'occupations_white', 'occupations_black', 'occupations_asian', 'occupations_hispanic', 'male_terms', 'female_terms', 'adjectives_intelligence'])
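
Each of those keys maps to a plain Python list of words, so any combination of them can be wrapped in a Query. As an illustrative sketch (not taken from the WEFE documentation), the same construction also applies to a non-English model once the sets have been translated and validated; the German words below are hand-translated examples and are not shipped with WEFE:

from wefe.datasets import fetch_eds
from wefe.query import Query

eds_datasets = fetch_eds()

# Any key from the dataloaders is just a list of words, so sets can be
# combined freely into a query (here: two target sets, one attribute set).
intelligence_query = Query(
    [eds_datasets["male_terms"], eds_datasets["female_terms"]],
    [eds_datasets["adjectives_intelligence"]],
    ["Male terms", "Female terms"],
    ["Intelligence adjectives"],
)

# For a non-English model, translate and validate the word sets first.
# These German lists are hand-translated examples for illustration only.
male_terms_de = ["Mann", "Junge", "Bruder", "Vater", "Sohn", "er"]
female_terms_de = ["Frau", "Mädchen", "Schwester", "Mutter", "Tochter", "sie"]
career_de = ["Karriere", "Beruf", "Büro", "Gehalt", "Geschäft"]
family_de = ["Familie", "Eltern", "Kinder", "Ehe", "Zuhause"]

gender_query_de = Query(
    [male_terms_de, female_terms_de],
    [career_de, family_de],
    ["Male terms", "Female terms"],
    ["Career", "Family"],
)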

I hope you find this answer helpful.

Greetings, Pablo.

harshvr15 commented 2 years ago

Hi Pablo, Greetings!

Thanks for the update. That was really helpful.

One more question: in your research paper, Table 1 ("Final matrices obtained after applying our framework for several metrics, embedding models, and three different query sets. Rankings plus absolute values for each metric are included.") shows that the religion query for the conceptnet-numberbatch model yields WEAT = 0.96 but WEAT-ES = 0.11. The difference is almost 9 times for the same model. What could be the possible reasons for these contradictions, and how can we explain this? I can see some other contradictions as well.

Furthermore, I'm unable to interpret this statement about WEAT-ES: "Since the ideal is also 0, we define ≤_{F_WEAT-ES} just as ≤_{F_WEAT}". Can you please elaborate? Also, for RNSB.

Awaiting your response.

Thanks, Harsh

pbadillatorrealba commented 2 years ago

Hi @harshvr15

> The difference is almost 9 times for the same model. What could be the possible reasons for these contradictions, and how can we explain this? I can see some other contradictions as well.

The main difference between WEAT and WEAT-ES is that WEAT-ES normalizes the result using the mean and standard deviation of the computed association statistics. This makes the WEAT-ES scores independent of the number of words in each target and attribute set, which in practical terms yields much more robust results than WEAT alone. This normalization may explain the order-of-magnitude difference between the two scores.
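
To make the difference concrete, here is a minimal sketch of the standard WEAT definitions from Caliskan et al. (2017), which the WEAT and effect-size scores follow conceptually (an illustration, not WEFE's internal code):

import numpy as np

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def association(w, A, B):
    # s(w, A, B): how much closer word vector w is to attribute set A than to B.
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def weat_statistic(X, Y, A, B):
    # Raw WEAT: a sum over target words, so its magnitude grows with |X| and |Y|.
    return sum(association(x, A, B) for x in X) - sum(association(y, A, B) for y in Y)

def weat_effect_size(X, Y, A, B):
    # WEAT-ES: difference of means normalized by the standard deviation of the
    # associations, so the score does not depend on the size of the word sets.
    associations = [association(w, A, B) for w in list(X) + list(Y)]
    return (np.mean([association(x, A, B) for x in X])
            - np.mean([association(y, A, B) for y in Y])) / np.std(associations)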

I have just updated the documentation with a better explanation of this: https://wefe.readthedocs.io/en/latest/about.html#weat

> Furthermore, I'm unable to interpret this statement about WEAT-ES: "Since the ideal is also 0, we define ≤_{F_WEAT-ES} just as ≤_{F_WEAT}". Can you please elaborate? Also, for RNSB.

Regarding the orderings ≤_{F_WEAT-ES} and ≤_{F_WEAT}: since the ideal value is 0 for both metrics, the inequalities remain the same, i.e., the lower the absolute value of the score, the lower the bias detected.

In the case of RNSB, I think it is better to first understand how the metric works:

Originally this metric is based on measuring bias through word sentiment. The main idea is that if there were no bias, all words should be equally negative. Therefore, its procedure is based on calculating how negative the words in the target sets are.

For this purpose, RNSB trains a classifier that assigns to each word a probability of belonging to the negative class (in the original work the classifier is trained using Bing Liu's lexicon of positive and negative words). Then, it builds a probability distribution from the probabilities calculated in the previous step and compares it to the uniform distribution (the case where all words have the same probability of being negative) using the KL divergence. When the negative-probability distribution is equal to the uniform one (i.e., there is no bias), the KL divergence is 0.
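
As a rough, conceptual sketch of that procedure (an illustration of the idea, not WEFE's exact implementation), the steps could look like this:

import numpy as np
from scipy.stats import entropy
from sklearn.linear_model import LogisticRegression

def rnsb_sketch(positive_vectors, negative_vectors, target_vectors):
    # 1. Train a classifier on positive (0) vs. negative (1) sentiment word vectors.
    X = np.vstack([positive_vectors, negative_vectors])
    y = np.array([0] * len(positive_vectors) + [1] * len(negative_vectors))
    classifier = LogisticRegression().fit(X, y)

    # 2. Probability of each target word belonging to the negative class.
    negative_probabilities = classifier.predict_proba(target_vectors)[:, 1]

    # 3. Normalize the probabilities into a distribution over the target words.
    distribution = negative_probabilities / negative_probabilities.sum()

    # 4. KL divergence with respect to the uniform distribution (0 means no bias detected).
    uniform = np.full(len(distribution), 1 / len(distribution))
    return entropy(distribution, uniform)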

I also updated the documentation of this metric with the idea of making it a bit more understandable: https://wefe.readthedocs.io/en/latest/about.html#rnsb

Hope you find it useful 😊

Pablo.