dccuchile / wefe

WEFE: The Word Embeddings Fairness Evaluation Framework. WEFE is a framework that standardizes bias measurement and mitigation in word embedding models. Please feel welcome to open an issue in case you have any questions, or a pull request if you want to contribute to the project!
https://wefe.readthedocs.io/
MIT License

word_embedding not found under wefe #18

Closed harshvr15 closed 3 years ago

harshvr15 commented 3 years ago

ImportError: cannot import name 'word_embedding' from 'wefe' (/Users/xx/PycharmProjects/pythonProject/venv/lib/python3.8/site-packages/wefe/__init__.py)

please help :)

pbadillatorrealba commented 3 years ago

Hi @harshvr15 ,

Could you provide a little more context or code?

Best Regards Pablo.

harshvr15 commented 3 years ago

Hi Pablo ,

According to the WEFE documentation, I need to import from wefe.word_embedding, but the documentation only shows `from wefe.word_embedding import` without saying what to import. Could you please help me complete:

from wefe.word_embedding import ....... ?

Also, if I use an * there, it throws an error: ImportError: cannot import name 'word_embedding' from 'wefe' (/Users/xx/PycharmProjects/pythonProject/venv/lib/python3.8/site-packages/wefe/__init__.py)

Thanks Harsh

pbadillatorrealba commented 3 years ago

WordEmbeddingModel is a wrapper for pre-trained embedding models loaded with Gensim's KeyedVectors. The idea is that this class allows WEFE to interact with the model. To import WordEmbeddingModel, try:

    from wefe.word_embedding_model import WordEmbeddingModel

    model = WordEmbeddingModel(some_gensims_keyed_vectors, "model_name")

I hope I was able to answer your question. If not, I can still help you :)

Pablo.

harshvr15 commented 3 years ago

Thanks Pablo, Resolved!

Harsh

harshvr15 commented 3 years ago

Hi Pablo,

I was just wondering whether you have come across word embeddings for other languages that can be used through WEFE to investigate bias?

Harsh

pbadillatorrealba commented 3 years ago

Hi @harshvr15

Of course! You can load any embedding model that is compatible with gensim. This link contains the official tutorial on how to load embeddings in word2vec format (via KeyedVectors.load_word2vec_format):

https://radimrehurek.com/gensim/models/word2vec.html#usage-examples

Flair offers a wide variety of embeddings in different languages:

https://github.com/flairNLP/flair/blob/master/resources/docs/embeddings/CLASSIC_WORD_EMBEDDINGS.md

Finally, there are a variety of github repositories that publish their own embeddings which can also be tested. In the case of Spanish, @dccuchile has a complete list of them:

https://github.com/dccuchile/spanish-word-embeddings

I hope you found this useful

Pablo.

harshvr15 commented 3 years ago

Hi Pablo,

Thank you for your valuable input. I implemented a few multilingual embeddings via Flair. When I manually give the target and attribute sets, I get the values shown below. But when I use the RND_wordsets or WEAT_wordsets, I get this error:

ERROR:root:At least one set of 'Male terms and Female terms wrt Science and Arts' query has proportionally fewer embeddings than allowed by the lost_vocabulary_threshold parameter (0.2). This query will return np.nan. {'query_name': 'Male terms and Female terms wrt Science and Arts', 'result': nan, 'weat': nan, 'effect_size': nan}

Are these wordsets defined only for English? If so, in what ways can they be modified for other languages (if you can advise me on how to proceed)?

For manual target and attribute sets:

    target_sets = [['she', 'woman', 'girl', 'her'], ['he', 'man', 'boy', 'him']]
    target_sets_names = ['Female Terms', 'Male Terms']

    attribute_sets = [['poetry', 'dance', 'literature'], ['math', 'physics', 'chemistry']]
    attribute_sets_names = ['Science', 'Arts']

the result is:

    {'query_name': 'Female Terms and Male Terms wrt Science and Arts', 'result': 0.19488816, 'weat': 0.19488816, 'effect_size': 0.8827046, 'p_value': nan}

Also, my second concern is the p-value. Every time, I get the p-value as nan. Could there be any possible reason for that, or am I going wrong somewhere?

Thanks Harsh

pbadillatorrealba commented 3 years ago

Hi Harsh

I will answer your questions in parts:

> When I manually give the target and attribute sets, I get the values shown below. But when I use the RND_wordsets or WEAT_wordsets, I get this error:
>
> ERROR:root:At least one set of 'Male terms and Female terms wrt Science and Arts' query has proportionally fewer embeddings than allowed by the lost_vocabulary_threshold parameter (0.2). This query will return np.nan. {'query_name': 'Male terms and Female terms wrt Science and Arts', 'result': nan, 'weat': nan, 'effect_size': nan}

According to what you are describing, when using RND_wordsets and WEAT_wordsets, some set of your query is missing 20% or more of its words, so the query is declared invalid and returns nan.

To see which words are missing, you can set the parameter warn_not_found_words=True when running run_query:

    result = WEAT().run_query(query, model, warn_not_found_words=True)

The allowed amount of words to be lost can be controlled through lost_vocabulary_threshold. In the following case, a loss of 30% is allowed:

result = WEAT().run_query(
    query, model, lost_vocabulary_threshold=0.3, warn_not_found_words=True
)

On the other hand, some embedding models are uncased or their words do not contain accented characters. You can convert all the words of your query to lowercase, or strip their accents, using a preprocessor:

result = WEAT().run_query(
    query, 
    model, 
    preprocessor_args={
        "lowercase": True, 
        "strip_accents": True
        }
)

Regarding this question:

> Are these wordsets defined only for English? If so, in what ways can they be modified for other languages (if you can advise me on how to proceed)?

Unfortunately, all studies are conducted with English-related words and biases in mind.

A possible approach to solve this problem would be to translate the words of each set to the language of your choice, but you would have to verify the validity of each set in each case. While it would seem that there would be no problem with translating gender-biased words (since this problem seems to be global), it may be that other criteria (such as ethnicity) are not directly translatable. For example, it may be that the ethnicity queries do not make sense in another language since the bias studied in this case is closely related to the racial problems in the USA.

Another valid approach would also be to create your own sets of words for the biases you wish to study. However, I would also guess that it requires other validation methods that are beyond the scope of WEFE.

Regarding this question:

> Also, my second concern is the p-value. Every time, I get the p-value as nan. Could there be any possible reason for that, or am I going wrong somewhere?

The p-value calculation is disabled by default in WEAT because it is a very slow operation (it computes many permutations of your query and checks whether any permuted query scores higher than your original one). To enable it, pass calculate_p_value=True to run_query():

result = WEAT().run_query(
    query, model, calculate_p_value=True, p_value_iterations=10000,
)

Best wishes Pablo.

harshvr15 commented 3 years ago

Thanks a million for the help, Pablo.

liaocs2008 commented 3 years ago

@pbadillatorrealba can you list all of the run_query settings you used, at least for the embeddings mentioned in your paper? I notice it fails in the same way even for glove-wiki-gigaword-300.