facebookresearch / MUSE

A library for Multilingual Unsupervised or Supervised word Embeddings

AssertionError Supervised and Unsupervised #69

Closed jdvala closed 6 years ago

jdvala commented 6 years ago

I have source and target word embeddings built with fastText, and a parallel dictionary for the same language pair, but when I run supervised.py on them I get this:

INFO - 07/28/18 12:42:25 - 0:00:00 - The experiment will be stored in /home/jay/MUSE/dumped/debug/iqo0anky9o
Traceback (most recent call last):
  File "supervised.py", line 73, in <module>
    src_emb, tgt_emb, mapping, _ = build_model(params, False)
  File "/home/jay/MUSE/src/models.py", line 46, in build_model
    src_dico, _src_emb = load_embeddings(params, source=True)
  File "/home/jay/MUSE/src/utils.py", line 406, in load_embeddings
    return read_txt_embeddings(params, source, full_vocab)
  File "/home/jay/MUSE/src/utils.py", line 280, in read_txt_embeddings
    assert _emb_dim_file == int(split[1])
AssertionError

I hope I am not doing something wrong here

glample commented 6 years ago

It looks like the dimension in your embeddings file is not the one you gave to MUSE. The default value is 300. If your embeddings have a different dimension, pass it explicitly: for instance, use --emb_dim 512 if the dimension of your embeddings is 512.
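To find the value to pass to --emb_dim, you can read it from the embeddings file itself. The following is a minimal sketch, assuming the file is in the fastText text (.vec) format, whose first line is a header of the form "<vocab size> <dimension>". The function name read_vec_header is hypothetical and is not part of MUSE.

```python
def read_vec_header(path):
    """Return (vocab_size, dim) from the header line of a .vec text file.

    Assumes the fastText text format, where the first line holds two
    integers: the number of words and the embedding dimension.
    """
    with open(path, encoding="utf-8") as f:
        fields = f.readline().split()
    if len(fields) != 2:
        raise ValueError("expected a '<vocab size> <dimension>' header, got: %r" % fields)
    return int(fields[0]), int(fields[1])
```

The second value returned is the number that the assertion in the traceback compares against the configured embedding dimension, so it is the one to pass as --emb_dim.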

jdvala commented 6 years ago

Got that to work, but in unsupervised training, after the training is almost complete, I get this error:

Traceback (most recent call last):
  File "unsupervised.py", line 139, in <module>
    evaluator.all_eval(to_log)
  File "/home/jay/lib/MUSE/src/evaluation/evaluator.py", line 217, in all_eval
    self.word_translation(to_log)
  File "/home/jay/lib/MUSE/src/evaluation/evaluator.py", line 120, in word_translation
    dico_eval=self.params.dico_eval
  File "/home/jay/lib/MUSE/src/evaluation/word_translation.py", line 93, in get_word_translation_accuracy
    dico = load_dictionary(path, word2id1, word2id2)
  File "/home/jay/lib/MUSE/src/evaluation/word_translation.py", line 49, in load_dictionary
    assert os.path.isfile(path)
AssertionError

jdvala commented 6 years ago

Maybe more detailed documentation is needed, describing all the necessary details.

glample commented 6 years ago

Can you provide the command you used to run the code?

If you look at the traceback, it tells you that the model is not able to find the dictionary file. This can happen for two reasons:

1) You didn't download the required data. In that case, you can do it by going to ./data/ and running ./get_evaluation.sh
2) You are using a language pair for which we do not provide an evaluation dictionary (typically if it's a relatively low-resource language). In that case, you will have to build an evaluation dictionary to evaluate the quality of your cross-lingual embeddings, or simply disable the evaluation by commenting out the line self.word_translation(to_log)
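Since the failure above only shows up once training is almost done, it can help to check the evaluation dictionary before launching a run. The following is a minimal sketch of such a check. It assumes the dictionary is a plain text file with one whitespace-separated "source_word target_word" pair per line; the function name load_word_pairs is hypothetical and is not part of MUSE.

```python
import os


def load_word_pairs(path):
    """Read word pairs from a dictionary file, failing early if it is missing.

    Assumed layout: one whitespace-separated "source target" pair per line.
    Blank lines are skipped.
    """
    if not os.path.isfile(path):
        raise FileNotFoundError("evaluation dictionary not found: %s" % path)
    pairs = []
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, 1):
            parts = line.split()
            if not parts:
                continue  # skip blank lines
            if len(parts) != 2:
                raise ValueError("line %d: expected 2 words, got %d" % (line_no, len(parts)))
            pairs.append((parts[0], parts[1]))
    return pairs
```

Running this on the dictionary path before training turns the bare AssertionError into a message that names the missing file. The same function can also be used to sanity-check a dictionary you build yourself for a low-resource pair.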

dpy011 commented 4 years ago

If anyone is wondering where exactly the line self.word_translation(to_log) is located (as I was :)), it is in src/evaluation/evaluator.py at line 217.