facebookresearch / MUSE

A library for Multilingual Unsupervised or Supervised word Embeddings

ValueError: result of slicing is an empty tensor #39

Closed. bdqnghi closed this issue 6 years ago.

bdqnghi commented 6 years ago

I was trying to run the unsupervised mapping task and got this error:

INFO - 04/15/18 03:50:23 - 0:00:06 - 9 source words - csls_knn_10 - Precision at k = 10: 0.000000
Traceback (most recent call last):
  File "unsupervised.py", line 136, in <module>
    evaluator.all_eval(to_log)
  File "/home/nghibui/codes/MUSE/src/evaluation/evaluator.py", line 192, in all_eval
    self.dist_mean_cosine(to_log)
  File "/home/nghibui/codes/MUSE/src/evaluation/evaluator.py", line 172, in dist_mean_cosine
    s2t_candidates = get_candidates(src_emb, tgt_emb, _params)
  File "/home/nghibui/codes/MUSE/src/dico_builder.py", line 38, in get_candidates
    scores = emb2.mm(emb1[i:min(n_src, i + bs)].transpose(0, 1)).transpose(0, 1)
ValueError: result of slicing is an empty tensor

I have no idea why this happened. Any explanation? I'm doing the mapping task on a pair of languages that is not in the available list, so I don't have a dictionary for evaluation. Also, the vocabulary for each language is quite small, around 4,000 words each. I guess that if I remove the evaluation tasks in evaluator.py, the code will work.

glample commented 6 years ago

Can you have a look at https://github.com/facebookresearch/MUSE/issues/31 and see if that helps? In particular, try setting --dico_max_rank 1500 as a parameter. The idea is that the model tries to build a dictionary from the top 10k most frequent words, but if you only have 4k words it will raise an error.
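
To make the failure concrete, here is a minimal sketch of the batched scoring loop in src/dico_builder.py (not MUSE's exact code; the tensor sizes are taken from your setup):

import torch

emb1 = torch.randn(4000, 30)  # ~4k source words, 30-dim embeddings
emb2 = torch.randn(4000, 30)  # target embeddings
n_src = 10000                 # default dico_max_rank, larger than the vocabulary
bs = 128                      # batch size used in get_candidates

for i in range(0, n_src, bs):
    # Once i >= emb1.size(0), this slice is empty; older PyTorch versions
    # raise "ValueError: result of slicing is an empty tensor" here.
    batch = emb1[i:min(n_src, i + bs)]
    scores = emb2.mm(batch.transpose(0, 1)).transpose(0, 1)

With --dico_max_rank 1500, n_src stays below your 4k vocabulary and every slice is non-empty.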

Also, if you do not have any parallel dictionary, you should indeed disable the word translation evaluation by simply commenting out this line: https://github.com/facebookresearch/MUSE/blob/master/src/evaluation/evaluator.py#L190
But you probably want to find a small dictionary before you start experimenting; otherwise it will be very difficult to know whether your model is working.
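
For reference, the change amounts to commenting out that single call (an abridged sketch of all_eval in src/evaluation/evaluator.py, with the other evaluations elided):

def all_eval(self, to_log):
    """Run all evaluations (abridged sketch; other calls elided)."""
    ...
    # self.word_translation(to_log)  # the line at evaluator.py#L190: disable
    #                                # this when no evaluation dictionary exists
    ...
    self.dist_mean_cosine(to_log)    # line 192, the call seen in the traceback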

bdqnghi commented 6 years ago

Thanks for the reply, it works, but I think there is a minor bug in the code. I tried printing the whole "params" variable and got this:

Namespace(adversarial=True, batch_size=32, cuda=True, dico_build='S2T', dico_max_rank=10000, dico_max_size=10000, dico_method='nn', dico_min_size=0, dico_threshold=0, dis_clip_weights=0, dis_dropout=0.0, dis_hid_dim=2048, dis_input_dropout=0.1, dis_lambda=1, dis_layers=2, dis_most_frequent=50, dis_optimizer='sgd,lr=0.1', dis_smooth=0.1, dis_steps=5, emb_dim=30, epoch_size=4000, exp_name='debug', exp_path='/home/nghibui/codes/MUSE/dumped/debug/egzngeby1i', export='txt', lr_decay=0.98, lr_shrink=0.5, map_beta=0.001, map_id_init=True, map_optimizer='sgd,lr=0.1', max_vocab=200000, min_lr=1e-06, n_epochs=5, n_refinement=5, normalize_embeddings='', seed=-1, src_dico=<src.dictionary.Dictionary object at 0x7f8087dac588>, src_emb='data/cpp_vectors_30D_15_ae_train_trees.txt', src_lang='cpp', src_mean=None, tgt_dico=<src.dictionary.Dictionary object at 0x7f8087dac4e0>, tgt_emb='data/java_vectors_30D_15_ae_train_trees.txt', tgt_lang='java', tgt_mean=None, verbose=2)
10000
0 10000 128
128 10000 128
256 10000 128
384 10000 128
512 10000 128

I definitely passed --dico_max_rank 1500 on the command line, but the code is still using 10000. I had to hard-code that line to dico_max_rank = 1500 to make it work.

bdqnghi commented 6 years ago

It's also strange that the initial logger already shows dico_max_rank: 1500, but the number 10000 is still used:

INFO - 04/15/18 20:44:41 - 0:00:00 - adversarial: True
                                     batch_size: 32
                                     cuda: True
                                     dico_build: S2T&T2S
                                     dico_max_rank: 1500
                                     dico_max_size: 100
                                     dico_method: csls_knn_10
                                     dico_min_size: 0
                                     dico_threshold: 0
                                     dis_clip_weights: 0
                                     dis_dropout: 0.0
                                     dis_hid_dim: 2048
                                     dis_input_dropout: 0.1
                                     dis_lambda: 1
                                     dis_layers: 2
                                     dis_most_frequent: 50
                                     dis_optimizer: sgd,lr=0.1
                                     dis_smooth: 0.1
                                     dis_steps: 5
                                     emb_dim: 30
                                     epoch_size: 1000000
                                     exp_name: debug
                                     exp_path: /home/nghibui/codes/MUSE/dumped/debug/nucxafkaic
                                     export: txt
                                     lr_decay: 0.98
                                     lr_shrink: 0.5
                                     map_beta: 0.001
                                     map_id_init: True
                                     map_optimizer: sgd,lr=0.1
                                     max_vocab: 200000
                                     min_lr: 1e-06
                                     n_epochs: 1
                                     n_refinement: 5
                                     normalize_embeddings: 
                                     seed: -1
                                     src_emb: data/cpp_vectors_30D_15_ae_train_trees.txt
                                     src_lang: cpp
                                     tgt_emb: data/java_vectors_30D_15_ae_train_trees.txt
                                     tgt_lang: java
                                     verbose: 2
bdqnghi commented 6 years ago

OK, I think this line is the reason: https://github.com/facebookresearch/MUSE/blob/master/src/evaluation/evaluator.py#L163

Is this a bug, or am I doing something wrong?
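
For reference, the logic at that line looks roughly like this (a simplified sketch, not the exact code; capping by the vocabulary size is just one possible fix, not an official one):

from copy import deepcopy

def make_eval_params(params, src_vocab_size):
    # Sketch of what happens around evaluator.py#L163: dist_mean_cosine
    # deep-copies the params and overwrites dico_max_rank with a hard-coded
    # 10000, which is why the command-line flag appears to be ignored.
    _params = deepcopy(params)
    _params.dico_max_rank = 10000
    # One possible guard (my assumption, not an official fix): cap the rank
    # by the actual source vocabulary size.
    _params.dico_max_rank = min(_params.dico_max_rank, src_vocab_size)
    return _params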

bdqnghi commented 6 years ago

Also, after finishing the discriminator training, I get an error in the refinement step:

INFO - 04/15/18 20:57:41 - 0:00:03 - Building the train dictionary ...
INFO - 04/15/18 20:57:41 - 0:00:03 - New train dictionary of 1500 pairs.
INFO - 04/15/18 20:57:41 - 0:00:03 - Mean cosine (nn method, S2T build, 10000 max size): 0.99295
INFO - 04/15/18 20:57:41 - 0:00:03 - Building the train dictionary ...
INFO - 04/15/18 20:57:41 - 0:00:03 - New train dictionary of 1500 pairs.
INFO - 04/15/18 20:57:41 - 0:00:03 - Mean cosine (csls_knn_10 method, S2T build, 10000 max size): 0.99293
INFO - 04/15/18 20:57:41 - 0:00:03 - Discriminator source / target predictions: 0.31108 / 0.28520
INFO - 04/15/18 20:57:41 - 0:00:03 - Discriminator source / target / global accuracy: 0.00376 / 1.00000 / 0.30379
INFO - 04/15/18 20:57:41 - 0:00:03 - __log__:{"n_epoch": 0, "precision_at_1-nn": 0.0, "precision_at_5-nn": 0.0, "precision_at_10-nn": 0.0, "precision_at_1-csls_knn_10": 0.0, "precision_at_5-csls_knn_10": 0.0, "precision_at_10-csls_knn_10": 0.0, "mean_cosine-nn-S2T-10000": 0.9929460287094116, "mean_cosine-csls_knn_10-S2T-10000": 0.9929250478744507, "dis_accu": 0.30378963650425367, "dis_src_pred": 0.3110812323752522, "dis_tgt_pred": 0.2851960619994782}
INFO - 04/15/18 20:57:41 - 0:00:03 - * Best value for "mean_cosine-csls_knn_10-S2T-10000": 0.99293
INFO - 04/15/18 20:57:41 - 0:00:03 - * Saving the mapping to /home/nghibui/codes/MUSE/dumped/debug/6c30aoum1t/best_mapping.pth ...
INFO - 04/15/18 20:57:41 - 0:00:03 - End of epoch 0.

INFO - 04/15/18 20:57:41 - 0:00:03 - Decreasing learning rate: 0.10000000 -> 0.09800000
INFO - 04/15/18 20:57:41 - 0:00:03 - ----> ITERATIVE PROCRUSTES REFINEMENT <----

INFO - 04/15/18 20:57:41 - 0:00:03 - * Reloading the best model from /home/nghibui/codes/MUSE/dumped/debug/6c30aoum1t/best_mapping.pth ...
INFO - 04/15/18 20:57:41 - 0:00:03 - Starting refinement iteration 0...
INFO - 04/15/18 20:57:41 - 0:00:03 - Building the train dictionary ...
WARNING - 04/15/18 20:57:41 - 0:00:03 - Empty intersection ...
Traceback (most recent call last):
  File "unsupervised.py", line 168, in <module>
    trainer.procrustes()
  File "/home/nghibui/codes/MUSE/src/trainer.py", line 174, in procrustes
    A = self.src_emb.weight.data[self.dico[:, 0]]
TypeError: 'NoneType' object is not subscriptable

Any explanation for this? Thanks

glample commented 6 years ago

Yes, you are right: the hard-coded 10000 is an issue we need to fix for when the vocabulary size is smaller, sorry about that.

Regarding your error: dico_build is set to S2T&T2S, which means the model generates a source -> target dictionary and a target -> source dictionary, then takes the intersection of the two to get reliable translation pairs. In your case, the WARNING - Empty intersection means that the intersection was empty, i.e. the alignment failed completely. You can pass --dico_build S2T to avoid the error, but usually an empty intersection means the alignment is bad.
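
In pseudocode, the S2T&T2S build does something like this (a toy sketch with made-up pairs, not MUSE's exact code):

# Toy sketch of the two-direction dictionary build and intersection.
s2t = {"foo": "bar", "baz": "qux"}   # best source -> target candidates
t2s = {"bar": "foo", "quux": "baz"}  # best target -> source candidates

# S2T&T2S keeps (s, t) only if the pair appears in both directions.
final = {s: t for s, t in s2t.items() if t2s.get(t) == s}
print(final)  # {'foo': 'bar'}; with a failed alignment this ends up empty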

Also, did you really run the discriminator training? Your error happens after 3 seconds, which is not enough time for the discriminator to train.

bdqnghi commented 6 years ago

Yes, I reduced the epoch size to just 5000 for quick testing instead of waiting so long for the final result, and the error appeared :). That's why it only took 3 seconds.

And thanks for the answer, it works for me.