Closed: bdqnghi closed this issue 6 years ago.

I was trying to run the unsupervised mapping task and got this error:

No idea why this happened, can anyone explain? I'm doing the mapping task on a pair of languages that is not in the available list, so I don't have a dictionary for evaluation. Also, the vocabulary of each language is quite small, around 4000 words each. I guess that if I can get rid of the evaluation tasks in evaluator.py, the code will work.
Can you have a look at https://github.com/facebookresearch/MUSE/issues/31 and see if that helps? In particular, setting --dico_max_rank 1500
as a parameter. The idea is that the model tries to build a dictionary for the top 10k most frequent words, but if you have only 4k of them then it will raise an error.
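For example, something along these lines (just a sketch; fill in your own languages and embedding paths, and keep whatever other flags you are already using):

python unsupervised.py --src_lang <src> --tgt_lang <tgt> --src_emb <src_embeddings.txt> --tgt_emb <tgt_embeddings.txt> --dico_max_rank 1500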
Also, if you do not have any parallel dictionary, you should indeed disable the word translation evaluation by simply commenting out this line: https://github.com/facebookresearch/MUSE/blob/master/src/evaluation/evaluator.py#L190
But you probably want to find at least a small dictionary before you start experiments; otherwise it will be very difficult for you to know whether your model is working or not.
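For reference, the change is just commenting out that one call; the surrounding method looks roughly like this (paraphrased sketch, not the exact file):

# src/evaluation/evaluator.py -- rough sketch of all_eval, not the exact code
def all_eval(self, to_log):
    """Run all evaluations."""
    self.monolingual_wordsim(to_log)
    # self.word_translation(to_log)  # comment this out when you have no evaluation dictionary
    self.dist_mean_cosine(to_log)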
Thanks for the reply, it works, but I think there is a minor bug in the code. When I print the whole "params" variable, I get this:
Namespace(adversarial=True, batch_size=32, cuda=True, dico_build='S2T', dico_max_rank=10000, dico_max_size=10000, dico_method='nn', dico_min_size=0, dico_threshold=0, dis_clip_weights=0, dis_dropout=0.0, dis_hid_dim=2048, dis_input_dropout=0.1, dis_lambda=1, dis_layers=2, dis_most_frequent=50, dis_optimizer='sgd,lr=0.1', dis_smooth=0.1, dis_steps=5, emb_dim=30, epoch_size=4000, exp_name='debug', exp_path='/home/nghibui/codes/MUSE/dumped/debug/egzngeby1i', export='txt', lr_decay=0.98, lr_shrink=0.5, map_beta=0.001, map_id_init=True, map_optimizer='sgd,lr=0.1', max_vocab=200000, min_lr=1e-06, n_epochs=5, n_refinement=5, normalize_embeddings='', seed=-1, src_dico=<src.dictionary.Dictionary object at 0x7f8087dac588>, src_emb='data/cpp_vectors_30D_15_ae_train_trees.txt', src_lang='cpp', src_mean=None, tgt_dico=<src.dictionary.Dictionary object at 0x7f8087dac4e0>, tgt_emb='data/java_vectors_30D_15_ae_train_trees.txt', tgt_lang='java', tgt_mean=None, verbose=2)
10000
0 10000 128
128 10000 128
256 10000 128
384 10000 128
512 10000 128
I definitely passed --dico_max_rank 1500 on the command line, but it is still using 10000; I had to hard-code that line to dico_max_rank = 1500 to make it work.
It's also strange that the initial logger already shows dico_max_rank: 1500, yet the number 10000 is still used:
INFO - 04/15/18 20:44:41 - 0:00:00 - adversarial: True
batch_size: 32
cuda: True
dico_build: S2T&T2S
dico_max_rank: 1500
dico_max_size: 100
dico_method: csls_knn_10
dico_min_size: 0
dico_threshold: 0
dis_clip_weights: 0
dis_dropout: 0.0
dis_hid_dim: 2048
dis_input_dropout: 0.1
dis_lambda: 1
dis_layers: 2
dis_most_frequent: 50
dis_optimizer: sgd,lr=0.1
dis_smooth: 0.1
dis_steps: 5
emb_dim: 30
epoch_size: 1000000
exp_name: debug
exp_path: /home/nghibui/codes/MUSE/dumped/debug/nucxafkaic
export: txt
lr_decay: 0.98
lr_shrink: 0.5
map_beta: 0.001
map_id_init: True
map_optimizer: sgd,lr=0.1
max_vocab: 200000
min_lr: 1e-06
n_epochs: 1
n_refinement: 5
normalize_embeddings:
seed: -1
src_emb: data/cpp_vectors_30D_15_ae_train_trees.txt
src_lang: cpp
tgt_emb: data/java_vectors_30D_15_ae_train_trees.txt
tgt_lang: java
verbose: 2
OK, I think this line is the reason: https://github.com/facebookresearch/MUSE/blob/master/src/evaluation/evaluator.py#L163
Is this a bug, or am I doing something wrong?
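For reference, what I think is happening around that line: the mean-cosine evaluation builds a temporary copy of the params and overrides the rank with a constant. Roughly (my paraphrase, not the exact code):

# inside Evaluator.dist_mean_cosine (paraphrased sketch)
from copy import deepcopy

_params = deepcopy(self.params)       # temporary copy of the run parameters
_params.dico_method = dico_method     # 'nn' or 'csls_knn_10'
_params.dico_build = dico_build       # 'S2T'
_params.dico_max_rank = 10000         # hard-coded, so the --dico_max_rank flag is ignored here
_params.dico_max_size = 10000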
Also, after finishing the discriminator training, I get an error in the refinement step:
INFO - 04/15/18 20:57:41 - 0:00:03 - Building the train dictionary ...
INFO - 04/15/18 20:57:41 - 0:00:03 - New train dictionary of 1500 pairs.
INFO - 04/15/18 20:57:41 - 0:00:03 - Mean cosine (nn method, S2T build, 10000 max size): 0.99295
INFO - 04/15/18 20:57:41 - 0:00:03 - Building the train dictionary ...
INFO - 04/15/18 20:57:41 - 0:00:03 - New train dictionary of 1500 pairs.
INFO - 04/15/18 20:57:41 - 0:00:03 - Mean cosine (csls_knn_10 method, S2T build, 10000 max size): 0.99293
INFO - 04/15/18 20:57:41 - 0:00:03 - Discriminator source / target predictions: 0.31108 / 0.28520
INFO - 04/15/18 20:57:41 - 0:00:03 - Discriminator source / target / global accuracy: 0.00376 / 1.00000 / 0.30379
INFO - 04/15/18 20:57:41 - 0:00:03 - __log__:{"n_epoch": 0, "precision_at_1-nn": 0.0, "precision_at_5-nn": 0.0, "precision_at_10-nn": 0.0, "precision_at_1-csls_knn_10": 0.0, "precision_at_5-csls_knn_10": 0.0, "precision_at_10-csls_knn_10": 0.0, "mean_cosine-nn-S2T-10000": 0.9929460287094116, "mean_cosine-csls_knn_10-S2T-10000": 0.9929250478744507, "dis_accu": 0.30378963650425367, "dis_src_pred": 0.3110812323752522, "dis_tgt_pred": 0.2851960619994782}
INFO - 04/15/18 20:57:41 - 0:00:03 - * Best value for "mean_cosine-csls_knn_10-S2T-10000": 0.99293
INFO - 04/15/18 20:57:41 - 0:00:03 - * Saving the mapping to /home/nghibui/codes/MUSE/dumped/debug/6c30aoum1t/best_mapping.pth ...
INFO - 04/15/18 20:57:41 - 0:00:03 - End of epoch 0.
INFO - 04/15/18 20:57:41 - 0:00:03 - Decreasing learning rate: 0.10000000 -> 0.09800000
INFO - 04/15/18 20:57:41 - 0:00:03 - ----> ITERATIVE PROCRUSTES REFINEMENT <----
INFO - 04/15/18 20:57:41 - 0:00:03 - * Reloading the best model from /home/nghibui/codes/MUSE/dumped/debug/6c30aoum1t/best_mapping.pth ...
INFO - 04/15/18 20:57:41 - 0:00:03 - Starting refinement iteration 0...
INFO - 04/15/18 20:57:41 - 0:00:03 - Building the train dictionary ...
WARNING - 04/15/18 20:57:41 - 0:00:03 - Empty intersection ...
Traceback (most recent call last):
  File "unsupervised.py", line 168, in <module>
    trainer.procrustes()
  File "/home/nghibui/codes/MUSE/src/trainer.py", line 174, in procrustes
    A = self.src_emb.weight.data[self.dico[:, 0]]
TypeError: 'NoneType' object is not subscriptable
Any explanation for this? Thanks
Yes, you are right: the hard-coded 10000 is an issue we need to fix for the case where the vocabulary is smaller, sorry about that.
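In the meantime, a possible local workaround (just a sketch, assuming the evaluator keeps the source/target dictionaries around as self.src_dico / self.tgt_dico and that they support len()) is to clamp that constant to the vocabulary size:

# sketch of a workaround in dist_mean_cosine: never request more candidate
# words than the vocabularies actually contain
max_rank = min(10000, len(self.src_dico), len(self.tgt_dico))
_params.dico_max_rank = max_rank
_params.dico_max_size = max_rank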
Regarding your error: dico_build is set to S2T&T2S, which means the model generates a source -> target dictionary and a target -> source dictionary, and takes the intersection of both to get reliable translation pairs. In your case, the "Empty intersection" WARNING means that the intersection was empty, i.e. the alignment completely failed. You can pass --dico_build S2T to avoid the error, but usually an empty intersection means the alignment is bad.
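Conceptually, the S2T&T2S build does something like this (simplified sketch with made-up variable names, not the actual implementation):

import torch

# s2t_dico / t2s_dico: LongTensors of (query_id, translation_id) pairs from each direction
s2t_pairs = set(map(tuple, s2t_dico.tolist()))          # (src_id, tgt_id)
t2s_pairs = set((s, t) for t, s in t2s_dico.tolist())   # flip (tgt_id, src_id) -> (src_id, tgt_id)

final_pairs = s2t_pairs & t2s_pairs                     # keep only mutual ("reliable") translations
if len(final_pairs) == 0:
    # the "Empty intersection" case: the train dictionary ends up being None,
    # which is why the later Procrustes step crashes with 'NoneType' object is not subscriptable
    dico = None
else:
    dico = torch.LongTensor(sorted(final_pairs))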
Also, did you really run the discriminator training? Your error appears after only 3 seconds, which is not enough time for the discriminator training to do anything.
Yes, I reduced the epoch size to just 5000 for a quick test instead of waiting a long time only to hit an error at the end :), that's why it only took 3 seconds.
And thanks for the answer, it works for me now.