dalab / deep-ed

Source code for the EMNLP'17 paper "Deep Joint Entity Disambiguation with Local Neural Attention", https://arxiv.org/abs/1704.04920
Apache License 2.0

How to run step 14: training entity embeddings? #20

Closed. izuna385 closed this issue 5 years ago.

izuna385 commented 5 years ago

Thanks for uploading the code.

I completed steps 1 through 13, and after step 13 I confirmed that the files mentioned in the README.md were present.

But when I run step 14, that is,

CUDA_VISIBLE_DEVICES=0 th entities/learn_e2v/learn_a.lua -root_data_dir $DATA_PATH |& tee log_train_entity_vecs

only the usage help and an error message show up:


missing argument for option -root_data_dir
Usage: [options]

Learning entity vectors

Options:
  -type                  Type: double | float | cuda | cudacudnn [cudacudnn]
  -root_data_dir         Root path of the data, $DATA_PATH. []
  -optimization          Optimization method: RMSPROP | ADAGRAD | ADAM | SGD [ADAGRAD]
  -lr                    Learning rate [0.3]
  -batch_size            Mini-batch size (1 = pure stochastic) [500]
  -word_vecs             300d word vectors type: glove | w2v [w2v]
  -num_words_per_ent     Num positive words sampled for the given entity at each iteration. [20]
  -num_neg_words         Num negative words sampled for each positive word. [5]
  -unig_power            Negative sampling unigram power (0.75 used in Word2Vec). [0.6]
  -entities              Set of entities for which we train embeddings: 4EX (tiny, for debug) | RLTD (restricted set) | ALL (all Wiki entities, too big to fit on a single GPU) [RLTD]
  -init_vecs_title_words whether the entity embeddings should be initialized with the average of title word embeddings. Helps to speed up convergence speed of entity embeddings learning. [true]
  -loss                  Loss function: nce (noise contrastive estimation) | neg (negative sampling) | is (importance sampling) | maxm (max-margin) [maxm]
  -data                  Training data: wiki-canonical (only) | wiki-canonical-hyperlinks [wiki-canonical-hyperlinks]
  -num_passes_wiki_words Num passes (per entity) over Wiki canonical pages before changing to using Wiki hyperlinks. [200]
  -hyp_ctxt_len          Left and right context window length for hyperlinks. [10]

I tried to fix the "missing argument for option -root_data_dir" error, but I couldn't.

I would appreciate it if you could tell me how to fix it.

Thanks.

izuna385 commented 5 years ago

I fixed it by passing the dataset's absolute path explicitly:

CUDA_VISIBLE_DEVICES=0 th entities/learn_e2v/learn_a.lua -root_data_dir dataset_absolute_path |& tee log_train_entity_vecs

Sorry for the very basic question.
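For anyone else who hits this: the usage dump shows that -root_data_dir defaults to the empty string ([]), so if $DATA_PATH is unset in your shell, the command expands to "-root_data_dir" with no value and Torch's command-line parser reports a missing argument. Below is a minimal guarded launch; it is only a sketch (the set -eu guard and the directory check are my own additions, and only the th command itself comes from the README):

#!/usr/bin/env bash
# Guarded launch for step 14. If $DATA_PATH is unset or empty, the shell
# expands "-root_data_dir $DATA_PATH" to "-root_data_dir" with no value,
# and learn_a.lua aborts with "missing argument for option -root_data_dir".
set -eu

# Fail early with a readable message instead of the Torch usage dump.
: "${DATA_PATH:?export DATA_PATH to the absolute path of your data directory}"
[ -d "$DATA_PATH" ] || { echo "DATA_PATH is not a directory: $DATA_PATH" >&2; exit 1; }

CUDA_VISIBLE_DEVICES=0 th entities/learn_e2v/learn_a.lua \
  -root_data_dir "$DATA_PATH" |& tee log_train_entity_vecs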

Thanks.