kr-sundaram opened this issue 4 years ago:

> Thanks for making the repo public. I am new to machine translation and your repo seems promising to me. Could you please explain how to train and evaluate the model on my own datasets?
There are some parameters to set before training, as the files `multi_gpu_train.sh` and `train.sh` show. If you want to train this model on a single machine with multiple GPUs, you can run the following command:

```bash
bash multi_gpu_train.sh
```

If you want to train on a single machine with a single GPU, please make sure your current working directory is `$your_path/NMT/` and run:

```bash
bash trainer/train.sh
```
Here is an example command for multi-GPU training, followed by a description of each parameter. (Note that the parameter descriptions are listed separately rather than interleaved as inline comments, since a `#` comment between `\` line continuations would break the command.)

```bash
python -m torch.distributed.launch --nproc_per_node=3 multi_gpu_train.py \
    --device_id 1 2 3 \
    --src_language combine \
    --tgt_language en \
    --src_path /data/rrjin/corpus_data/lang_vec_data/bible-corpus/train_data/train_src_combine_bpe_32000.txt \
    --tgt_path /data/rrjin/corpus_data/lang_vec_data/bible-corpus/train_data/train_tgt_en_bpe_32000.txt \
    --src_vocab_path /data/rrjin/NMT/data/src_combine_32000.vocab \
    --tgt_vocab_path /data/rrjin/NMT/data/tgt_en_32000.vocab \
    --rnn_type lstm \
    --embedding_size 512 \
    --hidden_size 512 \
    --num_layers 3 \
    --checkpoint /data/rrjin/NMT/data/models/basic_multi_gpu_lstm \
    --batch_size 32 \
    --dropout 0.2 \
    --rebuild_vocab \
    --normalize
```

- `--device_id`: ids of the GPUs used for training
- `--src_language`: name of the source language
- `--tgt_language`: name of the target language
- `--src_path`: location of the source-language corpus
- `--tgt_path`: location of the target-language corpus
- `--src_vocab_path`: where the source-language vocabulary is stored; the vocabulary is generated automatically from the corpus
- `--tgt_vocab_path`: where the target-language vocabulary is stored; the vocabulary is generated automatically from the corpus
- `--rnn_type`: the kind of RNN used in the encoder and decoder; one of `rnn`, `gru`, or `lstm`
- `--embedding_size`: size of the word embeddings
- `--hidden_size`: hidden size of the RNN in the encoder
- `--num_layers`: number of RNN layers in the encoder and decoder; for example, if `num_layers` is 3, the encoder and the decoder each have 3 recurrent layers
- `--checkpoint`: prefix of the path where the trained model is saved
- `--batch_size`: number of sentences processed per training step
- `--dropout`: probability that an element in the RNN is zeroed
- `--rebuild_vocab`: build the vocabulary from the corpus (a vocabulary-building sketch follows below)
- `--normalize`: preprocess the sentences; please refer to the `normalizeString` function in `NMT/utils/process.py` for details (a sketch of typical normalization follows below)
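For intuition about what `--normalize` typically does, here is a minimal sketch of common sentence normalization (lowercasing, stripping accents, isolating punctuation). This is an assumption about the general technique, not the repo's actual `normalizeString`; check `NMT/utils/process.py` for the real behavior:

```python
import re
import unicodedata

def normalize_string(s: str) -> str:
    """Sketch of typical NMT preprocessing; details may differ from the repo."""
    # Strip accents: decompose characters, then drop combining marks.
    s = "".join(c for c in unicodedata.normalize("NFD", s)
                if unicodedata.category(c) != "Mn")
    s = s.lower().strip()
    s = re.sub(r"([.!?])", r" \1", s)   # separate sentence-final punctuation
    s = re.sub(r"[^a-z.!?]+", " ", s)   # collapse everything else to spaces
    return s.strip()

print(normalize_string("¿Cómo estás?"))  # -> "como estas ?"
```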
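Similarly, regarding `--rebuild_vocab` and the `*_vocab_path` arguments: building a vocabulary from a tokenized, one-sentence-per-line corpus generally amounts to counting tokens and assigning indices, with a few special symbols reserved. A minimal sketch (the special-token names and the 32,000-entry cap here are assumptions, not taken from this repo's code):

```python
from collections import Counter

# Hypothetical special tokens; the repo may use different names or indices.
SPECIALS = ["<pad>", "<sos>", "<eos>", "<unk>"]

def build_vocab(corpus_path: str, max_size: int = 32000) -> dict:
    """Map the most frequent tokens of a one-sentence-per-line corpus to indices."""
    counter = Counter()
    with open(corpus_path, encoding="utf-8") as f:
        for line in f:
            counter.update(line.split())
    vocab = {tok: i for i, tok in enumerate(SPECIALS)}
    for tok, _ in counter.most_common(max_size - len(SPECIALS)):
        vocab[tok] = len(vocab)
    return vocab
```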
After training, you can run `eval.py` or `quick_eval.py` for translation (or just run `eval.sh` or `quick_eval.sh` for simplicity). The difference between them is that `eval.py` uses beam search for decoding, while `quick_eval.py` uses greedy decoding.
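For intuition about that difference: greedy decoding commits to the single most probable token at each step, while beam search keeps the `beam_size` best partial hypotheses and extends each of them. A minimal sketch of both, written over a hypothetical `step(prefix) -> log_probs` decoder function (this is illustrative only, not the repo's `eval.py`):

```python
import torch
import torch.nn.functional as F

def greedy_decode(step, sos_id, eos_id, max_len=50):
    tokens = [sos_id]
    for _ in range(max_len):
        next_id = step(tokens).argmax().item()  # take the single best token
        tokens.append(next_id)
        if next_id == eos_id:
            break
    return tokens

def beam_decode(step, sos_id, eos_id, beam_size=5, max_len=50):
    beams = [([sos_id], 0.0)]  # (prefix, cumulative log-probability)
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            if prefix[-1] == eos_id:      # finished hypotheses carry over
                candidates.append((prefix, score))
                continue
            log_probs = step(prefix)
            top_lp, top_id = log_probs.topk(beam_size)
            for lp, tok in zip(top_lp.tolist(), top_id.tolist()):
                candidates.append((prefix + [tok], score + lp))
        # keep only the beam_size best partial translations
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
        if all(p[-1] == eos_id for p, _ in beams):
            break
    return beams[0][0]

# Toy usage with a random stand-in for one decoder step.
vocab_size, sos, eos = 10, 0, 1
def step(prefix):
    torch.manual_seed(len(prefix))
    return F.log_softmax(torch.randn(vocab_size), dim=0)

print(greedy_decode(step, sos, eos))
print(beam_decode(step, sos, eos))
```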
Thank you so much for your help! I will let you know if I need anything.