Closed 520jefferson closed 6 years ago
The master branch of Nematus doesn't currently have multi-GPU support, but there is some experimental code that may be merged in soon.
@rsennrich When I train with Nematus, if I kill the training process (using the PID shown for the GPU) and then restart it (with reload set to True), will the learning rate be lower? I have a feeling it might be.
If you don't change the configuration, Nematus will continue training with the same learning rate. Even if you use an optimizer with adaptive learning rates, such as adam, Nematus stores the information that is necessary to continue with the same learning rates (in model.npz.gradinfo.npz).
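To illustrate why restarting does not reset the effective learning rates: an adaptive optimizer like Adam keeps per-parameter moment estimates and a step counter, and if those are checkpointed alongside the model, resuming reproduces exactly the update that uninterrupted training would have made. A minimal NumPy sketch of this idea (not actual Nematus code; class and file names are illustrative):

```python
import numpy as np

# Sketch (not Nematus internals): an Adam-style update whose moment
# estimates and step counter are saved next to the model, so a resumed
# run continues with the same effective learning rates.
class Adam:
    def __init__(self, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
        self.lr, self.b1, self.b2, self.eps = lr, b1, b2, eps
        self.m = None  # first-moment estimate
        self.v = None  # second-moment estimate
        self.t = 0     # step counter (needed for bias correction)

    def step(self, w, grad):
        if self.m is None:
            self.m = np.zeros_like(w)
            self.v = np.zeros_like(w)
        self.t += 1
        self.m = self.b1 * self.m + (1 - self.b1) * grad
        self.v = self.b2 * self.v + (1 - self.b2) * grad ** 2
        m_hat = self.m / (1 - self.b1 ** self.t)
        v_hat = self.v / (1 - self.b2 ** self.t)
        return w - self.lr * m_hat / (np.sqrt(v_hat) + self.eps)

    def save(self, path):
        # Analogous in spirit to model.npz.gradinfo.npz.
        np.savez(path, m=self.m, v=self.v, t=self.t)

    def load(self, path):
        d = np.load(path)
        self.m, self.v, self.t = d["m"], d["v"], int(d["t"])
```

A resumed optimizer that loads this state produces the same next update as one that was never interrupted, which is why the learning-rate schedule is unaffected by the restart.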
@rsennrich I have two questions. First, when decoding I use --device-list gpu0 gpu1 gpu2 gpu3 gpu4, but it actually only uses gpu0. Second, decoding with translate.py takes a lot of time.
My decode settings are as follows:

```
export THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32
python translate.py \
    -k 11 \
    -p 1 \
    -n 1 \
    --models $MODEL/model.iter468000.npz \
    -i /home/mt-srcb/work/nmt/ai_raml/test/valid.en \
    -o /home/mt-srcb/work/nmt/ai_raml/test/ai.valid.out \
    --device-list gpu0 gpu1 gpu2 gpu3 gpu4
```
My training settings are as follows:

```
--layer_normalisation \
--tie_decoder_embeddings \
--enc_depth 4 \
--dec_depth 4 \
--dec_deep_context \
--enc_recurrence_transition_depth 2 \
--dec_base_recurrence_transition_depth 4 \
--dec_high_recurrence_transition_depth 2 \
```
Hi,
Use a number higher than 1 for the cmd option -p. Each of those processes will then bind to a different GPU device if 1) you used --device-list to specify several devices and 2) several are available on your system.
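For example, a decode call that actually spreads work over all five devices might look like this (a sketch based on the settings posted above; the paths and model name are taken from the earlier comment, and the only change is -p 5):

```
export THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32

# -p 5 launches five translation processes; with --device-list naming
# five devices, each process binds to a different GPU.
python translate.py \
    -k 11 \
    -p 5 \
    -n 1 \
    --models $MODEL/model.iter468000.npz \
    -i /home/mt-srcb/work/nmt/ai_raml/test/valid.en \
    -o /home/mt-srcb/work/nmt/ai_raml/test/ai.valid.out \
    --device-list gpu0 gpu1 gpu2 gpu3 gpu4
```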
Side note: you could install and use the new gpuarray backend for Theano, then use device=cuda in your flags. See https://github.com/Theano/Theano/wiki/Converting-to-the-new-gpu-back-end%28gpuarray%29 and http://deeplearning.net/software/libgpuarray/installation.html.
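With the gpuarray backend the device naming changes; a possible setup (an assumption on my part, assuming five visible GPUs and that --device-list accepts the new names) would be:

```
# New gpuarray backend: devices are named cudaN instead of gpuN.
export THEANO_FLAGS=mode=FAST_RUN,device=cuda,floatX=float32
```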
Finally, for fast decoding, people have used Marian which is compatible with some Nematus models. Perhaps worthwhile for you to try, if performance is a priority: https://marian-nmt.github.io/.
@bricksdont
Marian: do you mean the s2s model type could decode deep Nematus models?
I did not try this myself, but deep models are at least listed as a feature of s2s: https://marian-nmt.github.io/features/. You seem to have some experience with Marian development yourself, though; did you try this already?
Thanks, I will try it. I have already validated Marian (type --amun) on my corpus (nearly the same quality as Nematus, even a bit higher in the non-deep setting), so next I will try the s2s type.
I train the model on K40s with gpu0-4, but the rate is only about 40 sents/s per card. That rate is very low compared to training on a single card.
My settings: `batch_size=128`, `maxlen=50`.