tensorflow / nmt

TensorFlow Neural Machine Translation Tutorial
Apache License 2.0

how to train using gpu #467

Open longsamaa opened 4 years ago

longsamaa commented 4 years ago

Hi, I ran the sample code from the site, i.e., the first NMT model, translating from Vietnamese to English. https://github.com/tensorflow/nmt

I use the command:

    python -m nmt.nmt \
        --src=vi --tgt=en \
        --vocab_prefix=/tmp/nmt_data/vocab \
        --train_prefix=/tmp/nmt_data/train \
        --dev_prefix=/tmp/nmt_data/tst2012 \
        --test_prefix=/tmp/nmt_data/tst2013 \
        --out_dir=/tmp/nmt_model \
        --num_train_steps=12000 \
        --steps_per_stats=100 \
        --num_layers=2 \
        --num_units=128 \
        --dropout=0.2 \
        --metrics=bleu

Here is the result:

    # Init train iterator, skipping 0 elements
    step 100 lr 1 step-time 1.30s wps 4.27K ppl 1619.59 gN 13.92 bleu 0.00, Tue Apr 7 16:21:33 2020
    step 200 lr 1 step-time 1.35s wps 4.20K ppl 598.36 gN 7.14 bleu 0.00, Tue Apr 7 16:23:48 2020
    step 300 lr 1 step-time 1.28s wps 4.38K ppl 365.48 gN 5.09 bleu 0.00, Tue Apr 7 16:25:56 2020
    step 400 lr 1 step-time 1.24s wps 4.51K ppl 274.08 gN 4.18 bleu 0.00, Tue Apr 7 16:28:01 2020
    step 500 lr 1 step-time 1.25s wps 4.52K ppl 225.03 gN 3.72 bleu 0.00, Tue Apr 7 16:30:06 2020
    step 600 lr 1 step-time 1.13s wps 4.96K ppl 197.01 gN 3.29 bleu 0.00, Tue Apr 7 16:31:59 2020
    step 700 lr 1 step-time 1.10s wps 5.10K ppl 181.90 gN 3.23 bleu 0.00, Tue Apr 7 16:33:49 2020
    step 800 lr 1 step-time 1.08s wps 5.30K ppl 171.58 gN 3.20 bleu 0.00, Tue Apr 7 16:35:37 2020
    step 900 lr 1 step-time 1.01s wps 5.53K ppl 158.98 gN 3.22 bleu 0.00, Tue Apr 7 16:37:17 2020
    step 1000 lr 1 step-time 0.90s wps 6.10K ppl 146.85 gN 3.22 bleu 0.00, Tue Apr 7 16:38:48 2020
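To compare step times before and after a change, the stats lines in a log like this can be parsed with a short script. The following is a minimal sketch, not part of the nmt codebase: the names `STATS_RE`, `parse_stats`, and `SAMPLE` are hypothetical, the pattern is inferred only from the lines shown in this post, and reading the `K` suffix on `wps` as thousands of words per second is an assumption.

```python
import re

# Pattern inferred from the stats lines in this post (an assumption,
# not an official log format specification).
STATS_RE = re.compile(
    r"step (?P<step>\d+) lr (?P<lr>\S+) step-time (?P<step_time>[\d.]+)s "
    r"wps (?P<wps>[\d.]+)K ppl (?P<ppl>[\d.]+)"
)


def parse_stats(log_text):
    """Return one dict per stats line found in log_text."""
    rows = []
    for match in STATS_RE.finditer(log_text):
        rows.append({
            "step": int(match.group("step")),
            "step_time": float(match.group("step_time")),  # seconds per step
            # Assumes the "K" suffix means thousands of words per second.
            "wps": float(match.group("wps")) * 1000.0,
            "ppl": float(match.group("ppl")),
        })
    return rows


# First and last stats lines copied from the log in this post.
SAMPLE = (
    "step 100 lr 1 step-time 1.30s wps 4.27K ppl 1619.59 gN 13.92 bleu 0.00, "
    "Tue Apr 7 16:21:33 2020 "
    "step 1000 lr 1 step-time 0.90s wps 6.10K ppl 146.85 gN 3.22 bleu 0.00, "
    "Tue Apr 7 16:38:48 2020"
)

rows = parse_stats(SAMPLE)
mean_step_time = sum(row["step_time"] for row in rows) / len(rows)
print("parsed %d stats lines, mean step-time %.2fs" % (len(rows), mean_step_time))
```

Running the same script on a log from a later run would show whether the step time actually dropped.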

During training, GPU utilization stayed at 0% while CPU usage went up to 80%, so training is very slow. How can I fix this? My GPU is a 2060. Thank you!

nashid commented 4 years ago

@longsamaa are you able to triage this issue?

luozhouyang commented 4 years ago

Add the argument --num_gpus=1. See https://github.com/tensorflow/nmt/blob/0be864257a76c151eef20ea689755f08bc1faf4e/nmt/nmt.py#L231
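A command-line flag can only help if TensorFlow itself can see the GPU, so it may be worth checking that first. The sketch below is an assumption about how one might do that, not something confirmed in this thread: `list_gpu_devices` is a hypothetical helper, and it relies on `device_lib.list_local_devices()` from `tensorflow.python.client`. If it reports no GPU, possible causes include a CPU-only TensorFlow build or missing CUDA libraries.

```python
def list_gpu_devices():
    """Return the names of GPU devices TensorFlow can see.

    Returns None if TensorFlow cannot be imported in this environment.
    """
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    # Keep only devices that TensorFlow reports as GPUs.
    return [
        device.name
        for device in device_lib.list_local_devices()
        if device.device_type == "GPU"
    ]


gpus = list_gpu_devices()
if gpus is None:
    print("TensorFlow is not installed in this environment.")
elif not gpus:
    print("TensorFlow sees no GPU, so training would run on the CPU.")
else:
    print("TensorFlow sees these GPUs:", gpus)
```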