tensorflow / nmt

TensorFlow Neural Machine Translation Tutorial
Apache License 2.0
6.36k stars 1.96k forks

OOM error: Reduced batch size to 32/64 yet getting batch_size=128 in hparams #386

Closed ghost closed 5 years ago

ghost commented 5 years ago

I have tried both reducing the default batch_size to 64 (line 201 of nmt.py) and passing the flag batch_size=64. In both cases I still get batch_size=128 in my hparams file. What am I doing wrong?

ranjita-naik commented 5 years ago

Have you tried changing it through the --batch_size command-line argument? It worked for me; I can see "batch_size": 64 in hparams.

python -m nmt.nmt \
    --src=vi --tgt=en \
    --vocab_prefix=/tmp/nmt_data/vocab \
    --train_prefix=/tmp/nmt_data/train \
    --dev_prefix=/tmp/nmt_data/tst2012 \
    --test_prefix=/tmp/nmt_data/tst2013 \
    --out_dir=/tmp/nmt_model \
    --num_train_steps=12000 \
    --steps_per_stats=100 \
    --num_layers=2 \
    --num_units=128 \
    --dropout=0.2 \
    --metrics=bleu \
    --batch_size=64

ghost commented 5 years ago

Oh yes, the issue is resolved. You need to either specify a new out_dir or erase the previous model contents of the out_dir before starting a fresh training run; otherwise the hparams saved alongside the old model take precedence over the command-line flags.
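The behavior described above (a saved hparams file in out_dir winning over freshly passed flags on a resumed run) can be illustrated with a minimal sketch. The function name `resolve_hparams` and the plain-JSON file format are assumptions for illustration, not the actual nmt implementation:

```python
import json
import os
import tempfile

def resolve_hparams(out_dir, flag_hparams):
    """Sketch of the precedence at issue: if an hparams file already
    exists in out_dir (a previous run), it is loaded and the new
    command-line values are ignored; only a fresh out_dir persists
    the flags you passed. (Hypothetical helper, not nmt's own code.)"""
    path = os.path.join(out_dir, "hparams")
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)       # resumed run: saved values win
    with open(path, "w") as f:
        json.dump(flag_hparams, f)    # fresh run: flags are saved
    return flag_hparams

out_dir = tempfile.mkdtemp()
first = resolve_hparams(out_dir, {"batch_size": 128})  # fresh run
second = resolve_hparams(out_dir, {"batch_size": 64})  # same out_dir
print(first["batch_size"], second["batch_size"])  # 128 128
```

This is why passing --batch_size=64 appears to have no effect when reusing an out_dir that already contains a model trained with batch_size=128: the stale hparams file is reloaded. Pointing --out_dir at an empty directory (or deleting the old contents) makes the new flag stick.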