skyw opened this issue 7 years ago
Thanks, I will need to update nmt/standard_hparams/wmt16_en_de_gnmt.json.
I am also adding instructions on how to train and load the GNMT model from scratch.
I ran a standard attention / scaled_luong / uni system and got the expected results. With the gnmt architecture / scaled_luong / encoder_type=gnmt, however, the results are completely off. Is there something special to do for the GNMT attention architecture?
@vince62s Did you check with the standard_hparams for GNMT? There are also pre-trained models available for download on the README page.
Same problem here. After training, when doing inference I get:
KeyError: 'num_encoder_residual_layers'
It only works when I delete all these keys from the hparams file and set --hparams_path to the directory of the best_bleu checkpoint, but then after one run, for some reason, it rewrites the hparams file and adds these problematic key/values again... It's not clear how this mechanism works.
My guess is that when the code saves the hparams, it simply writes key/values that it isn't supposed to.
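For anyone hitting the same thing, here is a minimal sketch of the cleanup I do by hand, assuming the hparams file that training writes sits directly in out_dir; the second key name is an assumption, so drop whichever keys your KeyError actually names:

```python
# Workaround sketch: strip the keys that make inference crash from the
# hparams JSON that training saved in out_dir, then rerun inference.
import json
import os

out_dir = "/tmp/nmt_attention_model"  # same --out_dir used for training
hparams_path = os.path.join(out_dir, "hparams")  # assumed save location

with open(hparams_path) as f:
    hparams = json.load(f)

# The second key name is a guess; remove whichever keys raise KeyError.
for key in ("num_encoder_residual_layers", "num_decoder_residual_layers"):
    hparams.pop(key, None)  # no error if a key is absent

with open(hparams_path, "w") as f:
    json.dump(hparams, f, indent=2)
```

After that, inference runs fine until the next run rewrites the file and the problematic keys come back.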
@nadavb Can you share the command that produces the error? Were you using the standard_hparams file from the repo for inference?
There have been some updates to the hparams recently, so I think the standard_hparams may be out of date.
@oahziur I did not use the standard hparams. I used the params as shown in the tutorial.
So for training:
python nmt/nmt.py \
--attention=scaled_luong \
--src=vi --tgt=en \
--vocab_prefix=/tmp/nmt_data/vocab \
--train_prefix=/tmp/nmt_data/train \
--dev_prefix=/tmp/nmt_data/tst2012 \
--test_prefix=/tmp/nmt_data/tst2013 \
--out_dir=/tmp/nmt_attention_model \
--num_train_steps=5000 \
--steps_per_stats=20 \
--num_layers=2 \
--num_units=128 \
--dropout=0.2 \
--metrics=bleu
And for inference:
python nmt/nmt.py \
--out_dir=/tmp/nmt_attention_model \
--inference_input_file=/tmp/nmt_data/source_infer.vi \
--inference_output_file=/tmp/nmt_attention_model/output_infer
@nadavb Hello. I'm studying NMT and want to run a test. I just ran nmt.py, but it failed.
How should I invoke your script? Please let me know the basics.
@LimWoohyun Look at https://github.com/tensorflow/nmt and search for "Hands-on – building an attention-based NMT model"; the command is written there.
@oahziur I get a key error using the standard_hparams (tf 1.6rc1; I will try on my other machine with tf 1.5-cuda).
NotFoundError (see above for traceback): Key dynamic_seq2seq/encoder/rnn/basic_lstm_cell/bias not found in checkpoint
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_INT32, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]
using the command:
[bquast@UX370UA ~]$ cd nmt
[bquast@UX370UA nmt]$ python -m nmt.nmt \
> --src=de --tgt=en \
> --ckpt=deen_gnmt_model_4_layer/translate.ckpt \
> --hparams_path=nmt/standard_hparams/wmt16_gnmt_4_layer.json \
> --out_dir=/tmp/deen_gnmt \
> --vocab_prefix=/home/bquast/en_de_data/vocab.bpe.32000 \
> --inference_input_file=/home/bquast/en_de_data/newstest2014.tok.bpe.32000.de \
> --inference_output_file=/home/bquast/deen_gnmt_model_4_layer/output_infer \
full output here:
https://gist.github.com/bquast/30ba7630d2bf32b59dd8349889fc7638
EDIT: confirmed, same error on tf 1.5-cuda
https://gist.github.com/bquast/0ddbf8eda363d312dd57b51aebb11f5d
@bquast I recently got the error using the same configuration.
Key dynamic_seq2seq/encoder/rnn/basic_lstm_cell/bias not found in checkpoint [[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_INT32, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]
I tried this with tf 1.4 too, but no luck. Are there any updates on this?
Thank you.
Hey, no news yet. Any progress on your side?
@bquast I think this is related to https://github.com/tensorflow/nmt/issues/264, and there is a PR that fixes it: https://github.com/tensorflow/nmt/pull/265. Maybe you can try patching in the PR and see if you still get the issue. Make sure you clear the model directory.
@bquast @tiberiu92 @oahziur
I got the same error using the same configuration. (tf-1.8, python-2.7)
python -m nmt.nmt \
--src=de --tgt=en \
--ckpt=/home/xiaohao/nmt/models/deen_gnmt_model_4_layer/translate.ckpt \
--hparams_path=nmt/standard_hparams/wmt16_gnmt_4_layer.json \
--out_dir=/home/xiaohao/data/deen_gnmt \
--vocab_prefix=/home/xiaohao/data/wmt16/vocab.bpe.32000 \
--inference_input_file=/home/xiaohao/data/wmt16/newstest2015.tok.bpe.32000.de \
--inference_output_file=/home/xiaohao/data/deen_gnmt/output_infer \
--inference_ref_file=/home/xiaohao/data/wmt16/newstest2015.tok.bpe.32000.en
NotFoundError (see above for traceback): Key dynamic_seq2seq/encoder/rnn/basic_lstm_cell/bias not found in checkpoint
I printed the keys of deen_gnmt_model_4_layer/translate.ckpt, and .../rnn/basic_lstm_cell/bias is not among them:
xiaohao@ubuntu:~/nmt$ python ckpt_print.py models/deen_gnmt_model_4_layer/translate.ckpt
('CHECKPOINT_FILE: ', 'models/deen_gnmt_model_4_layer/translate.ckpt')
('tensor_name: ', 'embeddings/encoder/embedding_encoder')
('tensor_name: ', 'dynamic_seq2seq/decoder/memory_layer/kernel')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_3/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/encoder/bidirectional_rnn/fw/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_3/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/decoder/output_projection/kernel')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_0_attention/attention/bahdanau_attention/query_layer/kernel')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_0_attention/attention/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_0/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_1/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_0_attention/attention/bahdanau_attention/attention_v')
('tensor_name: ', 'dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_0/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_0_attention/attention/bahdanau_attention/attention_b')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_0_attention/attention/bahdanau_attention/attention_g')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_1/basic_lstm_cell/bias')
('tensor_name: ', 'Variable')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_0_attention/attention/basic_lstm_cell/bias')
('tensor_name: ', 'embeddings/decoder/embedding_decoder')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_2/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_1/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/encoder/bidirectional_rnn/bw/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/encoder/bidirectional_rnn/bw/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_2/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/encoder/bidirectional_rnn/fw/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_2/basic_lstm_cell/bias')
('tensor_name: ', 'dynamic_seq2seq/decoder/multi_rnn_cell/cell_2/basic_lstm_cell/kernel')
('tensor_name: ', 'dynamic_seq2seq/encoder/rnn/multi_rnn_cell/cell_1/basic_lstm_cell/kernel')
xiaohao@ubuntu:~/nmt$
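For completeness, ckpt_print.py itself isn't shown in the thread; a minimal sketch of what such a script could look like, using the TF 1.x checkpoint reader (under Python 2, print with two arguments produces exactly the tuple-style output above):

```python
# ckpt_print.py (sketch) -- list every variable stored in a TF 1.x checkpoint.
# Usage: python ckpt_print.py models/deen_gnmt_model_4_layer/translate.ckpt
import sys

import tensorflow as tf

ckpt_path = sys.argv[1]
print('CHECKPOINT_FILE: ', ckpt_path)

# NewCheckpointReader maps each variable name to its shape.
reader = tf.train.NewCheckpointReader(ckpt_path)
for tensor_name in reader.get_variable_to_shape_map():
    print('tensor_name: ', tensor_name)
```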
I tried the PR (#265) and ran rm -rf /home/xiaohao/data/deen_gnmt/*. The problem is solved!
Thanks @oahziur
It complains with a key error: "KeyError: num_residual_layers".
Here is my script
python -m nmt.nmt \
--src=en --tgt=de \
--vocab_prefix=${DATA_DIR}/vocab \
--train_prefix=${DATA_DIR}/train \
--dev_prefix=${DATA_DIR}/newstest2014 \
--test_prefix=${DATA_DIR}/newstest2015 \
--out_dir=${OUT_DIR}/test \
--hparams_path=nmt/standard_hparams/wmt16_en_de_gnmt.json