tensorflow / nmt

TensorFlow Neural Machine Translation Tutorial
Apache License 2.0

Re-train by using the checkpoint #203

Open Marilena263 opened 6 years ago

Marilena263 commented 6 years ago

Hi, I have a question. Right now, when we train the model, the weights get initialized to the same value (which is 0.1 by default). How can we train the model so that it initializes the weights using the values from a checkpoint (--ckpt=/path/to/checkpoint/translate.ckpt)? In other words, how do I fine-tune the model on a new dataset?

oahziur commented 6 years ago

@Marilena263

If you want to use the --ckpt for training, you need to make a small change here. You may want to use the load_model method for loading from --ckpt directly.

d2sys commented 6 years ago

Hi, if I stop my training and re-run from the latest checkpoint, the BLEU score for the dev/test sets increases. What is the reason for that? Should I update the random_seed parameter after each epoch?

nbro commented 6 years ago

As far as I understand, loading the parameters from the last checkpoint is now the default behavior.
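
A minimal sketch of that resume-or-initialize decision, assuming the checkpoints in the output directory follow TensorFlow's usual <prefix>-<step>.index naming (for example translate.ckpt-12000.index). The helper names find_latest_checkpoint and choose_start are hypothetical and this is not the repository's code; TensorFlow itself reads the "checkpoint" state file through tf.train.latest_checkpoint rather than scanning file names.

```python
import os
import re

# Index files look like "translate.ckpt-12000.index" (prefix, step, suffix).
_CKPT_INDEX = re.compile(r"^(.+)-(\d+)\.index$")


def find_latest_checkpoint(model_dir):
    """Return the checkpoint prefix with the highest step, or None if absent."""
    best_step, best_prefix = -1, None
    for name in os.listdir(model_dir):
        match = _CKPT_INDEX.match(name)
        if match is None:
            continue  # Not a checkpoint index file (e.g. hparams, logs).
        step = int(match.group(2))
        if step > best_step:
            best_step = step
            best_prefix = os.path.join(
                model_dir, match.group(1) + "-" + match.group(2))
    return best_prefix


def choose_start(model_dir):
    """Decide between restoring saved weights and starting from fresh ones."""
    latest = find_latest_checkpoint(model_dir)
    if latest is None:
        return ("fresh", None)
    return ("restore", latest)
```

If the directory holds no checkpoint, the sketch reports a fresh start; otherwise it returns the prefix that a restore call would be given.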

Sabyasachi18 commented 6 years ago

Hi, I have a question regarding model re-training. I want to run incremental training on my trained German-English engine using NMT with subword BPE encoding. Can I update my vocab file with new words from the incremental training data? If yes, kindly let me know the process.

Should I append the new words to the end of the existing vocabulary file when running incremental training? Or should I sort the vocab file after appending the new words to it?
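
A small sketch of the append-only option, under the assumption that a token's id is its line position in the vocab file, so re-sorting would shift ids away from the embedding rows stored in the checkpoint. The function name extend_vocab and the sample tokens are hypothetical. Note also that a larger vocabulary changes the shape of the embedding and output layers, so restoring the old checkpoint would need extra handling that is not shown here.

```python
def extend_vocab(existing_tokens, new_tokens):
    """Append unseen tokens to the end, keeping existing order (and ids)."""
    seen = set(existing_tokens)
    extended = list(existing_tokens)
    for token in new_tokens:
        if token not in seen:
            seen.add(token)
            extended.append(token)
    return extended


# Hypothetical sample data: the first list stands in for the old vocab file.
old_vocab = ["<unk>", "<s>", "</s>", "das", "Haus"]
new_vocab = extend_vocab(old_vocab, ["Haus", "Auto", "das", "Bahn@@"])

# Every old token keeps its position, so its id is unchanged.
assert new_vocab[:len(old_vocab)] == old_vocab
```

With BPE, new words are often already covered by existing subword units when the same merge codes are applied, in which case the vocab file may not need to change at all.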

yapingzhao commented 6 years ago

Hi, I have a question: to re-train with --ckpt directly, is the command to run python -m nmt.nmt --ckpt? Thank you very much!

ArashHosseini commented 6 years ago

@yapingzhao Use the --ckpt flag, like --ckpt=/path/to/last/saved/ckpt. I also had to reset num_train_steps in hparams. Please also note --num_keep_ckpts, which can be important when re-training.
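
As an illustration of how those flags fit together, here is a sketch that assembles the command line. The --ckpt, --num_train_steps and --num_keep_ckpts flags come from this thread; --out_dir is assumed from the tutorial's README, the helper name build_resume_command and all values are hypothetical, and the data-related flags are left out.

```python
def build_resume_command(ckpt, out_dir, num_train_steps, num_keep_ckpts=5):
    """Build the argv list for resuming training from an explicit checkpoint."""
    return [
        "python", "-m", "nmt.nmt",
        "--ckpt={}".format(ckpt),
        "--out_dir={}".format(out_dir),
        # Assumed: this must exceed the step stored in the checkpoint,
        # otherwise training would stop right away.
        "--num_train_steps={}".format(num_train_steps),
        # How many recent checkpoints to keep on disk.
        "--num_keep_ckpts={}".format(num_keep_ckpts),
    ]


# Hypothetical paths and step count, for illustration only.
command = build_resume_command(
    ckpt="/path/to/last/saved/ckpt",
    out_dir="/path/to/out_dir",
    num_train_steps=20000,
)
print(" ".join(command))
```

The list form can be passed to subprocess.run as is, which avoids shell quoting problems with paths.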

kbv71 commented 5 years ago

@Sabyasachi18 I was trying an approach similar to yours. Were you able to work it out? If yes, please share your process.

sahertariq07 commented 5 years ago

I'm using TensorFlow 1.0.1. I'm new to deep learning and am following the TensorFlow NMT tutorial. How can I re-run training from the last saved checkpoint? Is that the default in TensorFlow 1.0.1, or do I have to give it the path to the last saved ckpt? Kindly guide me. Thank you.

ArashHosseini commented 5 years ago

Hi @sahertariq07, pass the --ckpt flag explicitly, like --ckpt=/path/to/last/saved/ckpt

sahertariq07 commented 5 years ago

@ArashHosseini thank you very much for your response....

ArashHosseini commented 5 years ago

@sahertariq07, you're welcome