tensorflow / nmt

TensorFlow Neural Machine Translation Tutorial

Change vocabulary size during re-run #134

Open kmario23 opened 7 years ago

kmario23 commented 7 years ago

Hello! With a vocabulary size of 55K, I trained the model for 200K steps and saved the latest checkpoint. Now I have increased my vocabulary size to 70K.

  1. How can I continue training from the saved checkpoint with this new vocabulary size (70K)? Simply changing the vocabulary size throws an error (a minimal reproduction follows this list): `InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match`
  2. For model fine-tuning, is updating the vocabulary and continuing training from the previous best checkpoint the optimal strategy? Or is there a better way to fine-tune the pre-trained model?
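
A minimal sketch of what triggers that error, assuming TF1 and a hypothetical variable name and path:

```python
import tensorflow as tf

# Hypothetical reproduction: a TF1 Saver restores tensors by exact shape,
# so a [70000, 512] variable cannot receive the [55000, 512] tensor
# stored in the checkpoint.

tf.reset_default_graph()
tf.get_variable("embedding_encoder", shape=[55000, 512])
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    tf.train.Saver().save(sess, "/tmp/old_vocab.ckpt")

tf.reset_default_graph()
tf.get_variable("embedding_encoder", shape=[70000, 512])
with tf.Session() as sess:
    # InvalidArgumentError: Assign requires shapes of both tensors to match
    tf.train.Saver().restore(sess, "/tmp/old_vocab.ckpt")
```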

Issue #51 has no working solution yet.

oahziur commented 7 years ago

Hi @kmario23,

Is the 55K vocab a subset of the 70K? If so, maybe you can just start training with 70K for the first 200K steps as well. Otherwise, you might want to try init_from_checkpoint to restore non-embedding variables only.
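
For reference, a rough sketch of that second suggestion, assuming a TF1 graph and a hypothetical checkpoint path; the `"embedding"` name filter is an assumption to verify against your own checkpoint:

```python
import tensorflow as tf

ckpt = "/tmp/nmt_model/translate.ckpt-200000"  # hypothetical checkpoint path

# Map every non-embedding variable in the checkpoint onto the graph
# variable of the same name; the resized embeddings keep their fresh
# initialization. Confirm the names with tf.train.list_variables(ckpt).
assignment_map = {
    name: name
    for name, _ in tf.train.list_variables(ckpt)
    if "embedding" not in name
}

# Must run while building the graph, before variables are initialized.
tf.train.init_from_checkpoint(ckpt, assignment_map)
```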

kmario23 commented 7 years ago

Hi @oahziur, Thanks for your insights!

The 55K vocabulary can be considered a subset of the 75K. However, the full 75K vocabulary is not available at any single point in time: new sentence pairs arrive as they become available (usually once a week). I want to constantly adapt (or increase) the vocabulary size by N as new samples come in. Think of it as constantly improving a baseline model :)

In such a scenario, what would be the ideal way to use the latest checkpoint for further training? Should the vocabulary size be increased at all? (I'm training a subword-unit-based NMT model.) What other hyperparameters should be changed to achieve good adaptation? Thanks!
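
One hedged sketch of such an adaptation step (not confirmed by this thread; the path and variable scope are assumptions): grow the old embedding matrix offline by appending freshly initialized rows for the new subword units, then warm-start everything else with `init_from_checkpoint`:

```python
import numpy as np
import tensorflow as tf

old_ckpt = "/tmp/nmt_model/translate.ckpt-200000"  # hypothetical path
emb_name = "embeddings/encoder/embedding_encoder"  # assumed name; check
# the real one with tf.train.list_variables(old_ckpt)

# Load the trained 55K x dim embedding as a numpy array.
old_emb = tf.train.load_variable(old_ckpt, emb_name)
num_new = 70000 - old_emb.shape[0]

# Append randomly initialized rows for the new subword units
# (assumes new tokens are appended at the end of the vocab file).
new_rows = np.random.uniform(
    -0.1, 0.1, size=(num_new, old_emb.shape[1])).astype(old_emb.dtype)
grown = np.concatenate([old_emb, new_rows], axis=0)

# Build the enlarged variable with the grown matrix as its initial value.
with tf.variable_scope("embeddings/encoder"):  # assumed scope
    embedding_encoder = tf.get_variable(
        "embedding_encoder", initializer=tf.constant(grown))

# Warm-start every non-embedding variable from the old checkpoint.
tf.train.init_from_checkpoint(old_ckpt, {
    name: name for name, _ in tf.train.list_variables(old_ckpt)
    if "embedding" not in name
})
```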

Sabyasachi18 commented 6 years ago

Hi @oahziur, I have a question. I want to run incremental training on my trained German-English NMT engine with subword BPE encoding. Can I update my vocab file with new words from the incremental training data? If yes, kindly let me know the process. With regard to this question of changing vocabulary size, how do I use init_from_checkpoint to achieve this?

Should I append the new words at the end of the existing vocabulary file when running incremental training? Or should I sort the vocab file after appending the new words?
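
Worth noting when deciding: the nmt vocab file maps line index to token ID, so appending new words at the end keeps every existing ID (and hence every trained embedding row) stable, while re-sorting reassigns IDs and misaligns the checkpoint's embeddings. A toy illustration:

```python
old_vocab = ["<unk>", "<s>", "</s>", "cat", "dog"]  # token ID = line index
appended = old_vocab + ["ant"]                      # new word appended
resorted = sorted(old_vocab + ["ant"])              # vocab file re-sorted

print(appended.index("cat"))  # 3 -- same ID the checkpoint was trained with
print(resorted.index("cat"))  # 4 -- ID shifted; embedding rows misalign
```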