kmario23 opened this issue 7 years ago
You can change the train_prefix but keep the out_dir argument. Technically, it should work (or would require minimal changes to get things working). I would recommend using subword units (BPE), if you haven't already, to represent unseen words. Also, make sure you back up the checkpoints before trying :)
See section 3 here for NMT adaptation https://nlp.stanford.edu/~lmthang/data/papers/iwslt15.pdf.
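For reference, here is a minimal sketch of the BPE suggestion in practice, using the sentencepiece library as one possible subword implementation (it is not part of this repo, and the file names, vocab size, and example sentence are placeholders):

```python
# A rough sketch, not from this thread: learn a BPE model and segment new
# sentences into subword units so unseen words decompose into known pieces.
# Requires the sentencepiece Python package; paths below are placeholders.
import sentencepiece as spm

# Learn subword units from the existing training corpus (source side shown).
spm.SentencePieceTrainer.train(
    input="train.en",          # placeholder: original training text
    model_prefix="bpe_en",     # writes bpe_en.model / bpe_en.vocab
    vocab_size=16000,
    model_type="bpe",
)

# Apply the same model to the new in-domain data before fine-tuning, so that
# domain-specific words the old model never saw are split into known subwords.
sp = spm.SentencePieceProcessor(model_file="bpe_en.model")
print(sp.encode("angiography is an unseen term", out_type=str))
# -> something like ['▁angi', 'ography', '▁is', '▁an', '▁un', 'seen', '▁term']
```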
@lmthang Thank you very much for your insights! Yes, we're trying to use subword units.
Since we want to update the vocabulary files in accordance with the new training samples, this also involves changing vocab_prefix, right? If yes, how does the tensor size of the old parameters (in the checkpoint) agree with the new vocab size? :)
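One common way to reconcile the two, sketched below under the assumption that the new vocabulary keeps every old word at its original index and only appends new words at the end, is to copy the learned embedding rows into a larger matrix and initialise only the new rows. The checkpoint path and variable name are placeholders, not the repo's actual names:

```python
# A rough sketch: grow an embedding matrix from an old checkpoint so its first
# dimension matches the enlarged vocabulary, reusing every learned row.
import numpy as np
import tensorflow as tf

old_vocab = [w.rstrip("\n") for w in open("vocab_old.en", encoding="utf-8")]
new_vocab = [w.rstrip("\n") for w in open("vocab_new.en", encoding="utf-8")]
assert new_vocab[: len(old_vocab)] == old_vocab, "old words must keep their indices"

reader = tf.train.load_checkpoint("out_dir/translate.ckpt")          # placeholder path
old_emb = reader.get_tensor("embeddings/encoder/embedding_encoder")  # placeholder name
emb_dim = old_emb.shape[1]

# Copy every learned row; randomly initialise rows for the new words only.
new_emb = np.random.normal(0.0, 0.1, size=(len(new_vocab), emb_dim)).astype(old_emb.dtype)
new_emb[: len(old_vocab)] = old_emb

# new_emb can then be assigned to the embedding variable of a graph built with
# the new vocab size before fine-tuning resumes; the output projection /
# softmax weights need the same treatment.
```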
@kmario23 Did you find any way to update the vocabulary for new data samples?
@ssokhey Same question here!
I have trained a seq2seq NMT model (EN-DE) with 1M samples and saved the latest checkpoint. Now, I have some domain-specific data of 50K sentence pairs which has not been seen in previous training data. How can I adapt the current model to this new data?
Specifically, I'd like to finetune the model to the new domain so that, as I increase the number of samples in that domain, the model produces reasonably good translations for test sentences in that domain. Such finetuning is commonly done in computer vision, but I'm not sure how to achieve it in a seq2seq architecture.

I'm aware that the vocabulary files for both languages have to be updated according to the new sentence pairs. But to achieve this, do we have to start training from scratch again? Isn't there a smarter way to continue training from the current checkpoint after dynamically updating the necessary components?

Any ideas or relevant papers which address this issue?
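One possible way to continue from the checkpoint rather than retrain, sketched below for a TF1-style graph with unchanged variable names (the checkpoint prefix is a placeholder), is to restore only the variables whose shapes still match, let the resized vocabulary-dependent tensors keep their fresh initialisation, and then keep training on the in-domain pairs:

```python
# A rough sketch: partial restore from an old checkpoint, then fine-tuning.
import tensorflow as tf

ckpt = "out_dir/translate.ckpt"  # placeholder checkpoint prefix
ckpt_shapes = tf.train.load_checkpoint(ckpt).get_variable_to_shape_map()

# Variables from the already-built training graph that are safe to restore
# (present in the checkpoint and with an unchanged shape).
restorable = [
    v for v in tf.global_variables()
    if v.op.name in ckpt_shapes and v.shape.as_list() == ckpt_shapes[v.op.name]
]

saver = tf.train.Saver(var_list=restorable)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # fresh init for everything
    saver.restore(sess, ckpt)                    # then overwrite matching tensors
    # ...run the usual training loop here on the 50K in-domain pairs,
    # typically with a smaller learning rate than the original run.
```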