marian-nmt / marian

Fast Neural Machine Translation in C++
https://marian-nmt.github.io

how to do transfer learning with marian #348

Closed q2044757581 closed 3 years ago

q2044757581 commented 3 years ago

Assuming that I have trained a model on a large dataset and then want to fine-tune it on a specific dataset, how do I load the pretrained model's weights before fine-tuning?

q2044757581 commented 3 years ago

Can I add a parameter like marian -c config.yml --pretrained_model xxx.npz?

snukky commented 3 years ago

To initialize graph parameters from another model, use --pretrained-model your-previously-trained-model.npz in the training command. Every matrix whose name matches a matrix in the current architecture will be copied. You can set any other options freely, because this starts a new training run, including a fresh optimizer state.
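As a rough sketch of what such a command could look like, the wrapper below assembles a training call around --pretrained-model. The function name, the argument order, and the MARIAN variable are made up for illustration, and it assumes the usual --model and --train-sets options are given on the command line rather than in config.yml.

```shell
# Path to the marian binary; override it if marian is not on PATH.
# (MARIAN is a hypothetical helper variable, not something Marian reads.)
MARIAN="${MARIAN:-marian}"

# Usage: finetune_from_pretrained CONFIG PRETRAINED OUTPUT_MODEL SRC TRG
# Starts a NEW training run whose matching parameters are initialized
# from PRETRAINED; the optimizer starts from scratch.
finetune_from_pretrained() {
  "$MARIAN" -c "$1" \
    --pretrained-model "$2" \
    --model "$3" \
    --train-sets "$4" "$5"
}
```

For example, finetune_from_pretrained config.yml general/model.npz finetuned/model.npz indomain.src indomain.trg, where all five paths are placeholders. Writing the fine-tuned model to a separate --model path keeps the original model untouched.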

If you want to preserve the optimizer parameters from the previous training and only change the dataset, add --no-restore-corpus --valid-reset-stalled to your training command and pass the new training data to --train-sets. You may also want to decrease the learning rate and the validation interval if the fine-tuning data is smaller. Options from the previous run can be overridden by simply setting new values.
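A possible shape for that second variant is sketched below. It assumes that config.yml is the configuration of the previous run, so that its model path still points at the existing model and training continues from it. It also assumes that --learn-rate and --valid-freq are the options used to lower the learning rate and the validation interval. The function name, the MARIAN variable, and any values passed in are placeholders.

```shell
# Path to the marian binary; override it if marian is not on PATH.
# (MARIAN is a hypothetical helper variable, not something Marian reads.)
MARIAN="${MARIAN:-marian}"

# Usage: continue_on_new_data CONFIG SRC TRG LEARN_RATE VALID_FREQ
# Continues the previous run on new data: the corpus position is not
# restored and the stalled-validation counter is reset.
continue_on_new_data() {
  "$MARIAN" -c "$1" \
    --no-restore-corpus --valid-reset-stalled \
    --train-sets "$2" "$3" \
    --learn-rate "$4" \
    --valid-freq "$5"
}
```

For example, continue_on_new_data config.yml indomain.src indomain.trg 0.0001 500. The numbers are illustrative only and should be tuned to the size of the fine-tuning data.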