We do not use custom model classes or custom optimizers for training with fairseq, so you could try using fairseq's transformer model or optimizer classes directly.
Our translation model just uses a different config (different embedding dimension and number of attention heads compared to the transformer_large architecture).
For loading the model with the Python interface for inference (note that this cannot be used for training, as the Python wrapper class is written for inference only), you can follow the tutorial here.
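For example, you could also load the released checkpoint directly through fairseq's Python API. The snippet below is only a rough sketch: the paths are placeholders for wherever you unzipped the en-indic folder, and the data directory is assumed to contain the dict.SRC.txt / dict.TGT.txt files shipped with the model.

# Rough sketch, not the official inference wrapper; paths are placeholders.
from argparse import Namespace
from fairseq import checkpoint_utils, utils

# transformer_4x is registered in the repo's model_configs directory
# (the same directory passed to fairseq-train via --user-dir).
utils.import_user_module(Namespace(user_dir="indicTrans/model_configs"))

models, cfg, task = checkpoint_utils.load_model_ensemble_and_task(
    ["en-indic/model/checkpoint_best.pt"],
    arg_overrides={"data": "en-indic/final_bin"},  # dir containing dict.SRC.txt / dict.TGT.txt
)
model = models[0]
model.eval()  # inference only; for further training, see the notebook below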
If you want to use the checkpoint and fine-tune the model further on custom data, you can follow the notebook tutorial here:
fairseq-train ../dataset/final_bin \
--max-source-positions=210 \
--max-target-positions=210 \
--max-update=1000 \
--save-interval=1 \
--arch=transformer_4x \
--criterion=label_smoothed_cross_entropy \
--source-lang=SRC \
--lr-scheduler=inverse_sqrt \
--target-lang=TGT \
--label-smoothing=0.1 \
--optimizer adam \
--adam-betas "(0.9, 0.98)" \
--clip-norm 1.0 \
--warmup-init-lr 1e-07 \
--warmup-updates 4000 \
--dropout 0.2 \
--tensorboard-logdir ../dataset/tensorboard-wandb \
--save-dir ../dataset/model \
--keep-last-epochs 5 \
--patience 5 \
--skip-invalid-size-inputs-valid-test \
--fp16 \
--user-dir model_configs \
--update-freq=2 \
--distributed-world-size 1 \
--max-tokens 256 \
--lr 3e-5 \
--restore-file ../en-indic/model/checkpoint_best.pt \
--reset-lr-scheduler \
--reset-meters \
--reset-dataloader \
--reset-optimizer
^ Note that the final flag, --reset-optimizer, will reset the optimizer state. If you want to reuse the optimizer for further training, do not pass this flag, and set the other reset flags (for the lr scheduler, etc.) accordingly.
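If you are not sure whether a checkpoint still carries an optimizer state, you can peek inside it with plain torch.load. This is just a sketch, and the exact keys depend on your fairseq version:

# Sketch: inspect what the reset flags would discard (key names vary across fairseq versions).
import torch

ckpt = torch.load("en-indic/model/checkpoint_best.pt", map_location="cpu")
print(list(ckpt.keys()))    # typically includes: model, last_optimizer_state, extra_state, cfg/args
print(len(ckpt["model"]))   # number of tensors in the translation model's state_dict
# --reset-optimizer drops 'last_optimizer_state'; leave the flag out to keep reusing it.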
Hi.
I am trying to load the en-indic model into PyTorch.
After unzipping the folder, I am doing
Now, to load the model and optimizer for later use, I am following this.
To follow along with this tutorial, what are the TheModelClass and TheOptimizerClass?
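For context, the pattern from that tutorial looks roughly like the snippet below; here the TheModelClass / TheOptimizerClass placeholders are filled in with a toy nn.Linear model and SGD just to make it concrete, and my question is what they should be for the en-indic checkpoint.

# The tutorial's generic save/load pattern, with toy stand-ins for the placeholder classes.
import torch

model = torch.nn.Linear(4, 2)                             # stands in for TheModelClass
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # stands in for TheOptimizerClass

# Save a checkpoint the way the tutorial shows...
torch.save({"model_state_dict": model.state_dict(),
            "optimizer_state_dict": optimizer.state_dict()}, "toy_checkpoint.pt")

# ...and load it back into freshly constructed objects.
checkpoint = torch.load("toy_checkpoint.pt")
model.load_state_dict(checkpoint["model_state_dict"])
optimizer.load_state_dict(checkpoint["optimizer_state_dict"])
model.eval()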