You can directly use the Hugging Face model (without going through dl-translate) and finetune it inside PyTorch or fastai:
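For example, here is a minimal sketch of loading a pretrained mBART-50 checkpoint and its tokenizer directly from the Hugging Face hub (the many-to-many checkpoint below is just one of the mBART-50 variants; pick whichever fits your language pair):

from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

# any mBART-50 checkpoint works the same way; this is the many-to-many variant
model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-50-many-to-many-mmt")
tokenizer = MBart50TokenizerFast.from_pretrained("facebook/mbart-large-50-many-to-many-mmt")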
If you choose any of the options above, you will need to slightly modify the data processing and modeling, since you are interested in a different task/model.
Once your model has a satisfactory BLEU score, you can use it in dl-translate by specifying the path to your model.
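For reference, a minimal sketch of how a finetuned checkpoint could be loaded back into dl-translate; the local path is hypothetical and the exact arguments of TranslationModel may differ slightly, so check the dl-translate README:

import dl_translate as dlt

# "path/to/finetuned-mbart50" is a hypothetical directory containing the model
# and tokenizer saved with save_pretrained() after finetuning
mt = dlt.TranslationModel("path/to/finetuned-mbart50", model_family="mbart50")
print(mt.translate("Bonjour le monde", source="French", target="English"))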
@pdakwal were you able to finetune mbart50? Can you please help?
@nikhiljaiswal Huggingface released a code snippet that shows you how to do the forward pass: https://huggingface.co/docs/transformers/model_doc/mbart#training-of-mbart50
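Roughly, that snippet boils down to preparing the inputs and labels like the sketch below (the sentence pair and language codes are placeholders, and the text_target argument needs a reasonably recent transformers version):

from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-50")
tokenizer = MBart50TokenizerFast.from_pretrained(
    "facebook/mbart-large-50", src_lang="en_XX", tgt_lang="fr_XX"
)

src_text = "Machine translation is fun."              # placeholder source sentence
tgt_text = "La traduction automatique est amusante."  # placeholder target sentence

# tokenize source and target; text_target produces the decoder labels
model_inputs = tokenizer(src_text, text_target=tgt_text, return_tensors="pt")
labels = model_inputs.pop("labels")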
You can easily do the backward pass using the regular pytorch approach:
import torch.optim as optim

optimizer = optim.AdamW(model.parameters(), lr=0.0001)

# load/prepare model_inputs and labels here (see the snippet above)
loss = model(**model_inputs, labels=labels).loss  # forward pass returns the LM loss
loss.backward()        # backpropagate
optimizer.step()       # update the weights
optimizer.zero_grad()  # reset gradients before the next batch
# continue looping over batches here...
Huggingface now has a tutorial for finetuning: https://huggingface.co/docs/transformers/tasks/translation
It's already quite abstracted, so there's not a lot of value in abstracting it further; doing so would actually hurt the flexibility of the training process (since many decisions need to be made).
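For completeness, the tutorial's Seq2SeqTrainer approach looks roughly like the sketch below; tokenized_train and tokenized_eval stand in for datasets you have already tokenized as described in the tutorial, and the hyperparameters are only illustrative:

from transformers import (
    MBartForConditionalGeneration,
    MBart50TokenizerFast,
    Seq2SeqTrainingArguments,
    Seq2SeqTrainer,
    DataCollatorForSeq2Seq,
)

model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-50")
tokenizer = MBart50TokenizerFast.from_pretrained(
    "facebook/mbart-large-50", src_lang="en_XX", tgt_lang="fr_XX"
)

# tokenized_train / tokenized_eval are assumed to be datasets already mapped
# through the tokenizer, as shown in the tutorial linked above
args = Seq2SeqTrainingArguments(
    output_dir="mbart50-finetuned",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    num_train_epochs=3,
    predict_with_generate=True,
)
trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized_train,
    eval_dataset=tokenized_eval,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
trainer.save_model("mbart50-finetuned")  # a path you can later point dl-translate to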
Hi,
Is there any plan to also provide support for fine-tuning mbart models?