xhluca / dl-translate

Library for translating between 200 languages. Built on 🤗 transformers.
https://xhluca.github.io/dl-translate/
MIT License

Support for fine-tuning huggingface/mbart models #38

Closed: pdakwal closed this issue 1 year ago

pdakwal commented 3 years ago

Hi,

Is there any plan to also provide support for fine-tuning mbart models?

xhluca commented 3 years ago

You can directly use the huggingface model (without going through dl-translate) and finetune it inside PyTorch or fast.ai.

If you choose either of those options, you will need to slightly modify the data processing and modeling code, since you are interested in a different task/model.

Once your model reaches a satisfactory BLEU score, you can use it in dl-translate by specifying the path to your model.
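
For example, loading a locally saved checkpoint back into dl-translate would look roughly like the sketch below (the directory path is a placeholder, and the `model_family` argument follows the README's "loading from a path" usage; exact signatures may differ between versions):

```python
import dl_translate as dlt

# Point dl-translate at the directory produced by model.save_pretrained(...)
mt = dlt.TranslationModel(
    "path/to/your/finetuned-mbart50/",  # hypothetical local checkpoint directory
    model_family="mbart50",
)

print(mt.translate("Meine Freundin ist sehr nett.", source="German", target="English"))
```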

nikhiljaiswal commented 2 years ago

@pdakwal were you able to finetune mbart50? Could you please help?

xhluca commented 2 years ago

@nikhiljaiswal Huggingface released a code snippet that shows you how to do the forward pass: https://huggingface.co/docs/transformers/model_doc/mbart#training-of-mbart50
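
For reference, a minimal sketch of that forward pass (the checkpoint name and English-Romanian example pair follow the linked docs; `text_target=` requires a reasonably recent version of transformers):

```python
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

# Load the pretrained mBART-50 checkpoint and its tokenizer
model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-50")
tokenizer = MBart50TokenizerFast.from_pretrained(
    "facebook/mbart-large-50", src_lang="en_XX", tgt_lang="ro_RO"
)

src_text = "UN Chief Says There Is No Military Solution in Syria"
tgt_text = "Şeful ONU declară că nu există o soluţie militară în Siria"

# Tokenize source and target; the target ids become the labels
model_inputs = tokenizer(src_text, text_target=tgt_text, return_tensors="pt")
labels = model_inputs.pop("labels")

# Forward pass: passing labels makes the model return the LM loss
outputs = model(**model_inputs, labels=labels)
print(outputs.loss)
```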

You can easily do the backward pass using the regular pytorch approach:

```python
import torch.optim as optim

optimizer = optim.AdamW(model.parameters(), lr=0.0001)

# load input here (build model_inputs and labels, e.g. as in the forward-pass snippet above)
loss = model(**model_inputs, labels=labels).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()

# continue here...
```

xhluca commented 1 year ago

Huggingface now has a tutorial for finetuning: https://huggingface.co/docs/transformers/tasks/translation

It's already quite high-level, so there's not much value in abstracting it further; doing so would actually hurt the flexibility of the training process (a lot of decisions need to be made).
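
For completeness, the core of that tutorial boils down to something like the following sketch (it assumes a pre-tokenized `datasets` object named `tokenized_ds`, which is hypothetical here; argument names follow recent versions of transformers and may vary):

```python
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

checkpoint = "facebook/mbart-large-50"  # or any other seq2seq checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint, src_lang="en_XX", tgt_lang="ro_RO")
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# Dynamically pads inputs and labels per batch
data_collator = DataCollatorForSeq2Seq(tokenizer=tokenizer, model=model)

args = Seq2SeqTrainingArguments(
    output_dir="finetuned-mbart50",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    num_train_epochs=3,
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized_ds["train"],       # assumed pre-tokenized dataset
    eval_dataset=tokenized_ds["validation"],
    tokenizer=tokenizer,
    data_collator=data_collator,
)

trainer.train()
trainer.save_model("finetuned-mbart50")  # the saved directory can then be loaded by dl-translate
```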