xhluca / dl-translate

Library for translating between 200 languages. Built on 🤗 transformers.
https://xhluca.github.io/dl-translate/
MIT License

Support for fine-tuning huggingface/mbart models #38

Closed: pdakwal closed this issue 1 year ago

pdakwal commented 3 years ago

Hi,

Is there any plan to also provide support for fine-tuning mbart models?

xhluca commented 3 years ago

You can directly use the huggingface model (without going through dl-translate) and finetune it inside PyTorch or fast.ai.

If you choose either of those options, you will need to slightly modify the data processing and modeling code, since you are interested in a different task/model.

Once your model reaches a satisfactory BLEU score, you can use it in dl-translate by specifying the path to your model.
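
For example, loading a locally saved checkpoint back into dl-translate would look roughly like the sketch below (the directory path is a placeholder, and the `model_family` argument follows the README's "loading from a path" usage; exact signatures may differ between versions):

```python
import dl_translate as dlt

# Point dl-translate at the directory produced by model.save_pretrained(...)
mt = dlt.TranslationModel(
    "path/to/your/finetuned-mbart50/",  # hypothetical local checkpoint directory
    model_family="mbart50",
)

print(mt.translate("Meine Freundin ist sehr nett.", source="German", target="English"))
```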

nikhiljaiswal commented 2 years ago

@pdakwal were you able to finetune mbart50? Could you please help?

xhluca commented 2 years ago

@nikhiljaiswal Huggingface released a code snippet that shows you how to do the forward pass: https://huggingface.co/docs/transformers/model_doc/mbart#training-of-mbart50
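
For reference, a minimal sketch of that forward pass (the checkpoint name and English-Romanian example pair follow the linked docs; `text_target=` requires a reasonably recent version of transformers):

```python
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

# Load the pretrained mBART-50 checkpoint and its tokenizer
model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-50")
tokenizer = MBart50TokenizerFast.from_pretrained(
    "facebook/mbart-large-50", src_lang="en_XX", tgt_lang="ro_RO"
)

src_text = "UN Chief Says There Is No Military Solution in Syria"
tgt_text = "Şeful ONU declară că nu există o soluţie militară în Siria"

# Tokenize source and target; the target ids become the labels
model_inputs = tokenizer(src_text, text_target=tgt_text, return_tensors="pt")
labels = model_inputs.pop("labels")

# Forward pass: passing labels makes the model return the LM loss
outputs = model(**model_inputs, labels=labels)
print(outputs.loss)
```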

You can easily do the backward pass using the regular pytorch approach:

```python
import torch.optim as optim

optimizer = optim.AdamW(model.parameters(), lr=0.0001)

# load input here (build model_inputs and labels, e.g. as in the forward-pass snippet above)
loss = model(**model_inputs, labels=labels).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()

# continue here...
```

xhluca commented 1 year ago

Huggingface now has a tutorial for finetuning: https://huggingface.co/docs/transformers/tasks/translation

It's already quite high-level, so there's not much value in abstracting it further; doing so would actually hurt the flexibility of the training process (a lot of decisions need to be made).
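
For completeness, the core of that tutorial boils down to something like the following sketch (it assumes a pre-tokenized `datasets` object named `tokenized_ds`, which is hypothetical here; argument names follow recent versions of transformers and may vary):

```python
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

checkpoint = "facebook/mbart-large-50"  # or any other seq2seq checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint, src_lang="en_XX", tgt_lang="ro_RO")
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# Dynamically pads inputs and labels per batch
data_collator = DataCollatorForSeq2Seq(tokenizer=tokenizer, model=model)

args = Seq2SeqTrainingArguments(
    output_dir="finetuned-mbart50",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    num_train_epochs=3,
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized_ds["train"],       # assumed pre-tokenized dataset
    eval_dataset=tokenized_ds["validation"],
    tokenizer=tokenizer,
    data_collator=data_collator,
)

trainer.train()
trainer.save_model("finetuned-mbart50")  # the saved directory can then be loaded by dl-translate
```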