bigscience-workshop / multilingual-modeling

BLOOM+1: Adapting BLOOM model to support a new unseen language
https://arxiv.org/abs/2212.09535
Apache License 2.0

add last-layer finetuning for tasks #32

Closed · yongzx closed this 2 years ago

yongzx commented 2 years ago

Further commits:

33 - removes a leftover `assert False` line.

e1079c1 - uses the correct Trainer class, and freezes the base model with `model.freeze_model(freeze=True)` instead of a for-loop over its parameters. The for-loop approach would break for BERT models, whose base model is exposed under the "bert" prefix rather than "transformer" (see the sketch below).
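
A minimal sketch of the prefix issue, assuming standard Hugging Face attribute naming; `freeze_base_model` is a hypothetical helper for illustration, not the repository's `freeze_model` implementation:

```python
from transformers import AutoModelForSequenceClassification

def freeze_base_model(model):
    # Fragile version: assumes the base model lives under `model.transformer`,
    # which holds for GPT-style models (e.g. BLOOM) but not for BERT,
    # whose base model is exposed as `model.bert`:
    #   for param in model.transformer.parameters():
    #       param.requires_grad = False

    # Prefix-agnostic version: `base_model` resolves to the correct submodule
    # ("transformer", "bert", ...) regardless of architecture.
    for param in model.base_model.parameters():
        param.requires_grad = False

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-cased", num_labels=2
)
freeze_base_model(model)

# Only the task head remains trainable.
print([name for name, p in model.named_parameters() if p.requires_grad])
# e.g. ['classifier.weight', 'classifier.bias']
```

Keeping the freezing logic behind a single method (as the commit does) avoids hard-coding any model-specific prefix at the call site.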