bigscience-workshop / multilingual-modeling

BLOOM+1: Adapting BLOOM model to support a new unseen language
https://arxiv.org/abs/2212.09535
Apache License 2.0

Bitfit #22

Closed: lintangsutawika closed this 2 years ago

lintangsutawika commented 2 years ago

In madx_run_clm.py I added a new arg, finetuning_strategies, which can be extended to select other parameter-efficient finetuning methods.
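A hedged sketch of what such a strategy flag could look like as a plain argparse option; the option names other than "bitfit" are assumptions, and the actual script wires arguments through its own argument classes rather than bare argparse:

```python
import argparse

def build_parser():
    # Hypothetical CLI flag for picking a finetuning strategy;
    # the choices listed here are illustrative, not the script's real set.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--finetuning_strategies",
        choices=["bitfit", "adapters", "full"],
        default="full",
        help="Which parameter-efficient finetuning method to apply.",
    )
    return parser

args = build_parser().parse_args(["--finetuning_strategies", "bitfit"])
print(args.finetuning_strategies)  # bitfit
```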

yongzx commented 2 years ago

I will review this! Thanks!

> In madx_run_clm.py I added a new arg finetuning_strategies that can be extended to choose any other parameter-efficient finetuning methods.

It's already in there: lang_adapt_strategies.

yongzx commented 2 years ago

@lintangsutawika Left some reviews. Everything else LGTM.

lintangsutawika commented 2 years ago

@yongzx BitFit isn't really an adapter; it just freezes everything in a given model except the bias terms, so only the biases are trained. I'm not sure it fits in modify_model.

I've placed it at line 625 currently.

        elif model_args.lang_adapt_strategies == "bitfit":
            # BitFit: freeze every parameter except the bias terms,
            # so only the biases receive gradient updates
            for name, param in model.transformer.named_parameters():
                if 'bias' not in name:
                    param.requires_grad = False
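The same freezing pattern can be sketched on a toy model (a small torch.nn stack rather than the BLOOM transformer in the thread, which is an assumption for illustration): any parameter whose name lacks 'bias' is frozen, leaving only the biases trainable.

```python
import torch.nn as nn

# Toy stand-in for model.transformer; parameter names follow the
# usual '<idx>.weight' / '<idx>.bias' convention of nn.Sequential.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

# BitFit-style freezing: everything except bias terms stops requiring grad.
for name, param in model.named_parameters():
    if "bias" not in name:
        param.requires_grad = False

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(trainable)  # ['0.bias', '2.bias']
```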
yongzx commented 2 years ago

Thanks Lintang! I will run it today and merge it by EOD.